Both Google and IdentityServer have recently announced support for the PKCE (Proof Key for Code Exchange by OAuth Public Clients) specification defined by RFC 7636.
This is an excellent opportunity to revisit the OAuth 2.0 authorization code flow and illustrate how PKCE addresses some of the security issues that exist when this flow is implemented on native applications.
On the authorization code flow, the redirect from the authorization server back to client is one of the most security sensitive parts of the OAuth 2.0 protocol. The main reason is that this redirect contains the code representing the authorization delegation performed by the User. On public clients, such as native applications, this code is enough to obtain the access tokens allowing access to the User’s resources.
The PKCE specification addresses an attack vector where an attacker creates a native application that registers the same URL scheme used by the Client application, therefore gaining access to the authorization code. Succinctly, the PKCE specification requires the exchange of the code for the access token to use a ephemeral secret information that is not available on the redirect, making the knowledge of the code insufficient to use it. This extra information (or a transformation of it) is sent on the initial authorization request.
A slightly longer version
The OAuth 2.0 cast of characters
- The User is typically an human entity capable of granting access to resources.
- The Resource Server (RS) is the entity exposing an HTTP API to access these resources.
- The Client is an application (e.g. server-based Web application or native application) wanting to access these resources, via a authorization delegation performed by the User. Clients can be
- confidential – client applications that can hold a secret. The typical example are Web applications, where a client secret is stored and used only on the server side.
- public – client application that cannot hold a secret, such as native applications running on the User’s mobile device.
- The Authorization Server (AS) is the entity that authenticates the user, captures her authorization consent and issues access tokens that the Client application can use to access the resources exposed on the RS.
Authorization code flow for Web Applications
The following diagram illustrates the authorization code flow for Web applications (the Client application is a Web server).
- The flow starts with the Client application server-side producing a redirect HTTP response (e.g. response with 302 status) with the authorization request URL in the Location header. This URL will contain the authorization request parameters such as the state, scope and redirect_uri.
- When receiving this response, the User’s browser automatically performs a GET HTTP request to the Authorization Server (AS) authorization endpoint, containing the OAuth 2.0 authorization request.
- The AS then starts an interaction sequence to authenticate the user (e.g. username and password, two-factor authentication, delegated authentication), and to obtain the user consent. This sequence is not defined by OAuth 2.0 and can take multiple steps.
- After having authenticated and obtained consent from the user, the AS returns a HTTP redirect response with the authorization response on the Location header. This URL points to the client application hostname and contains the the authorization response parameters, such as the state and the (security sensitive) code.
- When receiving this response, the user’s browser automatically performs a GET request to the Client redirect endpoint with the OAuth 2.0 authorization response. By using HTTPS on the request to the Client, the protocol minimises the chances of the code being leaked to an attacker.
- Having received that authorization code, the Client then uses it to obtain the access token from the AS token endpoint. Since the client is a confidencial client, this request is authenticated with the client credentials (client ID and client secret), typically sent in the Authorization header using the basic scheme. The AS checks if this code is valid, namely if it was issued to the requesting authenticated client. If everything is verified, a 200 response with the access token is returned.
- Finally, the client can use the received access token to access the protected resources.
Authorization code flow for native Applications
For a native application, the flow is slightly different, namely on the first phase (the authorization request). Recall that in this case the Client application is running in the User’s device
- The flow begins with the Client application starting the system’s browser (or a web view, more on this on another post) at a URL with the authorization request. For instance, on the Android platform this is achieved by sending an intent.
- The browser comes into the foreground and performs a GET request to the AS authorization endpoint containing the authorization request.
- The same authentication and consent dance occurs between the AS and the User’s browser.
- After having authenticated and obtained consent from the user, the AS returns a HTTP redirect response with the authorization response on the Location header. This URL contains the the authorization response parameters. However, there is something special in the redirect URL. Instead of using a http URL scheme, which would make the browser perform another HTTP request, the redirect URL contains a custom URI scheme.
- As a result, when the browser receives this response and processes the redirect an inter-application message (e.g. an intent in Android) is sent to the application associated to this scheme, which should be the Client application. This brings the Client application to the foreground and provides it with the authorization response parameters, namely the authorization code.
- From now on, the flow is similar to the Web based one. Namely, the Client application uses the code to obtain the access token from the AS token endpoint. Since the client is a public client, this request is not authenticated, that is no client secret is used.
- Finally, having received the access token, the client application running on the device can access the User’s resources.
On both scenarios, the authorization code communication path, from the AS to the Client via User’s browser, is very security sensitive. This is specially relevant in the native scenario since the Client is public and the knowledge of that authorization code is enough to obtain the access token.
Hijacking the redirect
On the Web application scenario, the GET request with the authorization response has a HTTPS URL, which means that the browser will only send the code if the server correctly authenticates itself. However, on the native scenario, the intent will be sent to any installed application that registered the custom scheme. Unfortunately, there isn’t a central entity controlling and validating these scheme registrations, so an application can hijack the message from the browser to the client application, as shown in the following diagram.
Having obtained the authorization code, the attacker’s application has all the information required to retrieve a token and access the User’s resources.
The PKCE protection
The PKCE specification mitigates this vulnerability by requiring an extra code_verifier parameter on the exchange of the authorization code for the access token.
- On step 1, the Client application generates a random secret, stores it and uses its hash value on the new code_challenge authorization request parameter.
- On step 4, the AS somehow associates the returned code to the code_challenge.
- On step 6, the Client includes a code_verifier parameter with the secret on the token request message. The AS computes the hash of the code_verifier value and compares it with the original code_challenge associated with the code. Only if they are equals is the code accepted and an access token returned.
This ensures that only the entity that started the flow (sent the code_challenge on the authorization request) can end the flow and obtain the access token. By using a cryptographic hash function on the code_challenge, the protocol is protected from attackers that have read access to the original authorization request. However, the protocol also allows the secret to be used directly on the code_challenge.
Finally, the PKCE support by an AS can be advertised on the OAuth 2.0 or OpenID Connect discovery document, using the code_challenge_methods_supported field. The following is the Google’s OpenID Connect discovery document, located at https://accounts.google.com/.well-known/openid-configuration.
"code token id_token",