Section 4.2 of the draft OAuth 2.0 protocol indicates that an authorization server can return both an access_token
(which is used to authenticate oneself with a resource) as well as a refresh_token
, which is used purely to create a new access_token
:
https://tools.ietf.org/html/rfc6749#section-4.2
Why have both? Why not just make the access_token
last as long as the refresh_token
and not have a refresh_token
?
The link to discussion, provided by Catchdave, has another valid point made by Dick Hardt, which I believe is worth to be mentioned here in addition to what's been written above:
Indeed, in the situation where Resource Server and Authorization Server is the same entity, and where the connection between user and either of them is (usually) equally secure, there is not much sense to keep refresh token separate from the access token.
Although, as mentioned in the quote, another role of refresh tokens is to ensure the access token can be revoked at any time by the User (via the web-interface in their profiles, for example) while keeping the system scalable at the same time.
Generally, tokens can either be random identifiers pointing to the specific record in the Server's database, or they can contain all information in themselves (certainly, this information have to be signed, with MAC, for example).
How the system with long-lived access tokens should work
The server allows the Client to get access to User's data within a pre-defined set of scopes by issuing a token. As we want to keep the token revocable, we must store in the database the token along with the flag "revoked" being set or unset (otherwise, how would you do that with self-contained token?) Database can contain as much as
len(users) x len(registered clients) x len(scopes combination)
records. Every API request then must hit the database. Although it's quite trivial to make queries to such database performing O(1), the single point of failure itself can have negative impact on the scalability and performance of the system.How the system with long-lived refresh token and short-lived access token should work
Here we issue two keys: random refresh token with the corresponding record in the database, and signed self-contained access token, containing among others the expiration timestamp field.
As the access token is self-contained, we don't have to hit the database at all to check its validity. All we have to do is to decode the token and to validate the signature and the timestamp.
Nonetheless, we still have to keep the database of refresh tokens, but the number of requests to this database is generally defined by the lifespan of the access token (the longer the lifespan, the lower the access rate).
In order to revoke the access of Client from a particular User, we should mark the corresponding refresh token as "revoked" (or remove it completely) and stop issuing new access tokens. It's obvious though that there is a window during which the refresh token has been revoked, but its access token may still be valid.
Tradeoffs
Refresh tokens partially eliminate the SPoF (Single Point of Failure) of Access Token database, yet they have some obvious drawbacks.
The "window". A timeframe between events "user revokes the access" and "access is guaranteed to be revoked".
The complication of the Client logic.
without refresh token
with refresh token
I hope this answer does make sense and helps somebody to make more thoughtful decision. I'd like to note also that some well-known OAuth2 providers, including github and foursquare adopt protocol without refresh tokens, and seem happy with that.
This answer is from Justin Richer via the OAuth 2 standard body email list. This is posted with his permission.
The lifetime of a refresh token is up to the (AS) authorization server — they can expire, be revoked, etc. The difference between a refresh token and an access token is the audience: the refresh token only goes back to the authorization server, the access token goes to the (RS) resource server.
Also, just getting an access token doesn’t mean the user’s logged in. In fact, the user might not even be there anymore, which is actually the intended use case of the refresh token. Refreshing the access token will give you access to an API on the user’s behalf, it will not tell you if the user’s there.
OpenID Connect doesn’t just give you user information from an access token, it also gives you an ID token. This is a separate piece of data that’s directed at the client itself, not the AS or the RS. In OIDC, you should only consider someone actually “logged in” by the protocol if you can get a fresh ID token. Refreshing it is not likely to be enough.
For more information please read http://oauth.net/articles/authentication/
Let's consider a system where each user is linked to one or more roles and each role is linked to one or more access privileges. This information can be cached for better API performance. But then, there may be changes in the user and role configurations (for e.g. new access may be granted or current access may be revoked) and these should be reflected in the cache.
We can use access and refresh tokens for such purpose. When an API is invoked with access token, the resource server checks the cache for access rights. IF there is any new access grants, it is not reflected immediately. Once the access token expires (say in 30 minutes) and the client uses the refresh token to generate a new access token, the cache can be updated with the updated user access right information from the DB.
In other words, we can move the expensive operations from every API call using access tokens to the event of access token generation using refresh token.
While refresh token is retained by the Authorization server. Access token are self-contained so resource server can verify it without storing it which saves the effort of retrieval in case of validation. Another point missing in discussion is from rfc6749#page-55
I think the whole point of using refresh token is that even if attacker somehow manages to get refresh token, client ID and secret combination. With subsequent calls to get new access token from attacker can be tracked in case if every request for refresh result in new access token and refresh token.
To clear up some confusion you have to understand the roles of the client secret and the user password, which are very different.
The client is an app/website/program/..., backed by a server, that wants to authenticate a user by using a third-party authentication service. The client secret is a (random) string that is known to both this client and the authentication server. Using this secret the client can identify itself with the authentication server, receiving authorization to request access tokens.
To get the initial access token and refresh token, what is required is:
To get a refreshed access token however the client uses the following information:
This clearly shows the difference: when refreshing, the client receives authorization to refresh access tokens by using its client secret, and can thus re-authenticate the user using the refresh token instead of the user ID + password. This effectively prevents the user from having to re-enter his/her password.
This also shows that losing a refresh token is no problem because the client ID and secret are not known. It also shows that keeping the client ID and client secret secret is vital.
To further simplify B T's answer: Use refresh tokens when you don't typically want the user to have to type in credentials again, but still want the power to be able to revoke the permissions (by revoking the refresh token)
You cannot revoke an access token, only a refresh token.