I've been pulling my hair out over this issue lately.
Some background:
- We use the oauth2client library to manage our users' tokens. The tokens are used to run a variety of background tasks periodically and concurrently.
- Each time one of these tasks is about to run for a user, we get the Credentials object from storage and refresh it if the expiry is less than ~5 minutes away. Otherwise the current access token is reused.
- Sometimes multiple tasks for one user run at the same time.
- For a while everything works fine and the tokens get refreshed normally.
- Intermittently and seemingly out of the blue, one of these attempted refreshes returns an "invalid_grant" error, and it completely invalidates the refresh token in storage. (When, how, and why this happens is what I hope to figure out with this question.)
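For reference, the expiry check described in the list above can be sketched on its own, without oauth2client. This is only an illustration: `needs_refresh` and `REFRESH_MARGIN` are hypothetical names, and the sketch assumes the expiry is stored as a naive UTC `datetime`.

```python
from datetime import datetime, timedelta, timezone

# Assumed margin, matching the ~5 minutes mentioned above.
REFRESH_MARGIN = timedelta(minutes=5)


def needs_refresh(token_expiry, now=None, margin=REFRESH_MARGIN):
    """Return True when the access token expires within `margin`."""
    if token_expiry is None:
        # No known expiry: treat the token as still usable (an assumption).
        return False
    if now is None:
        # Naive UTC "now", to compare against a naive UTC expiry.
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    return token_expiry - now < margin
```

An already expired token also counts as needing a refresh, because the remaining time is negative.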
Searching around, there are a lot of threads and reports about this issue, but none of the ones I've found so far apply to our case. I'll list the ones I have looked into:
- User has revoked permissions
This one is the most obvious, the most documented, and easily reproducible, but in our case neither our users nor we ourselves (while testing) revoked any permissions.
- Refreshing an "old" access token
At first I thought that there can only be 1 valid access token at a time for a user. That is false, as I verified on the OAuth2 Playground.
There's a limit of 25 active tokens per user per client. Once that limit is reached, older access tokens are invalidated silently, even if their expiration date hasn't passed yet.
This is a dead end for us as well, since our issue happens when refreshing, not using the oldest access token. And this limit only affects access tokens, not refresh tokens.
- Requesting a refresh too many times in a short amount of time
Not documented at all, only mentioned in passing with no references. I tried to emulate it by doing a refresh 25 times in 7 seconds, but all went well. With no references, this is a shot in the dark, and our background tasks only ever max out at ~10 tasks every several minutes. Moving on.
- Concurrent refreshes cause a race condition on which token succeeds
I've asked a question here, but this wasn't the case. Tested on AppEngine by running two tasks scheduled at the same time.
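The same kind of concurrency test can be approximated locally with threads. The following is a minimal sketch, not the AppEngine test itself: `run_concurrently` and `fake_refresh` are hypothetical names, and `fake_refresh` only stands in for a real `credentials.refresh(http)` call, which needs network access and real credentials.

```python
import threading


def run_concurrently(refresh, n_tasks=2):
    """Call `refresh` from n_tasks threads released at the same moment.

    Returns a list of (task_index, result_or_exception) tuples.
    """
    barrier = threading.Barrier(n_tasks)
    lock = threading.Lock()
    outcomes = []

    def worker(index):
        barrier.wait()  # line all tasks up so the calls overlap
        try:
            result = refresh()
        except Exception as exc:  # record failures instead of losing them
            result = exc
        with lock:
            outcomes.append((index, result))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(outcomes, key=lambda item: item[0])


def fake_refresh():
    """Hypothetical stand-in for credentials.refresh(http)."""
    return "new-access-token"
```

Passing a function that wraps the real refresh call would show, per task, whether it got a token or an exception.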
I'm at my wit's end trying to pin down this issue. The fact that we can't readily reproduce it is a pain. I'd really like some insight into what could be causing this that I've missed.
Here's our refresh code:
import logging

import httplib2
from oauth2client.client import AccessTokenRefreshError


def refresh_oauth_credentials(user, credentials, force=False):
    if not credentials:
        return None

    logging.debug(credentials.token_expiry)
    # Refresh when the access token expires in under 5 minutes (300 seconds).
    do_refresh = credentials._expires_in() < 300
    if force or do_refresh:
        h = httplib2.Http()
        try:
            logging.debug("Refreshing %s's oauth2 credentials...", user.email)
            credentials.refresh(h)
        except AccessTokenRefreshError:
            logging.warning('Failed to refresh.')
            return None

    return credentials
The message is essentially saying that the refresh token is either invalid (expired, revoked, etc.) or doesn't match the details of the access token request (user, scope). So where your question says "an "invalid_grant" error is returned, and it completely invalidates the refresh token in storage", it's actually the other way round: the refresh token is already invalid for some reason, and that is what causes the "invalid_grant" error.
I've seen this a lot during development, when the dev/testing workflow keeps requesting new refresh tokens for the same user.
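To tell this case apart in logs, it may help to look at the error body the token endpoint returns. Here is a minimal sketch, assuming you can get hold of the raw response body as a string; `parse_token_error` and `refresh_token_is_dead` are hypothetical helper names, not part of oauth2client. The field names come from RFC 6749, section 5.2.

```python
import json


def parse_token_error(body):
    """Extract (error, description) from a token endpoint error body.

    RFC 6749 section 5.2 defines the error response as a JSON object
    with a required "error" member and an optional "error_description".
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, so nothing can be said about the error code.
        return None, None
    if not isinstance(payload, dict):
        return None, None
    return payload.get("error"), payload.get("error_description")


def refresh_token_is_dead(body):
    """True when the server rejected the grant (here, the refresh token)."""
    error, _ = parse_token_error(body)
    return error == "invalid_grant"
```

When `refresh_token_is_dead` returns True, retrying the refresh will not help; the user has to go through the authorization flow again to get a new refresh token.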