I realize that the OAuth spec doesn't specify anything about the origin of the ConsumerKey, ConsumerSecret, AccessToken, RequestToken, TokenSecret, or Verifier code, but I'm curious if there are any best practices for creating significantly secure tokens (especially Token/Secret combinations).
As I see it, there are a few approaches to creating the tokens:
- Just use random bytes, store in DB associated with consumer/user
- Hash some user/consumer-specific data, store in DB associated with consumer/user
- Encrypt user/consumer-specific data
Advantages to (1) are the database is the only source of the information which seems the most secure. It would be harder to run an attack against than (2) or (3).
Hashing real data (2) would allow re-generating the token from presumably already known data. Might not really provide any advantages to (1) since would need to store/lookup anyway. More CPU intensive than (1).
Encrypting real data (3) would allow decrypting to know information. This would require less storage & potentially fewer lookups than (1) & (2), but potentially less secure as well.
Are there any other approaches/advantages/disadvantages that should be considered?
EDIT: another consideration is that there MUST be some sort of random value in the Tokens as there must exist the ability to expire and reissue new tokens so it must not be only comprised of real data.
Follow On Questions:
Is there a minimum Token length to make significantly cryptographically secure? As I understand it, longer Token Secrets would create more secure signatures. Is this understanding correct?
Are there advantages to using a particular encoding over another from a hashing perspective? For instance, I see a lot of APIs using hex encodings (e.g. GUID strings). In the OAuth signing algorithm, the Token is used as a string. With a hex string, the available character set would be much smaller (more predictable) than say with a Base64 encoding. It seems to me that for two strings of equal length, the one with the larger character set would have a better/wider hash distribution. This seems to me that it would improve the security. Is this assumption correct?
The OAuth spec raises this very issue in 11.10 Entropy of Secrets.
OAuth says nothing about token except that it has a secret associated with it. So all the schemes you mentioned would work. Our token evolved as the sites get bigger. Here are the versions we used before,
Our first token is an encrypted BLOB with username, token secret and expiration etc. The problem is that we can't revoke tokens without any record on host.
So we changed it to store everything in database and the token is simply an random number used as the key to the database. It has an username index so it's easy to list all the tokens for an user and revoke it.
We get quite few hacking activities. With random number, we have to go to database to know if the token is valid. So we went back to encrypted BLOB again. This time, the token only contains encrypted value of the key and expiration. So we can detect invalid or expired tokens without going to the database.
Some implementation details that may help you,
What about putting ip adress in token ?
Then you could on each request decrypt token and check request ip adress with token ip adress.
Maybe that aproach would be more secure against random generated attacks.
Best!
Edited: Ok i think maybe i was not clear enaugh. My post was question not an answer. I will not try to find article on the internet where i read implementation with IP adress and i agree that if you put IP adress in token, clients with dynamic ip adresses and mobile connections will have token problems.