Legend
I expose an API which requires client to sign requests by sending two headers:
Authorization: MyCompany access_key:<signature>
Unix-TimeStamp: <unix utc timestamp in seconds>
To create a signature part, the client should use a secret key issued by my API service.
In Python (Py3k) it could look like:
import base64
import hmac
from hashlib import sha256
from datetime import datetime
UTF8 = 'utf-8'
AUTH_HEADER_PREFIX = 'MyCompany'
def create_signature(access_key, secret_key, message):
new_hmac = hmac.new(bytes(secret_key, UTF8), digestmod=sha256)
new_hmac.update(bytes(message, UTF8))
signature_base64 = base64.b64encode(new_hmac.digest())
return '{prefix} {access_key}:{signature}'.format(
prefix=AUTH_HEADER_PREFIX,
access_key=access_key,
signature=str(signature_base64, UTF8).strip()
)
if __name__ == '__main__':
message = str(datetime.utcnow().timestamp())
signature = create_signature('my access key', 'my secret key', message)
print(
'Request headers are',
'Authorization: {}'.format(signature),
'Unix-Timestamp: {}'.format(message),
sep='\n'
)
# For message='1457369891.672671',
# access_key='my access key'
# and secret_key='my secret key' will ouput:
#
# Request headers are
# Authorization: MyCompany my access key:CUfIjOFtB43eSire0f5GJ2Q6N4dX3Mw0KMGVaf6plUI=
# Unix-Timestamp: 1457369891.672671
I wondered if I could avoid dealing with encoding digest of bytes to Base64 and just use HMAC.hexdigest()
to retrieve a string.
So that my function will change to:
def create_signature(access_key, secret_key, message):
new_hmac = hmac.new(bytes(secret_key, UTF8), digestmod=sha256)
new_hmac.update(bytes(message, UTF8))
signature = new_hmac.hexdigest()
return '{prefix} {access_key}:{signature}'.format(
prefix=AUTH_HEADER_PREFIX,
access_key=access_key,
signature=signature
)
But then I found that Amazon uses similar approach as in my first code snippet:
Authorization = "AWS" + " " + AWSAccessKeyId + ":" + Signature;
Signature = Base64( HMAC-SHA1( YourSecretAccessKeyID, UTF-8-Encoding-Of( StringToSign ) ) );
Seeing that Amazon doesn't use hex digest I stopped myself to move forward with it because maybe they know something I don't.
Update
I've measured a performance and found hex digest to be faster:
import base64
import hmac
import string
from hashlib import sha256
UTF8 = 'utf-8'
MESSAGE = '1457369891.672671'
SECRET_KEY = 'my secret key'
NEW_HMAC = create_hmac()
def create_hmac():
new_hmac = hmac.new(bytes(SECRET_KEY, UTF8), digestmod=sha256)
new_hmac.update(bytes(MESSAGE, UTF8))
return new_hmac
def base64_digest():
return base64.b64encode(NEW_HMAC.digest())
def hex_digest():
return NEW_HMAC.hexdigest()
if __name__ == '__main__':
from timeit import timeit
print(timeit('base64_digest()', number=1000000,
setup='from __main__ import base64_digest'))
print(timeit('hex_digest()', number=1000000,
setup='from __main__ import hex_digest'))
Results with:
3.136568891000934
2.3460130329913227
Question #1
Does someone know why do they stick to Base64 of bytes digest and don't use just hex digest? Is there some solid reason to keep using this approach over hex digest?
Question #2
According to RFC2716 the format of Authorization
header value when using Basic Authentication
is:
Authorization: Base64(username:password)
So basically you wrap with Base64 two values (user's id and password) seprated by colon.
As you can see in my code snippet and in Amazon's documentation nor me, nor Amazon do that for own custom value of the Authorization
header.
Would it be a better style to wrap the whole pair as Base64(access_key:signature)
to stick closer to this RFC or it doesn't matter at all?