How do I generate a unique ID that can easily be passed on via phone or email and is easy to remember, while still not being easily guessable?
I am using a database, but since I am giving the ID away to people, I do not want it to be tied to the database. I could derive something from the unique ID I already have in the database, but I cannot use that ID directly, because it would be guessable.
I am using Python and have tried using uuid, but uuid is too long to be human readable.
Is there any way to create a human friendly pronounceable ID?
What you want to do is stitch together syllables to create pronounceable pseudo words. You can create syllables in any language you like to make up words that can be pronounced and communicated but don't actually mean anything.
Here is an article about how one person created human readable UIDs for speaking them phonetically. Read it for some of the pitfalls you should consider when taking an approach like this.
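As a rough illustration of the syllable-stitching idea, here is a minimal sketch. The consonant and vowel sets, `SYLLABLES`, and `pseudo_word` are all made up for this example; a real inventory should be tuned to avoid sounds that are easily confused over the phone.

```python
import secrets

# Hypothetical syllable inventory: every consonant + vowel pair.
CONSONANTS = "bdfgklmnprstvz"
VOWELS = "aeiou"
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]  # 14 * 5 = 70 syllables


def pseudo_word(num_syllables=4):
    """Return a random pseudo-word built from consonant-vowel syllables."""
    # secrets.choice draws from the OS random source, so the result is hard to guess
    return "".join(secrets.choice(SYLLABLES) for _ in range(num_syllables))


print(pseudo_word())
```

With 70 syllables, a four-syllable word gives 70^4, or about 24 million, possible IDs. You would still need to check each new word against the ones already issued.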
You could just use a string of alphabetic letters but present them as the NATO phonetic alphabet instead of just the alphabet.
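A minimal sketch of that presentation step, assuming the ID is made only of the letters A to Z. The names `NATO`, `random_letters` and `spell_out` are hypothetical helpers for this example.

```python
import secrets
import string

# NATO phonetic alphabet, keyed by uppercase letter.
NATO = {
    "A": "Alfa", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliett",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "X-ray",
    "Y": "Yankee", "Z": "Zulu",
}


def random_letters(length=6):
    """Generate a random ID made of uppercase letters only."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def spell_out(code):
    """Render a letters-only ID as NATO code words for reading aloud."""
    return " ".join(NATO[ch] for ch in code.upper())


print(spell_out("qzkf"))  # Quebec Zulu Kilo Foxtrot
print(spell_out(random_letters()))
```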
For emails, what I use is:
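from base64 import b64encode
from os import urandom

# 9 random bytes encode to 12 base64 characters with no "=" padding;
# decode() turns the bytes result into a str on Python 3
key = b64encode(urandom(9)).decode("ascii")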
You can increase/decrease the length by changing the number.
The result will sometimes contain + and / characters; you can strip them out if you like.
Edit:
Since you also want to pass them over the phone, maybe b32encode(urandom(5)) would be a better choice, since it won't give you any lowercase or unusual characters.
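A small sketch of the base32 variant, with the result split into groups of four so it is easier to read aloud. The `phone_id` helper and the grouping are my own additions, not something the standard library provides.

```python
from base64 import b32encode
from os import urandom


def phone_id(num_bytes=5):
    """Return a random base32 ID, grouped in fours, e.g. 'XXXX-XXXX' for 5 bytes."""
    # Base32 uses only A-Z and 2-7; strip "=" padding in case num_bytes
    # is not a multiple of 5
    raw = b32encode(urandom(num_bytes)).decode("ascii").rstrip("=")
    return "-".join(raw[i:i + 4] for i in range(0, len(raw), 4))


print(phone_id())
```

Five random bytes give 40 bits, which is about a trillion possible IDs in eight characters.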
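How about something like Amazon's payphrases? Convert the binary ID to a sequence of English words.
If you want something with the same range as a UUID, you need to represent 16 bytes (128 bits).
To keep it reasonable, restrict the phrase to 4 words. Note, though, that each word would then have to represent 4 bytes, or 2^32 (about 4.3 billion) possibilities, which is far more than any dictionary holds. With a 65,536-word dictionary each word represents 2 bytes, so 4 words cover 8 bytes (64 bits) and the full 16 bytes would take 8 words.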
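The bytes-to-words conversion itself can be sketched like this. `WORDS` is a hypothetical 16-word list, so each word carries only 4 bits and every byte becomes two words; with a much larger list, each word would carry more bits and the phrase would be shorter.

```python
# Hypothetical 16-word list: each word encodes 4 bits (one nibble).
WORDS = ["apple", "brick", "cloud", "drum", "eagle", "flame", "grape", "house",
         "island", "jelly", "kite", "lemon", "moon", "nest", "olive", "pearl"]


def to_phrase(data):
    """Encode bytes as words: high nibble first, then low nibble."""
    words = []
    for byte in data:
        words.append(WORDS[byte >> 4])
        words.append(WORDS[byte & 0x0F])
    return " ".join(words)


def from_phrase(phrase):
    """Invert to_phrase, turning the words back into the original bytes."""
    idx = [WORDS.index(w) for w in phrase.split()]
    return bytes((hi << 4) | lo for hi, lo in zip(idx[0::2], idx[1::2]))


print(to_phrase(b"\x01\xff"))  # apple brick pearl pearl
```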
EDIT:
Actually on reflection what might be better is a sort of mad lib sentence - it will restrict the number of needed words and may make it easier to remember since it has a grammatical structure. It will need to be longer, of course, perhaps something like this:
(a/an/the/#) (adj) (noun) (verb)(tense) (adverb) while (a/an/the/#) (adj) (noun) (verb) (adverb).
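One way to map a numeric ID onto such a template is to treat each slot as a digit in a mixed-radix number. The sketch below uses a shortened template and tiny made-up word lists (`SLOTS`); real lists would be much longer.

```python
# Hypothetical slot vocabularies, in sentence order.
SLOTS = [
    ["a", "the"],                     # article
    ["red", "tiny", "loud", "calm"],  # adjective
    ["fox", "judge", "piano"],        # noun
    ["sings", "jumps", "sleeps"],     # verb
    ["slowly", "badly"],              # adverb
]


def capacity():
    """Number of distinct sentences the template can express."""
    total = 1
    for options in SLOTS:
        total *= len(options)
    return total


def to_sentence(n):
    """Map an integer in range(capacity()) to a unique sentence."""
    if not 0 <= n < capacity():
        raise ValueError("ID out of range for this template")
    words = []
    # Peel off one mixed-radix digit per slot, last slot first
    for options in reversed(SLOTS):
        n, i = divmod(n, len(options))
        words.append(options[i])
    return " ".join(reversed(words))


print(capacity())        # 144
print(to_sentence(0))    # a red fox sings slowly
print(to_sentence(143))  # the calm piano sleeps badly
```

The capacity is the product of the list sizes, so it grows quickly as slots and words are added.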
Sure, but it requires a few more restrictions on your problem space, namely:
- There is only one thing generating unique IDs
- Your items have some concept of a title
- You can persist a list of strings
Then you'd do something like:
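_UID_INTERNALS = set()  # every UID handed out so far

def getID(obj):
    # Reuse the UID if this object already has one
    if hasattr(obj, 'UID'):
        return obj.UID
    # Drop non-ASCII characters, then decode so we keep working with str
    title = obj.title.encode("ascii", errors="ignore").decode("ascii")
    title = title.lower()
    title = "-".join(title.split())
    if not title:
        title = "unnamed-object"
    UID = title
    num = 1
    # Append an increasing number until the UID is unused
    while UID in _UID_INTERNALS:
        UID = title + str(num)
        num += 1
    _UID_INTERNALS.add(UID)
    obj.UID = UID
    return UID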
Here's a uuid-based example. Adjust the 1000000 to increase or decrease the range of your ids. Since you're reducing the range of the id, you'll probably have to check to see if the ID already exists.
>>> import uuid
>>> hash(str(uuid.uuid1())) % 1000000
380539
>>> hash(str(uuid.uuid1())) % 1000000
411563
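The existence check could look something like the sketch below. Note that it swaps in `secrets.randbelow` rather than hashing a uuid, and that `existing` is an in-memory set standing in for whatever lookup your database provides; both are assumptions for illustration.

```python
import secrets


def new_short_id(existing, limit=1000000, max_tries=100):
    """Pick a random ID below `limit` that is not already in `existing`."""
    for _ in range(max_tries):
        candidate = secrets.randbelow(limit)
        if candidate not in existing:
            existing.add(candidate)  # record it so it is never issued twice
            return candidate
    raise RuntimeError("could not find a free ID; the range may be nearly full")


used = set()
print(new_short_id(used))
```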