How to generate a human friendly unique ID in Python

Posted 2019-02-04 18:22

Question:

How do I generate a unique ID that is easy to pass on by phone or email and easy to remember, yet not easily guessable?

I am using a database, but since I am giving the ID away to people, I do not want it tied to the database. I could derive something from the unique ID I already have in the database, but I cannot use that ID directly, because it would be guessable.

I am using Python and have tried using uuid, but uuid is too long to be human readable.

Is there any way to create a human friendly pronounceable ID?

Answer 1:

What you want to do is stitch together syllables to create pronounceable pseudo words. You can create syllables in any language you like to make up words that can be pronounced and communicated but don't actually mean anything.
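
A minimal sketch of the syllable idea, assuming simple consonant-vowel syllables. The letter sets and the function name pronounceable_id are illustrative choices, not something taken from the answer or the article it mentions:

```python
import secrets

# Assumed alphabets: 14 consonants and 5 vowels, chosen arbitrarily for this sketch.
CONSONANTS = "bdfgklmnprstvz"
VOWELS = "aeiou"

def pronounceable_id(syllables=4):
    """Return a pseudo-word built from random consonant-vowel syllables."""
    return "".join(
        secrets.choice(CONSONANTS) + secrets.choice(VOWELS)
        for _ in range(syllables)
    )

print(pronounceable_id())  # random 8-letter pseudo-word
```

With these sets there are 70 possible syllables, so four syllables give 70**4 = 24,010,000 combinations. Add syllables if the IDs need to be harder to guess.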

Here is an article about how one person created human-readable UIDs meant to be spoken phonetically. Read it for some of the pitfalls you should consider when taking an approach like this.

Alternatively, you could use a plain string of letters but read it out with the NATO phonetic alphabet (Alfa, Bravo, Charlie, ...) instead of the bare letter names.
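
Reading an ID out in the NATO alphabet could look roughly like this. The helper name spell_out is hypothetical, and characters that are not letters are passed through unchanged:

```python
# Standard NATO/ICAO code words (note the official spellings "Alfa" and "Juliett").
NATO = {
    "A": "Alfa", "B": "Bravo", "C": "Charlie", "D": "Delta", "E": "Echo",
    "F": "Foxtrot", "G": "Golf", "H": "Hotel", "I": "India", "J": "Juliett",
    "K": "Kilo", "L": "Lima", "M": "Mike", "N": "November", "O": "Oscar",
    "P": "Papa", "Q": "Quebec", "R": "Romeo", "S": "Sierra", "T": "Tango",
    "U": "Uniform", "V": "Victor", "W": "Whiskey", "X": "X-ray",
    "Y": "Yankee", "Z": "Zulu",
}

def spell_out(code):
    """Replace each letter with its code word; leave other characters as they are."""
    return " ".join(NATO.get(ch.upper(), ch) for ch in code)

print(spell_out("kzqt"))  # Kilo Zulu Quebec Tango
```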



Answer 2:

For emails, what I use is:

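from base64 import b64encode
from os import urandom
# 9 random bytes encode to exactly 12 Base64 characters, with no "=" padding.
# b64encode returns bytes on Python 3, so decode to get a str.
key = b64encode(urandom(9)).decode("ascii")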

You can increase or decrease the length by changing the number of random bytes. Sometimes the key will contain + and / characters, which you can strip out if you like.

Edit: Since you also want to pass them over the phone, maybe b32encode(urandom(5)) would be a better choice, since it won't give you any lowercase letters or unusual characters.
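
Both variants, wrapped as functions for comparison. This is only a sketch: the names email_key and phone_key are made up here, and urlsafe_b64encode is swapped in for b64encode so that - and _ appear instead of + and /:

```python
from base64 import b32encode, urlsafe_b64encode
from os import urandom

def email_key(nbytes=9):
    # URL-safe Base64 uses "-" and "_" in place of "+" and "/".
    # 9 bytes encode to 12 characters with no "=" padding.
    return urlsafe_b64encode(urandom(nbytes)).decode("ascii")

def phone_key(nbytes=5):
    # Base32 output contains only A-Z and 2-7.
    # 5 bytes encode to exactly 8 characters with no "=" padding.
    return b32encode(urandom(nbytes)).decode("ascii")

print(email_key())  # 12 random characters
print(phone_key())  # 8 random characters
```

Byte counts that are multiples of 3 (Base64) or 5 (Base32) avoid trailing = padding.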



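Answer 3:

How about something like Amazon's payphrases? Convert the binary ID to a sequence of English words.

If you want something with the same range as a UUID, you need to represent 16 bytes (128 bits). Restricting the phrase to 4 words would make each word carry 4 bytes, which is 2^32 (about 4.3 billion) possibilities per word, far more than any dictionary holds. A dictionary of 65,536 words carries 2 bytes per word, so a full UUID takes 8 words; a 4-word phrase from that dictionary covers a 64-bit range instead.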
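
Turning a number into a phrase amounts to writing it in base N, where N is the size of the word list. The sketch below assumes a made-up 16-word list (4 bits per word) purely to keep the example short; to_words and from_words are hypothetical helper names:

```python
# Hypothetical 16-word list, so each word carries 4 bits. A real list would
# be far larger (65,536 words for 16 bits per word).
WORDS = ["apple", "brick", "cloud", "delta", "eagle", "flame", "grape", "house",
         "ivory", "jelly", "koala", "lemon", "mango", "night", "olive", "piano"]

def to_words(value, count):
    """Encode a non-negative integer as `count` words, most significant first."""
    base = len(WORDS)
    if not 0 <= value < base ** count:
        raise ValueError("value does not fit in the requested number of words")
    words = []
    for _ in range(count):
        value, index = divmod(value, base)  # peel off the lowest digit
        words.append(WORDS[index])
    return "-".join(reversed(words))

def from_words(phrase):
    """Decode a phrase produced by to_words back into the integer."""
    value = 0
    for word in phrase.split("-"):
        value = value * len(WORDS) + WORDS.index(word)
    return value

print(to_words(0x1234, 4))                    # brick-cloud-delta-eagle
print(from_words("brick-cloud-delta-eagle"))  # 4660
```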

EDIT: Actually on reflection what might be better is a sort of mad lib sentence - it will restrict the number of needed words and may make it easier to remember since it has a grammatical structure. It will need to be longer, of course, perhaps something like this:

(a/an/the/#) (adj) (noun) (verb)(tense) (adverb) while (a/an/the/#) (adj) (noun) (verb) (adverb).
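
A rough sketch of the template idea, shortened to a single clause. The tiny vocabularies and the names sentence_id and capacity are placeholders invented for illustration; real lists would need to be much larger:

```python
import math
import secrets

# Hypothetical slot vocabularies, four words each.
ARTICLES = ["a", "the", "one", "some"]
ADJECTIVES = ["brave", "quiet", "green", "tiny"]
NOUNS = ["otter", "piano", "cloud", "robot"]
VERBS = ["paints", "chases", "greets", "lifts"]
ADVERBS = ["slowly", "gladly", "boldly", "twice"]

# Order of the slots in the sentence template.
SLOTS = [ARTICLES, ADJECTIVES, NOUNS, VERBS, ADVERBS]

def sentence_id():
    """Fill each slot of the template with one randomly chosen word."""
    return " ".join(secrets.choice(slot) for slot in SLOTS)

def capacity():
    """Number of distinct sentences the template can produce."""
    return math.prod(len(slot) for slot in SLOTS)

print(sentence_id())  # random five-word sentence
print(capacity())     # 1024
```

The capacity is the product of the slot sizes, which is why each slot can use a much smaller list than a flat sequence of arbitrary words would need.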



Answer 4:

Sure, but it requires a few more restrictions on your problem space, namely:

  1. There is only one thing generating unique IDs
  2. Your items have some concept of a title
  3. You can persist a list of strings

Then you'd do something like:

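# Every UID handed out so far; persist this if IDs must survive a restart.
_UID_INTERNALS = set()

def getID(obj):
    # Reuse the ID if this object already has one.
    if hasattr(obj, 'UID'):
        return obj.UID
    # Drop non-ASCII characters, then decode back to str so the string
    # operations below also work on Python 3.
    title = obj.title.encode("ascii", errors="ignore").decode("ascii")
    title = title.lower()
    # Collapse runs of whitespace into single hyphens.
    title = "-".join(title.split())
    if not title:
        title = "unnamed-object"
    UID = title
    num = 1
    # Append an increasing number until the ID is unused.
    while UID in _UID_INTERNALS:
        UID = title + str(num)
        num += 1
    _UID_INTERNALS.add(UID)
    obj.UID = UID
    return UID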


Answer 5:

Here's a uuid-based example. Adjust the 1000000 to increase or decrease the range of your IDs. Because this shrinks the range of the ID, collisions become possible, so you'll probably have to check whether an ID already exists before using it.

>>> import uuid
>>> hash(str(uuid.uuid1())) % 1000000
380539
>>> hash(str(uuid.uuid1())) % 1000000
411563
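
The existence check could be sketched as follows. This is an assumption-laden variant, not the answer's own code: it draws the number with secrets.randbelow rather than hashing a UUID, the name new_short_id is hypothetical, and an in-memory set stands in for a lookup against your database:

```python
import secrets

def new_short_id(existing, size=1_000_000, max_tries=100):
    """Draw random numbers below `size` until one is not in `existing`."""
    for _ in range(max_tries):
        candidate = secrets.randbelow(size)
        if candidate not in existing:
            existing.add(candidate)  # record it so it is never issued twice
            return candidate
    raise RuntimeError("ID space looks exhausted; increase size")

issued = set()  # stand-in for the IDs already stored in the database
print(f"{new_short_id(issued):06d}")  # random six-digit ID, zero-padded
```

Retries stay rare as long as the number of issued IDs is small compared with size.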