I was thinking about how to make a small function
that generates a YouTube
-like ID
and does not cause collisions. The charset should be A-Za-z0-9-_
and lets say I want it to be 10 characters
long which would equal 64 ^ 10 / Quadrillion
- 1153 Quadrillion
possibilities.
It's rather important to be small
due to the fact that it's one of the major functionality of the website and should be clear enough to be modified easily. Within seconds in case an emergency occurs.
It doesn't have to be secure
in the sense that we don't really mind whether it's easily to convert back to an ID
as an outsider. It should however not be a pattern such as:
aaaaaaaaaa
aaaaaaaaab
aaaaaaaaac
I'm aware that you could theoretically make a charset
and loop through it randomly until you have 10
random characters but this could collide ( I realize this chance is slim and you could perform a query to see whether the key
has been generated before but I would like to keep dependencies minimal and it not relying on SQL
).
I can also imagine making a charset
of 100 characters
as in 0-99
and then prepend and/or append random
characters if the length of 10
condition is not met. Eg:
- IMPORTANT: the scenario sketched below assumes
A
is the first character in the set and_
is the last one.
id: 0 = A+9 random characters
id: 1 = B+9 random characters
id: 99 = _+9 random characters
id: 990 = _A+8 random characters
id: 99999999999999999999 = __________
I would think that would prevent collisions
but I don't have a charset
of 100
but 64
.
Hope someone can toss some ideas.
- UPDATE1: The above sketch could actually collide in the following (and more) circumstance(s):
If the
ID
is99
it doesn't mean that the next character can't beA
bychance
although very unlikely; it could thereforecollide
with990
:-(UPDATE2: I don't think it would still collide if we use a
static component
after theID
and before therandom component
that is not part of thecharset
.Updated Sketch (the static component will be $):