I want to develop something similar to jsfiddle, where the user can input some data, "save" it, and get a unique, random-looking URL that loads that data.
I don't want to make the saved URLs sequential because I don't want anyone to grab all of my entries, as some can be private. However, on the server I would like to save them in sequential order.
Is there a function or technique that converts a number into a hash that has 4 characters, without any collisions until there are `62 * 62 * 62 * 62 === 14776336` entries?
For example, the first entry will be named `1` on the server but `iUew3` to the user, the next entry will be `2` on the server but `ueGR` to the user...
EDIT: I'm not sure if it's obvious, but this hash-like function needs to be reversible, because when the user requests `ueGR` the server needs to know to serve it file `2`.
In my opinion, if you also keep the save time of each entry on the server, you can build the hash from both values: `hash = func(id, time)`. With only `hash = func(id)` it is going to be too easy to reverse.
Here's how I implemented it. Here's the save.php file (can someone tell me if there are any design flaws in it?):
And here's load.php:
Hope this helps others.
Here's a reversible lib that works w/ bcmath
http://blog.kevburnsjr.com/php-unique-hash
It's an odd set of constraints. I routinely use MD5 checksums to generate unique URLs from data. If the user doesn't already have the data, they can't guess the URLs.
I do understand about not wanting to use a database—if you've never used one before, the learning curve can be a little steep.
I don't understand the constraint about "storing things sequentially on the server." If you need to know the order in which the hashes are created, I'd simply put that information in a separate file. You might have to do file locking or some other kind of hack to make sure you can append a hash to that file incrementally.
If you want short URLs, you can either take a prefix of an MD5 checksum or you can take a CRC-32 and base64 encode it. Both will give you unique URLs with reasonably good probability.
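A rough sketch of both options, in Python just to keep it short (the same calls exist in PHP as md5, crc32 and base64_encode). The function names and the 8-character prefix length are my own picks for illustration, not anything from the question.

```python
import base64
import hashlib
import zlib


def md5_prefix_url(data: bytes, length: int = 8) -> str:
    """Return the first `length` hex characters of the MD5 digest of `data`."""
    return hashlib.md5(data).hexdigest()[:length]


def crc32_url(data: bytes) -> str:
    """Return the CRC-32 of `data` as URL-safe base64 without padding."""
    checksum = zlib.crc32(data) & 0xFFFFFFFF   # force an unsigned 32-bit value
    raw = checksum.to_bytes(4, "big")          # 4 bytes -> 6 base64 characters
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
```

Either way the shorter the slug, the sooner two different entries will collide, so it is still worth checking whether a slug is already taken before saving.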
This can't really be reversible. The only way (the one used by URL shorteners and jsfiddle) is to store the generated hash (actually it's a digest) in a table/data structure of some sort and look it up on retrieval.
Why this?
Going from, e.g., 128 chars of data to a digest of 4 visible chars, you lose a lot of data.
You cannot store the remaining data in the magical cracks between those 4 bytes; there are none.
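To make the lookup idea concrete, here is a minimal in-memory sketch in Python. The class name ShortUrlTable and its methods are hypothetical, and a real server would keep the table in a file or database rather than a dict.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols, as in the question


class ShortUrlTable:
    """Maps random public tokens to sequential server-side ids."""

    def __init__(self, length=4):
        self.length = length
        self._id_by_token = {}
        self._next_id = 1

    def register(self):
        """Reserve the next sequential id and return (token, id)."""
        while True:
            token = "".join(secrets.choice(ALPHABET) for _ in range(self.length))
            if token not in self._id_by_token:   # retry on the rare collision
                break
        entry_id = self._next_id
        self._id_by_token[token] = entry_id
        self._next_id += 1
        return token, entry_id

    def resolve(self, token):
        """Return the id stored for `token`, or None if it is unknown."""
        return self._id_by_token.get(token)
```

The files stay numbered 1, 2, 3... on the server, and the user only ever sees the token.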
It's possible to do this, but I would suggest using 64 characters, as that will make it a lot easier: four 6-bit characters = 24 bits.
Use a combination of these:
LFSR is highly recommended as it will do a good scrambling. The rest are optional. All of these manipulations are reversible and guarantee that each output is going to be unique.
Once you have calculated the "shuffled" number, simply pack it into a binary string and encode it with `base64_encode`. For decoding, simply do the inverse of these operations.
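For what it's worth, here is a minimal sketch of that encode/decode round trip, written in Python rather than PHP to keep it short. Note that it is not an LFSR: as stand-ins it chains an XOR with a constant, a multiplication by an odd constant modulo 2^24, and a bit rotation, each of which is a one-to-one mapping on 24-bit values. The constants and the names encode_id / decode_id are made up for illustration, and it needs Python 3.8+ for the modular inverse via pow.

```python
import base64

BITS = 24
MASK = (1 << BITS) - 1
XOR_KEY = 0x5A3C96       # arbitrary constant (assumption); keep it private
MULTIPLIER = 0x9E3779    # arbitrary odd constant; odd => invertible mod 2**24
MULT_INVERSE = pow(MULTIPLIER, -1, 1 << BITS)
ROTATE = 7               # arbitrary rotation distance, 0 < ROTATE < BITS


def _rotl(x, r):
    """Rotate a 24-bit value left by r bits."""
    return ((x << r) | (x >> (BITS - r))) & MASK


def _rotr(x, r):
    """Rotate a 24-bit value right by r bits."""
    return ((x >> r) | (x << (BITS - r))) & MASK


def encode_id(n):
    """Scramble a 24-bit id and return it as 4 URL-safe base64 characters."""
    if not 0 <= n <= MASK:
        raise ValueError("id must fit in 24 bits")
    x = n ^ XOR_KEY
    x = (x * MULTIPLIER) & MASK
    x = _rotl(x, ROTATE)
    # 3 bytes encode to exactly 4 base64 characters, so there is no padding
    return base64.urlsafe_b64encode(x.to_bytes(3, "big")).decode("ascii")


def decode_id(token):
    """Undo encode_id by applying the inverse steps in reverse order."""
    x = int.from_bytes(base64.urlsafe_b64decode(token), "big")
    x = _rotr(x, ROTATE)
    x = (x * MULT_INVERSE) & MASK
    return x ^ XOR_KEY
```

Usage would be something like `token = encode_id(2)` on save and `decode_id(token)` on load. The sketch uses the URL-safe base64 alphabet directly, so the output already has - and _ in place of + and /. Keep in mind this only obscures the numbering; anyone who learns the constants can walk the whole sequence.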
Sample (2^24 long unique sequence):
Output:
Note: for use in a URL, replace `+` and `/` with `-` and `_`.
Note: although this works, for a simple scenario like yours it's probably easier to keep generating random filenames until you get one that doesn't exist yet. Nobody cares about the number of the entry.
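A minimal sketch of that simpler route, again in Python. The function name save_with_random_name is hypothetical; it relies on O_EXCL so that the "does it exist?" check and the file creation happen in one step.

```python
import os
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def save_with_random_name(directory, data, length=4):
    """Write `data` to a new randomly named file and return the name."""
    while True:
        name = "".join(secrets.choice(ALPHABET) for _ in range(length))
        path = os.path.join(directory, name)
        try:
            # O_EXCL makes the open fail if the file already exists
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            continue  # name already taken, try another one
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return name
```

Loading is then just opening the file whose name matches the slug in the URL, with no decoding step at all.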