Could anyone recommend a preferred algorithm to use for URL shortening? I'm coding using PHP. Initially I thought about writing something that would start at a character such as "a" and iterate through requests, creating records in a database and therefore having to increment the character to b, c, d ... A, B and so on as appropriate.
However it dawned on me that this algorithm could be pretty heavy/clumsy and there could be a better way to do it.
I read around a bit on Google and some people seem to be doing it with base conversion from the database's ID column. This isn't something I'm too familiar with.
Could someone elaborate and explain to me how this would work? A couple of code examples would be great, too.
I obviously don't want a complete solution as I would like to learn by doing it myself, but just an explanation/pseudo-code on how this would work would be excellent.
Most shortening services just use a counter that is incremented with every entry and convert the base from 10 to 64.
An implementation in PHP could look like this:
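A minimal sketch of such an encode/decode pair, following the steps described below. The pack format 'N' (32-bit unsigned, big-endian) is an assumption made here so that the output does not depend on the machine's byte order; it only covers IDs up to 2^32 - 1.

```php
<?php
// Sketch only: the 'N' pack format is an assumption (32-bit unsigned, big-endian).

function encode(int $number): string
{
    $bytes  = pack('N', $number);          // integer -> 4 raw bytes
    $base64 = base64_encode($bytes);       // raw bytes -> standard Base64
    $base64 = rtrim($base64, '=');         // drop the trailing padding
    return strtr($base64, '+/', '-_');     // switch to the URL-safe alphabet
}

function decode(string $key): int
{
    $base64 = strtr($key, '-_', '+/');     // back to the standard alphabet
    $bytes  = base64_decode($base64);      // non-strict mode tolerates missing padding
    $parts  = unpack('N', $bytes);         // unpack() returns a 1-indexed array
    return $parts[1];
}
```

With this format every key is six characters long, because four bytes always encode to six Base64 characters once the padding is removed.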
The encode function takes an integer, converts it into bytes (pack), encodes it with Base64 (base64_encode), trims the trailing = padding (rtrim), and replaces the characters + and / with - and _ respectively (strtr). The decode function is the inverse of encode and does the exact opposite (except for re-adding the trailing padding).
The additional strtr step translates the original Base64 alphabet to the URL- and filename-safe alphabet, since + and / would otherwise need to be percent-encoded.
I used to break the ID down with an algorithm similar to decimal-to-hex conversion, except that it uses 62 characters instead of the 16 that hex uses.
Example: converting ID = 1234567890 gives kv7yl1 as the key.
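One way to get that result is sketched here. The alphabet order (digits, then lowercase, then uppercase) and the least-significant-digit-first output are assumptions, chosen because together they reproduce the kv7yl1 example; the function names are hypothetical.

```php
<?php
// Assumed alphabet: 0-9, a-z, A-Z (62 characters).
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

function base62_encode(int $id): string
{
    if ($id === 0) {
        return '0';
    }
    $key = '';
    while ($id > 0) {
        $key .= BASE62_ALPHABET[$id % 62]; // emit the lowest digit first
        $id = intdiv($id, 62);
    }
    return $key;
}

function base62_decode(string $key): int
{
    $id = 0;
    // The last character is the most significant digit, so walk backwards.
    for ($i = strlen($key) - 1; $i >= 0; $i--) {
        $id = $id * 62 + strpos(BASE62_ALPHABET, $key[$i]);
    }
    return $id;
}

echo base62_encode(1234567890), "\n"; // kv7yl1
```

Reversing the string before returning it would give the more conventional most-significant-first form (1ly7vk for the same ID); either works as long as encode and decode agree.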
Try this method:
It will give you a hash value suitable for a professional URL shortener, e.g. '142ecd53'.
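The eight hex digits in that example suggest a 32-bit hash. A minimal sketch, assuming CRC32 via PHP's hash() function; the choice of 'crc32b' and the function name are assumptions, not necessarily what produced the value above.

```php
<?php
// Assumption: a 32-bit CRC is the hash in question; 'crc32b' is one of
// the algorithms PHP's hash() supports.
function url_hash(string $url): string
{
    return hash('crc32b', $url); // always 8 lowercase hex characters
}

$key = url_hash('https://example.com/some/long/path');
```

Unlike a counter, a 32-bit hash can map two different URLs to the same key, so the key still has to be checked against the database before it is stored.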
You can use the base_convert function to convert the database IDs from base 10 to base 36.
Or you can reuse some of the ideas presented in the comments on the page below:
http://php.net/manual/en/function.base-convert.php
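A minimal sketch of the base_convert approach; the wrapper function names are hypothetical. base_convert works on strings, and intval with a base argument converts the key back.

```php
<?php
// Hypothetical helpers around PHP's built-in base_convert() and intval().
function id_to_key(int $id): string
{
    return base_convert((string) $id, 10, 36); // digits 0-9, then a-z
}

function key_to_id(string $key): int
{
    return intval($key, 36);
}

echo id_to_key(125), "\n"; // 3h
```

Base 36 is the highest base base_convert supports, so keys are somewhat longer than with a base-62 or base-64 alphabet.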
I adopted a "light" solution. On each user request I generate a unique identifier (checking for conflicts in the db) with this Python snippet:
and store it in db.
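The same idea sketched in PHP rather than Python, to match the question. The key length, the alphabet, the retry limit, and the $exists callback (standing in for the database lookup) are all assumptions.

```php
<?php
// Build a random key from an assumed 62-character alphabet.
function random_key(int $length = 6): string
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($alphabet) - 1;
    $key = '';
    for ($i = 0; $i < $length; $i++) {
        $key .= $alphabet[random_int(0, $max)];
    }
    return $key;
}

// $exists is a hypothetical callback that reports whether a key is already
// stored; in a real application it would query the database.
function unique_key(callable $exists, int $length = 6, int $maxTries = 10): string
{
    for ($try = 0; $try < $maxTries; $try++) {
        $key = random_key($length);
        if (!$exists($key)) {
            return $key;
        }
    }
    throw new RuntimeException('Could not find a free key');
}

// Usage with an in-memory array standing in for the table of stored keys.
$taken = ['abc123' => true];
$key = unique_key(function (string $k) use ($taken): bool {
    return isset($taken[$k]);
});
```

A unique index on the key column is still worth having, since two requests could pick the same key between the check and the insert.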
Assuming your PRIMARY KEY is an INT and it auto_increments, the following code will get you going =).
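A minimal sketch of that flow, using an in-memory SQLite database through PDO so it runs on its own. The links table, its columns, and the function names are hypothetical; with MySQL the column would be declared INT AUTO_INCREMENT instead.

```php
<?php
// In-memory SQLite stands in for the real database (requires pdo_sqlite).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE links (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT NOT NULL)');

// Store the URL, then turn the auto-incremented ID into a base-36 key.
function shorten(PDO $db, string $url): string
{
    $stmt = $db->prepare('INSERT INTO links (url) VALUES (?)');
    $stmt->execute([$url]);
    return base_convert($db->lastInsertId(), 10, 36);
}

// Convert the key back to the numeric ID and look the URL up.
function expand(PDO $db, string $key): ?string
{
    $stmt = $db->prepare('SELECT url FROM links WHERE id = ?');
    $stmt->execute([intval($key, 36)]);
    $url = $stmt->fetchColumn();
    return $url === false ? null : $url;
}
```

Nothing besides the URL has to be stored, because the key is derived from the primary key on every lookup.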
EDIT: Included the base_convert from HGF's answer. I forgot to base_convert in the original post.