I'm writing myself a script which basically lets me send a URL and two integer dimensions in the querystring of a single GET request. I'm using base64 to encode it, but it's pretty damn long and I'm concerned the URL may get too big.
Does anyone know an alternative, shorter method of doing this? It needs to be decode-able when received in a get request, so md5/sha1 are not possible.
Thanks for your time.
Edit: Sorry - I should have explained better: Ok, on our site we display screenshots of websites that get posted up for review. We have our own thumbnail/screenshot server. I'm basically going to be having the image tag contain an encoded string that stores the URL to take a screenshot of, and the width/height of the image to show. I don't however want it in 'raw-text' for the world to see. Obviously base64 can be decoded by anyone, but we don't want your average joe picking up the URL path. Really I need to fetch: url, width, height in a single GET request.
It sounds like your goals are 1. to visually obscure a URL, and 2. to generally encode the data compactly for use in a URL.
First, we need to obscure the URL. Since URLs use much of the Base64 dictionary, any encoding that produces binary (that then has to be Base64-ed) will likely just increase the size. It's best to keep the dictionary in the URL-safe range with minimal need for escaping when `urlencode()` is applied.

Now, for saving bytes, we can encode the URL scheme into one char (say, `h` for HTTP, `H` for HTTPS), and convert the dimensions into base 32.

Since we avoided non-URL-safe chars, if this is put in a querystring (with `urlencode`), it doesn't grow much, if at all.

Additionally you might want to sign this string so people who know the encoding still can't specify their own parameters via the URL. For this you'd use HMAC, and Base64URL-encode the hash. You can also just keep a substring of the hash (~6 bits per character) to save space. A `sign()` function can add an 8 character MAC (48 bits of the hash at 6 bits/char).

Update: a better RotURL function.
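As a hedged illustration of the steps above (this is not the RotURL function itself), here is a minimal sketch of how they could fit together. The names `obscure()`, `pack_params()`, `unpack_params()`, `mac()`, `verify()` and the `SECRET` constant are made up for this example, and the character swap is just one simple choice of substitution.

```php
<?php
// Hypothetical secret shared by the page generator and the screenshot server.
const SECRET = 'change-me';

// Swap '/' with '_' (so the common '/' is not escaped by urlencode()) and
// rotate letters with str_rot13(). Applying it twice restores the input.
function obscure(string $s): string {
    return str_rot13(strtr($s, '/_', '_/'));
}

// One char for the scheme, base-32 dimensions, then the obscured rest of the URL.
function pack_params(string $url, int $w, int $h): string {
    if (strpos($url, 'https://') === 0) {
        $scheme = 'H';
        $rest = substr($url, 8);
    } elseif (strpos($url, 'http://') === 0) {
        $scheme = 'h';
        $rest = substr($url, 7);
    } else {
        throw new InvalidArgumentException('Only http and https are handled in this sketch');
    }
    return $scheme
        . base_convert((string) $w, 10, 32) . '-'
        . base_convert((string) $h, 10, 32) . '-'
        . obscure($rest);
}

// Inverse of pack_params(); assumes well-formed input (verify the MAC first).
function unpack_params(string $packed): array {
    $scheme = $packed[0] === 'H' ? 'https://' : 'http://';
    [$w, $h, $rest] = explode('-', substr($packed, 1), 3);
    return [$scheme . obscure($rest), (int) base_convert($w, 32, 10), (int) base_convert($h, 32, 10)];
}

// First 8 Base64URL characters (48 bits) of an HMAC-SHA256 over the packed string.
function mac(string $packed): string {
    $raw = hash_hmac('sha256', $packed, SECRET, true);
    return substr(strtr(base64_encode($raw), '+/', '-_'), 0, 8);
}

// Prefix the packed string with its MAC.
function sign(string $packed): string {
    return mac($packed) . $packed;
}

// Return the packed string if the MAC matches, otherwise null.
function verify(string $signed): ?string {
    if (strlen($signed) < 9) {
        return null;
    }
    $packed = substr($signed, 8);
    return hash_equals(mac($packed), substr($signed, 0, 8)) ? $packed : null;
}
```

On the receiving side you would call `verify()` first and only pass a non-null result to `unpack_params()`. Truncating the MAC to 48 bits trades some forgery resistance for a shorter URL, so how many characters to keep is a judgment call.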
Since you are only using base64 to obfuscate the string, you could just obfuscate it with something else, like rot13 (or your own simple letter substitution function). So, `urlencode(str_rot13($str))` to encode and `str_rot13(urldecode($str))` to decode.

Or, to just have a shorter base64-encoded string, you could compress the string before base64 encoding it: `base64_encode(gzencode($str, 9))` to encode and `gzdecode(base64_decode($str))` to decode. (For short strings the gzip header overhead can cancel out the savings, so measure before relying on this.)

Or, if this is primarily a security issue (you don't mind people seeing the URL, you just want to keep people from hacking it) you could pass these parameters with normal querystring variables, but with a hash appended to prevent tampering.
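A minimal sketch of the hash-appended variant might look like the following. It uses HMAC-SHA256 via `hash_hmac()` as the hash; the parameter names (`url`, `w`, `h`, `hash`), the function names and the `SECRET_KEY` constant are assumptions made for the example.

```php
<?php
// Hypothetical key known only to the servers that build and check the URLs.
const SECRET_KEY = 'change-me';

// Build a plain querystring and append an HMAC of it as the last parameter.
function signed_query(string $url, int $w, int $h): string {
    $query = http_build_query(['url' => $url, 'w' => $w, 'h' => $h]);
    return $query . '&hash=' . hash_hmac('sha256', $query, SECRET_KEY);
}

// Recompute the HMAC over everything before "&hash=" and compare in constant time.
function is_untampered(string $queryString): bool {
    $pos = strrpos($queryString, '&hash=');
    if ($pos === false) {
        return false;
    }
    $query = substr($queryString, 0, $pos);
    $hash = substr($queryString, $pos + strlen('&hash='));
    return hash_equals(hash_hmac('sha256', $query, SECRET_KEY), $hash);
}
```

The receiving script would check `is_untampered($_SERVER['QUERY_STRING'])` before trusting the parameters. The full hex digest adds 64 characters; it could be shortened at the cost of weaker tamper resistance.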
(Off topic: People are saying you should use POST instead of GET. If all these URLs are doing is fetching screenshots from your database to display (i.e. a search lookup), then GET is fine and correct. But if calling these URLs is actually performing an action like going to another site, making and storing the screenshot, then that's a POST. As their names suggest, GET is for retrieval; POST is for submitting data. If you were to use GET on an expensive operation like making the screenshot, you could end up DOSing your own site when Google etc. index these URLs.)
Just don't `base64_encode($whole_file)`. Send the content in chunks and encode the chunks. Also, if you must know how much bigger your chunk can get after a call to `base64_encode()`, it will grow by about a third: the output is `4 * ceil(strlen($chunk) / 3)` bytes.

URLs are not meant to be sending long strings of data, encoded or not encoded. After a certain point, when you're dealing with such large amounts of data being sent through the URL you should just start using POST or some form of local storage. FYI, IE has a URL limit of 2083 characters.
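To put rough numbers on Base64 growth and URL length, here is a small sketch. The helper names and the default limit parameter are assumptions for the example (2083 is the commonly documented Internet Explorer maximum), and it ignores any extra escaping of `+`, `/` and `=` by `urlencode()`.

```php
<?php
// Length of base64_encode() output for an input of $bytes bytes: 4 * ceil($bytes / 3).
function base64_length(int $bytes): int {
    return 4 * intdiv($bytes + 2, 3);
}

// Would $baseUrl plus a Base64 payload of $payloadBytes raw bytes stay within $limit?
function fits_in_url(string $baseUrl, int $payloadBytes, int $limit = 2083): bool {
    return strlen($baseUrl) + base64_length($payloadBytes) <= $limit;
}
```

For instance, 1500 raw bytes become 2000 Base64 characters, which leaves very little room for the rest of the URL.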
EDIT: I don't understand one thing. Why aren't you caching the screenshots? It seems awfully resource intensive to have to take a new screenshot every time somebody views a page with an IMG link to that URL.
Maybe your audience is small, and resources are not an issue. However, if it is the opposite and this is in fact a public website, that will not scale very well. I know I'm going beyond what your original question asked, but this will solve your question and more.
As soon as the website is posted up, store the URL in some sort of local storage, preferably in SQL. I am going to continue this example as if you choose SQL, but of course your implementation is your choice. I would have a primary key, url field, and last_updated timestamp, and optionally an image thumbnail path.
By utilizing local storage, you can now pull the image off a cached copy stored locally on the server every time the page with the thumbnail is requested. A significant amount of resources is saved, and since chances are that those websites aren't going to be updated very often, you can have a cron job or a script that runs every x amount of time to refresh the screenshots in the entire database. Now, all you have to do is directly link (again this depends on your implementation) to the image and none of this huge url string stuff will happen.
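As a rough sketch of the refresh logic a cron job could use, the following works on plain arrays standing in for database rows. The row keys (`id`, `url`, `last_updated`, `thumb_path`), the function names and the one-day default age are assumptions for the example, not a prescribed schema.

```php
<?php
// A row needs a new screenshot if it has no thumbnail yet or the last one is too old.
// $row['last_updated'] is assumed to be a Unix timestamp.
function needs_refresh(array $row, int $now, int $maxAge = 86400): bool {
    return $row['thumb_path'] === null || ($now - $row['last_updated']) > $maxAge;
}

// Pick out the rows a periodic job should re-screenshot.
function rows_to_refresh(array $rows, int $now, int $maxAge = 86400): array {
    return array_values(array_filter($rows, function (array $row) use ($now, $maxAge): bool {
        return needs_refresh($row, $now, $maxAge);
    }));
}
```

The page itself would then link straight to the stored `thumb_path`, so the image URL only needs the record's thumbnail file name rather than an encoded target URL.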
OR, just take the easy way and do it client side with http://www.snap.com/
You can still use POST for what you describe, assuming I understood you correctly; I may not have.
I'm guessing you're doing something like this:
instead do something like this:
Is the script that generates the URLs running on a different server from the script that interprets them? If they're on the same server, the obvious approach would be to store the target URL, width, and height in a database, and simply pass a randomly-generated record identifier in the query string.
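A minimal sketch of that record-identifier approach, with an in-memory array standing in for the database table. The function names and the 16-hex-character ID length are assumptions for the example.

```php
<?php
// Store the parameters under a random, unguessable ID and return that ID.
// $table is a stand-in for a real database table.
function store_request(array &$table, string $url, int $width, int $height): string {
    do {
        $id = bin2hex(random_bytes(8)); // 16 hex characters
    } while (isset($table[$id]));
    $table[$id] = ['url' => $url, 'width' => $width, 'height' => $height];
    return $id;
}

// Look the parameters up again; null means the ID is unknown.
function fetch_request(array $table, string $id): ?array {
    return $table[$id] ?? null;
}
```

The image tag would then carry only something like `?id=<the returned ID>`, which keeps the URL short and reveals nothing about the target URL or the dimensions.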