how to shorten the url in a mathematical approach

2020-04-11 11:37发布

I am looking into the possibility of shortening / encrypting a url which will be from 150 to 250 characters in length to maximum 12 characters using an algorithm. Initially I am doubting it's possible but I wanted to leverage the great minds of StackOverflow :)

the algorithm should be one that can be written in classic asp, sql, c#, vb, or foxpro or other language.

Is that even possible without a database centric approach?

I was reading here that AES in CFB mode, will do a stream cipher and the output length will be the same as the input length. Is there any way to shorten it even more?

4条回答
成全新的幸福
2楼-- · 2020-04-11 11:46

Sorry, but no way for that. In this case shorten means losing the unique information. Maybe you could generate unique key(hash) for every string, but it will not help you unpack the data, unless there is no dictionary(static information) provided.

Check how ZIP or RAR is working, for example

查看更多
迷人小祖宗
3楼-- · 2020-04-11 11:58

Is your goal primarily to shorten or to encrypt? You could probably specialize a compression algorithm enough to store a known-character-set URL dramatically, but that would not be effective for encryption purposes. I strongly doubt you could get a cryptographically sound encryption algorithm to achieve the specified level of compression, not to mention you're not discussing the allowable key lengths which might be tied into your scheme.

查看更多
爱情/是我丢掉的垃圾
4楼-- · 2020-04-11 12:08

You did not mention if you are looking to only reduce the URL length or use the shortened URL for something (like tinyurl does). Is that your intent? If that is the case, then you can create the hash for the URL and use that hash internally to map to the actual URL. Then your choice of short length URLs depends on hashing algorithms. Based on your intent you can chose one of the options suggested in the replies.

查看更多
甜甜的少女心
5楼-- · 2020-04-11 12:10

The answer, as always, is "it depends". There's a mathematical theory that talks about the "information content" of a bunch of data. If your data is originally strings like this:

lleAgByD2rREjzqj85g68207NsjspdINfPRNvU9udgWw7y4qXh0EQLSy0yEi2

then the information content is much greater than if your strings look like this:

one zero one one zero one zero zero one zero one one zero one

even though the strings are actually the same length. Using compression, you can reduce the number of bits necessary to express the same meaning, but only down to a point. That point depends on the information content of the original message.

It seems unlikely to me that your string of 150 to 250 characters has so little information content that it could be effectively compressed down to 12 characters. You may have to store the longer data in the database and assign a shorter "key" to each data item.

For further reading, one place to start is the Wikipedia article on Information theory.

查看更多
登录 后发表回答