JavaScript string compression with localStorage

Posted 2019-01-31 13:27

I am using localStorage in a project, and it will need to store lots of data, mostly of type int, bool and string. I know that JavaScript strings are unicode, but when stored in localStorage, do they stay unicode? If so, is there a way I could compress the string to use all of the data in a unicode byte, or should I just use base64 and have less compression? All of the data will be stored as one large string.

EDIT: Now that I think about it, base64 wouldn't do much compression at all. The data is already roughly in base 64: a-zA-Z0-9 plus space, ; and : is 65 characters.

5 answers
老娘就宠你
#2 · 2019-01-31 13:54

This Stack Overflow question has an answer that may help. It links to a JavaScript compression library.

The star"
#3 · 2019-01-31 14:03

"when stored in localStorage, do they stay unicode?"

The Web Storage working draft defines local storage values as DOMString. DOMStrings are defined as sequences of 16-bit units using the UTF-16 encoding. So yes, they stay Unicode.
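To see what "16-bit units" means in practice, here is a small sketch (the variable names are only for illustration). Both length and charCodeAt work on UTF-16 code units, so a character outside the Basic Multilingual Plane counts as two units:

```javascript
var ascii = 'A';             // U+0041, one 16-bit code unit
var euro = '\u20AC';         // U+20AC, still one code unit
var emoji = '\uD83D\uDE00';  // U+1F600, stored as a surrogate pair

console.log(ascii.length, euro.length, emoji.length);  // 1 1 2
console.log(emoji.charCodeAt(0).toString(16));         // d83d
console.log(emoji.charCodeAt(1).toString(16));         // de00
```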

"is there a way I could compress the string to use all of the data in a unicode byte...?"

"Base32k" encoding should give you 15 bits per character. A base32k-type encoding takes advantage of the full 16 bits in UTF-16 characters, but loses a bit to avoid tripping on double-word characters. If your original data is base64 encoded, it only uses 6 bits per character. Encoding those 6 bits into base32k should compress it to 6/15 = 40% of its original size. See http://lists.xml.org/archives/xml-dev/200307/msg00505.html and http://lists.xml.org/archives/xml-dev/200307/msg00507.html.

For even further reduction in size, you can decode your base64 strings into their full 8-bit binary, compress them with some known compression algorithm (e.g. see a JavaScript implementation of gzip), and then base32k encode the compressed output.
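Here is a rough sketch of the 15-bits-per-character idea. It is not the exact base32k scheme from the linked posts: the function names, the 0x4000 offset and the two-character length prefix are all my own assumptions. The offset maps every 15-bit value into the range 0x4000-0xBFFF, which lies below the surrogate range 0xD800-0xDFFF.

```javascript
// Assumed offset: 15-bit values land in 0x4000-0xBFFF, clear of surrogates.
var OFFSET = 0x4000;

// Pack an array of byte values (0-255) into a string, 15 bits per character.
function packBytes(bytes) {
    // Two leading characters hold the byte count (15 bits each).
    var out = [
        String.fromCharCode(OFFSET + ((bytes.length >> 15) & 0x7fff)),
        String.fromCharCode(OFFSET + (bytes.length & 0x7fff))
    ];
    var acc = 0;   // bit accumulator
    var bits = 0;  // number of valid bits in acc
    for (var i = 0; i < bytes.length; i++) {
        acc = (acc << 8) | (bytes[i] & 0xff);
        bits += 8;
        if (bits >= 15) {
            bits -= 15;
            out.push(String.fromCharCode(OFFSET + ((acc >> bits) & 0x7fff)));
            acc &= (1 << bits) - 1;  // keep only the leftover bits
        }
    }
    if (bits > 0) {
        // Left-align the leftover bits in one final character.
        out.push(String.fromCharCode(OFFSET + ((acc << (15 - bits)) & 0x7fff)));
    }
    return out.join('');
}

// Reverse of packBytes: returns the original array of byte values.
function unpackBytes(str) {
    var length = ((str.charCodeAt(0) - OFFSET) << 15) | (str.charCodeAt(1) - OFFSET);
    var bytes = [];
    var acc = 0;
    var bits = 0;
    for (var i = 2; i < str.length && bytes.length < length; i++) {
        acc = (acc << 15) | (str.charCodeAt(i) - OFFSET);
        bits += 15;
        while (bits >= 8 && bytes.length < length) {
            bits -= 8;
            bytes.push((acc >> bits) & 0xff);
        }
        acc &= (1 << bits) - 1;
    }
    return bytes;
}
```

With this layout, 5 bytes become 2 length characters plus 3 data characters. The fixed length prefix only pays off on larger inputs, and the sketch assumes fewer than 2^30 bytes.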

#4 · 2019-01-31 14:03

I recently had to save huge JSON objects in localStorage.

Firstly, yeah, they do stay Unicode. But don't try to save something like an object straight to local storage. It needs to be a string, so serialize it first (for example with JSON.stringify).

Here are some compression techniques I used (that seemed to work well in my case), before converting my object to a string:

Any number can be converted from base 10 to base 36 with something like (+num).toString(36). For example, the number 48346942 becomes "ss8qm", which is (including the quotes) 1 character shorter. For small numbers the quotes can actually add to the character count, so the larger the number, the better the payoff. To convert it back, use something like parseInt("ss8qm", 36).
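A minimal sketch of that conversion, assuming the values are non-negative integers (the helper names encodeNumber and decodeNumber are made up for this example):

```javascript
// Convert a base-10 integer to a base-36 string.
function encodeNumber(num) {
    return (+num).toString(36);
}

// Convert a base-36 string back to a number.
function decodeNumber(str) {
    return parseInt(str, 36);
}

console.log(encodeNumber(48346942));  // ss8qm
console.log(decodeNumber('ss8qm'));   // 48346942
```

Note that parseInt drops any fractional part, so this round trip is only safe for integers.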

If you are storing an object with any key that will repeat, it's best to create a lookup object where you assign a shortened key to the original. So, for the sake of example, if you have:

{
    name: 'Frank',
    age: 36,
    family: [{
        name: 'Luke',
        age: 14,
        relation: 'cousin'
    }, {
        name: 'Sarah',
        age: 22,
        relation: 'sister'
    }, {
        name: 'Trish',
        age: 31,
        relation: 'wife'
    }]
}

Then you could make it:

{
    // original w/ shortened keys
    o: {    
        n: 'Frank',
        a: 36,
        f: [{
            n: 'Luke',
            a: 14,
            r: 'cousin'
        }, {
            n: 'Sarah',
            a: 22,
            r: 'sister'
        }, {
            n: 'Trish',
            a: 31,
            r: 'wife'
        }]
    },

    // lookup
    l: {
        n: 'name',
        a: 'age',
        r: 'relation',
        f: 'family'
    }
}

Again, this pays off with size and repetition. In my case it worked really well, but it depends on the data.

All of these require a function to shrink and one to expand back out.
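For the key-shortening approach, a shrink/expand pair could look roughly like this. It is only a sketch: LOOKUP, renameKeys, shrink and expand are hypothetical names, and it assumes no original key collides with a shortened one.

```javascript
// Hypothetical lookup table: short key -> original key.
var LOOKUP = { n: 'name', a: 'age', r: 'relation', f: 'family' };

// Swap the keys and values of a flat map.
function invert(map) {
    var result = {};
    Object.keys(map).forEach(function (key) {
        result[map[key]] = key;
    });
    return result;
}

// Recursively rename object keys according to map; unknown keys are kept.
function renameKeys(value, map) {
    if (Array.isArray(value)) {
        return value.map(function (item) { return renameKeys(item, map); });
    }
    if (value !== null && typeof value === 'object') {
        var result = {};
        Object.keys(value).forEach(function (key) {
            var newKey = Object.prototype.hasOwnProperty.call(map, key) ? map[key] : key;
            result[newKey] = renameKeys(value[key], map);
        });
        return result;
    }
    return value;
}

// Shorten the keys and serialize, storing the lookup alongside the data.
function shrink(obj) {
    return JSON.stringify({ o: renameKeys(obj, invert(LOOKUP)), l: LOOKUP });
}

// Parse the stored string and restore the original keys.
function expand(str) {
    var parsed = JSON.parse(str);
    return renameKeys(parsed.o, parsed.l);
}
```

Storing the lookup inside the string costs a few dozen characters, so this only wins once the keys repeat often enough.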

Also, I would recommend creating a class that is used to store and retrieve data from local storage. I ran into there not being enough space, so the writes would fail. Other sites may also write to local storage, which can take away some of that space. See this post for more details.

In the class I built, I first attempt to remove any item with the given key, then attempt the setItem. These two lines are wrapped in a try/catch. If that fails, the class assumes the storage is full and clears everything in localStorage to make room. After the clear, it attempts setItem again. This, too, is wrapped in a try/catch, since it may fail if the string itself is larger than what localStorage can handle.
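Roughly, that logic could be sketched as follows. This is not my original class: SafeStore and its methods are hypothetical names, and the storage object is passed in (in a browser you would pass window.localStorage) so the code can be tried with a stand-in.

```javascript
// Hypothetical wrapper around a Storage-like object
// (anything with getItem, setItem, removeItem and clear).
function SafeStore(storage) {
    this.storage = storage;
}

// Returns true if the value was written, false if it could not be stored.
SafeStore.prototype.set = function (key, value) {
    try {
        // Remove any old value first, then write the new one.
        this.storage.removeItem(key);
        this.storage.setItem(key, value);
        return true;
    } catch (e) {
        // Assume the storage is full and fall through to the retry.
    }
    try {
        // Clear everything to make room, then try once more.
        this.storage.clear();
        this.storage.setItem(key, value);
        return true;
    } catch (e2) {
        // The string itself is probably larger than the storage allows.
        return false;
    }
};

SafeStore.prototype.get = function (key) {
    return this.storage.getItem(key);
};

// Browser usage (not run here): var store = new SafeStore(window.localStorage);
```

Keep in mind that clear() wipes every key for the origin, so this is only appropriate when nothing else stored there matters.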

EDIT: Also, you will come across the LZW compression a lot of people mention. I implemented that, and it worked for small strings. But with large strings it would begin using invalid characters, which resulted in corrupt data. So just be careful, and if you go in that direction: test, test, test.
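I never pinned down the exact cause of the corruption. One possible cause with LZW snippets that write dictionary codes directly as characters is that, once the dictionary grows past 0xD7FF, the output starts to contain lone surrogates. The sketch below is a variant built on that assumption, not a diagnosis: it simply stops adding dictionary entries at that point. It also assumes every input character has a code below 256, and the names lzwCompress, lzwDecompress and MAX_CODE are made up here.

```javascript
// Assumed cap: codes stay below the surrogate range (0xD800-0xDFFF).
var MAX_CODE = 0xD800;

function lzwCompress(input) {
    var dict = new Map();
    for (var i = 0; i < 256; i++) { dict.set(String.fromCharCode(i), i); }
    var next = 256;  // next free dictionary code
    var w = '';      // longest match so far
    var out = [];
    for (var j = 0; j < input.length; j++) {
        var c = input.charAt(j);
        if (c.charCodeAt(0) > 255) { throw new Error('input must be 8-bit characters'); }
        var wc = w + c;
        if (dict.has(wc)) {
            w = wc;
        } else {
            out.push(String.fromCharCode(dict.get(w)));
            if (next < MAX_CODE) { dict.set(wc, next++); }  // stop growing at the cap
            w = c;
        }
    }
    if (w !== '') { out.push(String.fromCharCode(dict.get(w))); }
    return out.join('');
}

function lzwDecompress(compressed) {
    if (compressed === '') { return ''; }
    var dict = [];
    for (var i = 0; i < 256; i++) { dict[i] = String.fromCharCode(i); }
    var next = 256;
    var w = dict[compressed.charCodeAt(0)];
    var out = [w];
    for (var j = 1; j < compressed.length; j++) {
        var k = compressed.charCodeAt(j);
        var entry;
        if (k < next) {
            entry = dict[k];
        } else if (k === next) {
            entry = w + w.charAt(0);  // code the compressor created one step earlier
        } else {
            throw new Error('corrupt input');
        }
        out.push(entry);
        if (next < MAX_CODE) { dict[next++] = w + entry.charAt(0); }
        w = entry;
    }
    return out.join('');
}
```

Whatever implementation you use, round-trip it on strings as large as your real data before trusting it.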

再贱就再见
#5 · 2019-01-31 14:07

You could encode to Base64 and then implement a simple lossless compression algorithm, such as run-length encoding or Golomb encoding. This shouldn't be too hard to do and might give you a bit of compression.
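As an illustration of run-length encoding, here is one possible sketch. The output format is my own choice, not a standard: each run becomes the character followed by its length as a single base-36 digit, so runs longer than 35 are split.

```javascript
// Longest run one base-36 digit can express ('z' === 35).
var MAX_RUN = 35;

// Encode each run as <character><base-36 run length>.
function rleEncode(input) {
    var out = '';
    var i = 0;
    while (i < input.length) {
        var ch = input.charAt(i);
        var run = 1;
        while (run < MAX_RUN && input.charAt(i + run) === ch) {
            run++;
        }
        out += ch + run.toString(36);
        i += run;
    }
    return out;
}

// Decode fixed-width <character><count> pairs back to the original string.
function rleDecode(encoded) {
    var out = '';
    for (var i = 0; i < encoded.length; i += 2) {
        var count = parseInt(encoded.charAt(i + 1), 36);
        out += encoded.charAt(i).repeat(count);
    }
    return out;
}

console.log(rleEncode('aaaabbbcc'));  // a4b3c2
console.log(rleDecode('a4b3c2'));     // aaaabbbcc
```

This only helps when the data has long runs of the same character. On text without runs it doubles the length, so measure before using it.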

Golomb encoding

I also found JsZip. I guess you could check the code and only use the algorithm, if it is compatible.

Hope this helps.

http://jszip.stuartk.co.uk/

姐就是有狂的资本
#6 · 2019-01-31 14:17

Base64 compression for JavaScript is very well explained at this blog. An implementation is also available here if you are using the entire framework.
