How can I estimate the disk size of a string with

Posted 2019-04-24 06:41

Question:

I need to estimate the on-disk size of a text string (which could be raw text or a Base64-encoded string for an image, audio, etc.) in JavaScript, and I'm not sure how. The only thing I can find when Googling is .length, so I thought someone on StackOverflow might know.

The reason I need to know is that I have a localStorage script that needs (or would love to have) the ability to check when a user is nearing their 5MB (or 10MB in IE) quota and prompt them to increase the maximum size for the domain. So if a user hits, let's say, 4.5MB of data, it would prompt with:

You're nearing your browser's 5MB data cap. Please increase your max data by... [instructions on increasing it for the browser]

Answer 1:

It's not going to be exact, but you can count the number of UTF-8 bytes in a string to get a rough estimate.

function bytes(string) {
    // encodeURI writes every escaped UTF-8 byte as "%XX" (3 characters per byte).
    var escaped_string = encodeURI(string);
    // Each "%" marks one escaped byte.
    var escaped_bytes = escaped_string.split("%").length - 1;
    // An escape is 3 characters but only 1 byte, so subtract 2 per escape.
    return escaped_string.length - (escaped_bytes * 2);
}

var mystring = 'tâ';
alert(bytes(mystring)); // 3: "t" is 1 byte and "â" is 2 bytes in UTF-8



Answer 2:

It depends on your character encoding. If you use ASCII, it's going to be str.length bytes. If you use UTF-16, it's going to be (str.length * 2) bytes, because str.length counts UTF-16 code units. If you use UTF-8, it depends on the characters in the string: some take only 1 byte, while others take up to 4. If you're dealing with Base64-encoded data, the characters are all within the ASCII range and therefore occupy str.length bytes on disk. If you decode the data first and save it as binary, it takes roughly (str.length * 3/4) bytes, slightly less when the string ends in "=" padding. (With Base64, every 3 unencoded bytes become 4 encoded characters.)
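As a rough illustration of the encoding arithmetic above, here is a minimal sketch that compares the UTF-16 and UTF-8 sizes of the same string. The helper names utf16ByteLength and utf8ByteLength are made up for this example; TextEncoder is a standard API in modern browsers and Node.js, and it always encodes to UTF-8.

```javascript
// Hypothetical helper: size if every UTF-16 code unit is stored as 2 bytes.
function utf16ByteLength(str) {
  // str.length counts UTF-16 code units, not characters.
  return str.length * 2;
}

// Hypothetical helper: size when the string is encoded as UTF-8.
function utf8ByteLength(str) {
  // TextEncoder.encode returns a Uint8Array of UTF-8 bytes.
  return new TextEncoder().encode(str).length;
}

const sample = 't\u00e2'; // "tâ"
console.log(utf16ByteLength(sample)); // 4
console.log(utf8ByteLength(sample));  // 3
```

Which of the two numbers is closer to the real on-disk size depends on how the browser stores the data, which is why either figure should be treated as an estimate.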

BTW - If you haven't read Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), you should do so immediately.

http://www.joelonsoftware.com/articles/Unicode.html

UPDATE: If you're using localStorage, I assume that you're familiar with window.localStorage.length, though this only tells you how many items are stored, not how many bytes they use or whether your new data will fit. I would also highly recommend reading Dive into HTML5, especially the section on storage:

http://diveintohtml5.ep.io/storage.html

Unless something has changed since its writing, I'm not sure what you can do as localStorage limits you to 5MB per domain with no way for the user to increase it.
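For the quota check the question asks about, one possible approach is to add up the lengths of all stored keys and values. The sketch below is only an approximation: it assumes the browser charges 2 bytes per UTF-16 code unit for keys and values, which no specification guarantees. The names estimateStorageBytes, isNearQuota and makeFakeStorage are hypothetical; only length, key() and getItem() come from the standard Storage interface.

```javascript
// Assumption: 2 bytes per UTF-16 code unit for both keys and values.
function estimateStorageBytes(storage) {
  let total = 0;
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    const value = storage.getItem(key) || '';
    total += (key.length + value.length) * 2;
  }
  return total;
}

// Hypothetical helper: true once usage reaches `fraction` of the assumed quota.
function isNearQuota(storage, quotaBytes, fraction) {
  return estimateStorageBytes(storage) >= quotaBytes * fraction;
}

// Minimal in-memory stand-in so the sketch runs outside a browser;
// in a page you would pass window.localStorage instead.
function makeFakeStorage(entries) {
  const keys = Object.keys(entries);
  return {
    length: keys.length,
    key: (i) => keys[i],
    getItem: (k) => (k in entries ? entries[k] : null),
  };
}

const demo = makeFakeStorage({ name: 't\u00e2', note: 'hello' });
console.log(estimateStorageBytes(demo));               // 30
console.log(isNearQuota(demo, 5 * 1024 * 1024, 0.9));  // false
```

With a 5MB quota and a fraction of 0.9, the check fires at 4.5MB, matching the threshold mentioned in the question.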



Answer 3:

If you are talking about memory usage, then no. There is no way of reliably determining the used memory (at least implementation-independently), since this is not part of the ECMAScript spec. It depends on your character encoding.



Answer 4:

It depends on the data in your string and the way it is stored. If your Base64-encoded string is stored as a Base64-encoded string, its length is the same as its size on disk. If not, you have to decode it.

I found a solution (although it seems a bit icky) here:

function checkLength() {
    var countMe = document.getElementById("someText").value;
    // encodeURI writes every escaped UTF-8 byte as "%XX"
    var escapedStr = encodeURI(countMe);
    var count;
    if (escapedStr.indexOf("%") != -1) {
        // number of "%XX" escapes, one per escaped byte
        count = escapedStr.split("%").length - 1;
        if (count == 0) count++;  // perverse case; can't happen with real UTF-8
        // characters outside the escapes take 1 byte each
        var tmp = escapedStr.length - (count * 3);
        count = count + tmp;
    } else {
        count = escapedStr.length;
    }
    alert(escapedStr + ": size is " + count);
}


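Answer 5:

For a Base64-encoded data URL, you can estimate the size of the decoded data in bytes in this simple way:

// `string` is assumed to hold a full PNG data URL that starts with the prefix below
var head = 'data:image/png;base64,';
// Every 4 Base64 characters decode to 3 bytes; trailing "=" padding makes this a slight overestimate
var imgFileSize = Math.round((string.length - head.length) * 3 / 4);

console.log("size is ", imgFileSize);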
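If the 3/4 estimate needs to be exact, the trailing "=" padding can be subtracted. The following is a minimal sketch under a few assumptions: the input is either a data URL or a bare Base64 string, it uses standard padding, and it contains no whitespace or line breaks. The function name base64DecodedSize is made up for this example.

```javascript
// Hypothetical helper: decoded byte count of a padded Base64 string or data URL.
function base64DecodedSize(dataUrl) {
  // Everything after the first comma is the Base64 payload;
  // with no comma, indexOf returns -1 and the whole string is used.
  const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  // Each trailing "=" stands for one byte that is not in the decoded data.
  let padding = 0;
  if (base64.endsWith('==')) padding = 2;
  else if (base64.endsWith('=')) padding = 1;
  return (base64.length * 3) / 4 - padding;
}

console.log(base64DecodedSize('data:text/plain;base64,aGVsbG8=')); // 5 ("hello")
console.log(base64DecodedSize('YQ=='));                            // 1 ("a")
```

Slicing at the first comma avoids hard-coding a prefix such as data:image/png;base64, so the same function covers other media types.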