How many bytes in a JavaScript string?

I have a javascript string which is about 500K when being sent from the server in UTF-8. How can I tell its size in JavaScript?

I know that JavaScript uses UCS-2, so does that mean 2 bytes per character. However, does it depend on the JavaScript implementation? Or on the page encoding or maybe content-type?

标签： javascript string size byte

12条回答

Rolldiameter

2楼-- · 2019-01-04 09:05

A single element in a JavaScript String is considered to be a single UTF-16 code unit. That is to say, Strings characters are stored in 16-bit (1 code unit), and 16-bit is equal to 2 bytes (8-bit = 1 byte).

The charCodeAt() method can be used to return an integer between 0 and 65535 representing the UTF-16 code unit at the given index.

The codePointAt() can be used to return the entire code point value for Unicode characters, e.g. UTF-32.

When a UTF-16 character can't be represented in a single 16-bit code unit, it will have a surrogate pair and therefore use two code units( 2 x 16-bit = 4 bytes)

See Unicode encodings for different encodings and their code ranges.

0人赞添加讨论(0) 举报

Melony?

3楼-- · 2019-01-04 09:11

Try this combination with using unescape js function:

var byteAmount = unescape(encodeURIComponent(yourString)).length

Full encode proccess example:


    var s  = "1 a ф № @ ®"; //length is 11
    var s2 = encodeURIComponent(s); //length is 41
    var s3 = unescape(s2); //length is 15 [1-1,a-1,ф-2,№-3,@-1,®-2]
    var s4 = escape(s3); //length is 39
    var s5 = decodeURIComponent(s4); //length is 11

See aditional screen http://dl.dropbox.com/u/2086213/%3Dcoding%3D/js_utf_byte_length.png (I am a new user, so I can't use the img tag)

0人赞添加讨论(0) 举报

ゆ、 Hurt°

4楼-- · 2019-01-04 09:12

This function will return the byte size of any UTF-8 string you pass to it.

function byteCount(s) {
    return encodeURI(s).split(/%..|./).length - 1;
}

Source

JavaScript engines are free to use UCS-2 or UTF-16 internally. Most engines that I know of use UTF-16, but whatever choice they made, it’s just an implementation detail that won’t affect the language’s characteristics.

The ECMAScript/JavaScript language itself, however, exposes characters according to UCS-2, not UTF-16.

Source

0人赞添加讨论(0) 举报

兄弟一词,经得起流年.

5楼-- · 2019-01-04 09:13

I'm working with an embedded version of the V8 Engine. I've tested a single string. Pushing each step 1000 characters. UTF-8.

First test with single byte (8bit, ANSI) Character "A" (hex: 41). Second test with two byte character (16bit) "Ω" (hex: CE A9) and the third test with three byte character (24bit) "☺" (hex: E2 98 BA).

In all three cases the device prints out of memory at 888 000 characters and using ca. 26 348 kb in RAM.

Result: The characters are not dynamically stored. And not with only 16bit. - Ok, perhaps only for my case (Embedded 128 MB RAM Device, V8 Engine C++/QT) - The character encoding has nothing to do with the size in ram of the javascript engine. E.g. encodingURI, etc. is only useful for highlevel data transmission and storage.

Embedded or not, fact is that the characters are not only stored in 16bit. Unfortunally I've no 100% answer, what Javascript do at low level area. Btw. I've tested the same (first test above) with an array of character "A". Pushed 1000 items every step. (Exactly the same test. Just replaced string to array) And the system bringt out of memory (wanted) after 10 416 KB using and array length of 1 337 000. So, the javascript engine is not simple restricted. It's a kind more complex.

0人赞添加讨论(0) 举报

劳资没心，怎么记你

6楼-- · 2019-01-04 09:17

UTF-8 encodes characters using 1 to 4 bytes per code point. As CMS pointed out in the accepted answer, JavaScript will store each character internally using 16 bits (2 bytes).

If you parse each character in the string via a loop and count the number of bytes used per code point, and then multiply the total count by 2, you should have JavaScript's memory usage in bytes for that UTF-8 encoded string. Perhaps something like this:

      getStringMemorySize = function( _string ) {
        "use strict";

        var codePoint
            , accum = 0
        ;

        for( var stringIndex = 0, endOfString = _string.length; stringIndex < endOfString; stringIndex++ ) {
            codePoint = _string.charCodeAt( stringIndex );

            if( codePoint < 0x100 ) {
                accum += 1;
                continue;
            }

            if( codePoint < 0x10000 ) {
                accum += 2;
                continue;
            }

            if( codePoint < 0x1000000 ) {
                accum += 3;
            } else {
                accum += 4;
            }
        }

        return accum * 2;
    }

Examples:

getStringMemorySize( 'I'    );     //  2
getStringMemorySize( '❤'    );     //  4
getStringMemorySize( '


             
            
                                  
            
            
            
            
            
            倾城　Initia                          
            
             
             7楼-- · 2019-01-04 09:20
             
             
             
                          
             
                                                                          
String values are not implementation dependent, according the ECMA-262 3rd Edition Specification, each character represents a single 16-bit unit of UTF-16 text:


  4.3.16 String Value
  
  A string value is a member of the type String and is a
  finite ordered sequence of zero or
  more 16-bit unsigned integer values.
  
  NOTE Although each value usually
  represents a single 16-bit unit of
  UTF-16 text, the language does not
  place any restrictions or requirements
  on the values except that they be
  16-bit unsigned integers.

    
                                                                    
                                                        
            
              
                查看更多
                
             
              0人赞

                                                     添加讨论(0)

                                                                                                            
                               举报
                
                
                
                  
                


                        
                            

                               
             
                        
               
            

                            
                            
                                 加载中...
                            
                        

                
   1
2
下一页


     
                      登录 后发表回答



   
   
   
  
   相关问题
      
    
    
   
   

     


   
   Is there a limit to how many levels you can nest i   

   



     


   
   How to toggle on Order in ReactJS   

   



     


   
   void before promise syntax   

   



     


   
   Keeping track of variable instances   

   



     


   
   how to split a list into a given number of sub-lis   

   



        
      
    查看全部
   
   
  
   相关文章
 
   
   

     


   
   实时推送的大数据量（通过websocket)，造成页面数据加载比较慢，应该怎么改善？   

     


   
   反爬能检测到JS模拟的键盘输入吗   

     


   
   VUE的v-for中深入响应式原理的问题   

     


   
   做一个留言板，求动态修改数据文件的思路？   

     


   
   vue的data()中的值能否递归调用   

     


   
   浅拷贝的问题   

     


   
   javascript案例隐藏密码--有关元素选取的问题   

     


   
   JavaScript让一个变量影响另一个变量内容   

        
        
    查看全部
                 收藏的人(5)

How many bytes in a JavaScript string?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间