String encoding when constructing a Blob

Posted 2019-07-17 15:17

I know that JavaScript strings are usually encoded with an encoding taking at least two bytes per character (UTF-16 or UCS-2).

However, when constructing a Blob, a different encoding appears to be used: when I read the Blob as an ArrayBuffer, the returned buffer is 3 bytes long for a single Euro sign.

var b = new Blob(['€']);

1 answer
SAY GOODBYE
Reply #2 · 2019-07-17 15:27

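According to the W3C File API specification, the Blob constructor encodes string parts as UTF-8, which is why the Euro sign (U+20AC) takes 3 bytes.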

Demo:

// Create a Blob containing a single Euro sign (U+20AC)
var b = new Blob(['€']);
var fr = new FileReader();

fr.onload = function() {
  // View the ArrayBuffer as unsigned bytes
  var ua = new Uint8Array(fr.result);
  // This will log "3|226|130|172"
  //                  E2  82  AC
  // In UTF-16, it would be only 2 bytes long
  console.log(
    fr.result.byteLength + '|' +
    ua[0] + '|' +
    ua[1] + '|' +
    ua[2]
  );
};
fr.readAsArrayBuffer(b);

Play with that on JSFiddle.
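
For a quick check without FileReader, the same bytes can be produced synchronously with TextEncoder, which always encodes to UTF-8. This is only a minimal sketch, assuming an environment that provides TextEncoder (current browsers and recent Node.js versions); the helper name utf8Bytes is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: encode a string as UTF-8 (the encoding the Blob
// constructor uses for string parts) and return the bytes as a plain array.
function utf8Bytes(str) {
  return Array.from(new TextEncoder().encode(str));
}

var euro = '€';
var bytes = utf8Bytes(euro);

// The string holds one UTF-16 code unit, but UTF-8 needs three bytes.
console.log(euro.length);     // 1
console.log(bytes.length);    // 3
console.log(bytes.join('|')); // 226|130|172
```

The byte count should also match the size property of a Blob built from the same string, so new Blob(['€']).size is another way to see the 3 bytes without reading the Blob.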
