I know that JavaScript strings are usually encoded with an encoding taking at least two bytes per character (UTF-16 or UCS-2).
However, when constructing a Blob, a different encoding appears to be used because when I read it as ArrayBuffer
, the length of the returned buffer is 3 for an Euro sign.
var b = new Blob(['€']);
According to the W3C, it is UTF-8 encoded.
Demo:
// Create a Blob with an Euro-char (U+20AC)
var b = new Blob(['€']);
var fr = new FileReader();
fr.onload = function() {
ua = new Uint8Array(fr.result);
// This will log "3|226|130|172"
// E2 82 AC
// In UTF-16, it would be only 2 bytes long
console.log(
fr.result.byteLength + '|' +
ua[0] + '|' +
ua[1] + '|' +
ua[2] + ''
);
};
fr.readAsArrayBuffer(b);
Play with that on JSFiddle.