Is PHP's json_encode guaranteed to produce ASC

Posted 2019-05-29 01:37

Well, the subject says everything. I'm using json_encode to convert some UTF8 data to JSON and I need to transfer it to some layer that is currently ASCII-only. So I wonder whether I need to make it UTF-8 aware, or can I leave it as it is.

Looking at the JSON RFC, UTF-8 is also a valid charset for JSON output, although not recommended, i.e. some implementations can leave UTF-8 data inside. The question is whether PHP's implementation dumps everything as ASCII or opts to leave something as UTF-8.

Tags: php utf-8 json
3 answers
何必那么认真
#2 · 2019-05-29 02:02

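Unlike JSON support in other languages, json_encode() by default does not generate anything other than ASCII: every non-ASCII character is written as a \uXXXX escape. Since PHP 5.4 the JSON_UNESCAPED_UNICODE flag makes it emit UTF-8 instead, so the output is ASCII-only as long as you do not pass that flag.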
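
A quick way to check this on your own PHP build is to encode a non-ASCII string and look for bytes above 0x7F in the result. This is only a minimal sketch: the sample data is made up, and the JSON_UNESCAPED_UNICODE line assumes PHP 5.4 or later.

```php
<?php
// Sample data (hypothetical): "café" written as explicit UTF-8 bytes.
$data = ['name' => "caf\xC3\xA9"];

// Default flags: the non-ASCII character becomes a \uXXXX escape.
$json = json_encode($data);
echo $json, PHP_EOL; // {"name":"caf\u00e9"}

// True when no byte of the output falls outside the 7-bit ASCII range.
$isAscii = !preg_match('/[^\x00-\x7F]/', $json);
var_dump($isAscii);

// With JSON_UNESCAPED_UNICODE (PHP >= 5.4) the UTF-8 bytes are kept as-is,
// so this output is NOT safe for an ASCII-only layer.
$utf8Json = json_encode($data, JSON_UNESCAPED_UNICODE);
```

So as long as the flag is not passed, the encoded string can go through an ASCII-only layer unchanged.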

家丑人穷心不美
#3 · 2019-05-29 02:17

According to the JSON article in Wikipedia, Unicode characters in strings are always

double-quoted Unicode with backslash escaping

The examples in the PHP Manual on json_encode() seem to confirm this.

So any character outside ASCII should be escaped like this: \u00e9, the escape for é (note, as @Ignacio points out in the comments, that this is the recommended way to deal with those characters, not a required one)

However, json_decode() converts the escapes back to their UTF-8 byte values, so the decoded data is no longer ASCII. You may get in trouble there if the receiving side cannot handle those bytes.
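
To illustrate the decoding side, here is a small sketch (the JSON literal is a made-up example) that decodes an ASCII-only JSON string and dumps the resulting bytes:

```php
<?php
// Pure ASCII on the wire: é is carried as the escape \u00e9.
$json = '{"name":"caf\u00e9"}';

// Decode into an associative array.
$decoded = json_decode($json, true);

// The escape has been turned back into the two UTF-8 bytes of é (c3 a9).
echo bin2hex($decoded['name']), PHP_EOL; // 636166c3a9
echo strlen($decoded['name']), PHP_EOL;  // 5 (bytes, not characters)
```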

If you need to be sure, take a look at iconv(), which can convert your UTF-8 string to ASCII beforehand (dropping or transliterating unsupported characters, depending on whether you append //IGNORE or //TRANSLIT to the target charset).
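
Since the behaviour of iconv() with these suffixes depends on the system's iconv implementation, the sketch below swaps in a plain preg_replace() to drop every non-ASCII byte instead. It is not the iconv() approach itself, only a portable stand-in for the "dropping" variant, and it is lossy: the characters are gone for good.

```php
<?php
// "café" written as explicit UTF-8 bytes (sample input, hypothetical).
$input = "caf\xC3\xA9";

// Remove every byte outside the 7-bit ASCII range.
// No /u modifier: the pattern works byte by byte, so both bytes of é go.
$asciiOnly = preg_replace('/[^\x00-\x7F]/', '', $input);

echo $asciiOnly, PHP_EOL; // caf
```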

手持菜刀,她持情操
#4 · 2019-05-29 02:19

Well, json_encode returns a string. According to the PHP documentation for string:

A string is series of characters. Before PHP 6, a character is the same as a byte. That is, there are exactly 256 different characters possible. This also implies that PHP has no native support of Unicode. See utf8_encode() and utf8_decode() for some basic Unicode functionality.

So for the time being you do not need to worry about making it UTF-8 aware. Of course you still might want to think about this anyway, to future-proof your code.
