When working on a *nix system, I always set the locale to en_US.UTF-8, and this lets Chinese text display correctly on stdout.
But I know there is also a zh_CN.UTF-8 locale, so I want to know:
What is the difference between them?
When should I use zh_CN.UTF-8, and when en_US.UTF-8?
I have zero knowledge about zh itself, but switching between the two locales you mentioned may change how certain characters are treated at word boundaries and how different programs format their output.
For example, LC_CTYPE=zh_CN.UTF-8 will most likely consider characters with accent marks as "being part of a word", whereas LC_CTYPE=en_US.UTF-8 might not consider them part of a word.
The same goes for date and currency formats, as I'm pretty sure zh will have different date and currency formats than en_US.
To give you a concrete example, here is what I get from date(1) with two different locales on a relatively recent Ubuntu GNU/Linux system:
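A comparison like this can be reproduced with a short loop. This is only a minimal sketch: it assumes both locales have been generated on the system (check with locale -a), and show_date is a hypothetical helper name chosen for illustration.

```shell
#!/bin/sh
# show_date (hypothetical helper): print the current time as date(1)
# formats it under the locale given as the first argument.
show_date() {
    printf '%s: ' "$1"
    # LC_ALL overrides every LC_* category, including LC_TIME.
    LC_ALL="$1" date
}

for loc in en_US.UTF-8 zh_CN.UTF-8; do
    show_date "$loc"
done
```

If one of the locales is not installed, date typically falls back to the default C/POSIX format, so the two lines may look identical in that case.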
According to the documentation here:
If two locales both have UTF-8 in their names, they have the same encoding. Their difference resides in locale-dependent settings. For example, the time format, as @Sami Laine has already pointed out, and the monetary sign: in zh_CN.UTF-8 the money sign is ¥, while in en_US.UTF-8 the money sign is $.

More complete list of differences
According to here, for a more complete list of differences between the two locales, run the following script:
The above script should give a more detailed difference between the two locales.
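One possible way to produce such a comparison, offered as a sketch and not necessarily the script referred to above, is to dump the keyword values of a few locale categories with locale -k and diff the results. It assumes both locales are installed and that mktemp and diff are available; dump_locale is a hypothetical helper name.

```shell
#!/bin/sh
# dump_locale (hypothetical helper): print keyword=value pairs for a few
# POSIX locale categories under the locale given as the first argument.
dump_locale() {
    LC_ALL="$1" locale -k LC_TIME LC_MONETARY LC_NUMERIC LC_MESSAGES 2>/dev/null
}

a=$(mktemp) || exit 1
b=$(mktemp) || exit 1
dump_locale en_US.UTF-8 > "$a"
dump_locale zh_CN.UTF-8 > "$b"

# diff exits with status 1 when the files differ, which is the expected
# case here, so that status is deliberately ignored.
diff "$a" "$b" || true
rm -f "$a" "$b"
```

To look at a single setting instead, the same tool accepts individual keywords, for example LC_ALL=zh_CN.UTF-8 locale currency_symbol.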