cmd's character set

Posted 2019-04-29 07:09

Question:

C:\Users\Kolink>php -r "echo 'é';"
Ú

C:\Users\Kolink>echo é
é

As you can see, a program outputting an é results in a Ú, but using the echo command gives the desired character.

Can I configure PHP (perhaps with a command at the start of the script) to output the correct character?

Answer 1:

cmd.exe works in a different codepage than PHP. Incidentally, that codepage also differs from the default Windows codepage; this is for compatibility with older MS-DOS programs. In my country, Windows uses Windows-1250 and cmd.exe uses DOS Latin2. I suppose that in the UK these would be Windows-1252 and DOS Latin1 respectively.
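
The mismatch described above can be reproduced outside the console. The following is a minimal Python sketch, not a statement of what PHP does internally: it assumes the script's é was stored as a Windows-1252 byte and that the console was on codepage 850 (DOS Latin1), which is consistent with the Ú in the question. Character names are printed instead of the characters themselves, so the output does not depend on the console's own codepage.

```python
import unicodedata

# The character from the question, written as an escape so this file's
# own encoding does not matter.
text = "\u00e9"  # é

# Assumption: the program emits the character as Windows-1252 bytes.
raw = text.encode("cp1252")

# Assumption: the console interprets incoming bytes as codepage 850.
shown_by_console = raw.decode("cp850")

# With the console on the same codepage as the program, the byte
# decodes back to the original character.
shown_when_matched = raw.decode("cp1252")

print("bytes sent:     ", raw.hex())
print("console on 850: ", unicodedata.name(shown_by_console))
print("console on 1252:", unicodedata.name(shown_when_matched))
```

The same single byte is read as a different character depending only on which codepage the reader assumes, which is why aligning the two codepages fixes the output.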

To get the same results, PHP and cmd.exe have to use the same codepage. Check which codepage PHP is using and set cmd.exe to that codepage with either of the following commands: mode con cp select=<codepagenumber> or chcp <codepagenumber>. Either one changes the codepage only for the current instance of cmd.exe.

Here is a short list of some typical codepages and their numbers:

DOS Latin1    850
DOS Latin2    852
Windows-1250  1250
Windows-1252  1252
UTF-8         65001
ISO-8859-1    28591
ISO-8859-2    28592
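
To see how the codepages in this list differ for the character from the question, here is a small Python sketch. The codec names paired with each number are assumptions about which Python codec corresponds to which Windows codepage identifier, and CODEPAGES and bytes_for are names made up for this illustration.

```python
# Codepage numbers from the list above, paired with the Python codec
# assumed to correspond to each one.
CODEPAGES = {
    850: "cp850",         # DOS Latin1
    852: "cp852",         # DOS Latin2
    1250: "cp1250",       # Windows-1250
    1252: "cp1252",       # Windows-1252
    65001: "utf-8",       # UTF-8
    28591: "iso-8859-1",  # ISO-8859-1
    28592: "iso-8859-2",  # ISO-8859-2
}


def bytes_for(char, codepage):
    """Return the hex form of the bytes that encode char in the given codepage."""
    return char.encode(CODEPAGES[codepage]).hex()


# Encode é (U+00E9) under every listed codepage.
table = {cp: bytes_for("\u00e9", cp) for cp in CODEPAGES}

for cp, hexbytes in table.items():
    print(f"{cp:>5}  {hexbytes}")
```

The DOS codepages 850 and 852 store é as byte 82, while the Windows and ISO codepages store it as e9 and UTF-8 uses the two bytes c3 a9, so a program and a console that disagree on the codepage will disagree on the character.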

As @Christophe Weis pointed out in the comments, you can look up the identifiers of other codepages on the "Code Page Identifiers" page.