Here is one thing I can't get my head around:
I am using Windows 7 and Strawberry Perl 5.20, and I want to write UTF-8 to the console (cmd.exe) with chcp 65001.
The UTF-8 characters themselves come out fine, even those above code point 255, but some characters are mysteriously duplicated (this only happens if I don't redirect the output into a file).
EDIT: I have now seen another post that had essentially the same problem, at last-octet-repeated-when-my-perl-program-outputs-a-utf-8. The solution is to add binmode(STDOUT, 'unix:encoding(utf8):crlf') to the Perl program. I have tested it and it works fine now.
Thanks to everybody who looked into this weird problem.
In a nutshell, I am writing the UTF-8 string (chr(300) x 3).chr(301)."UVW\x{0D}\x{0A}". When I redirect the output into a flat file and then print the flat file, everything is fine.
However, when I print directly to the console, some characters are mysteriously duplicated (I am talking about the characters "VW" on the separate line), and I don't know why.
Here is my test output:
Page de codes active : 65001
Redirected into a file:
-----------------------
ĬĬĬĭUVW
Printed directly:
-----------------
ĬĬĬĭUVW
VW
IO-Layers = (unix crlf)
C4ACC4ACC4ACC4AD5556570D0A
Here is my test program:
@echo off
chcp 65001
echo.
set H1=BEGIN{binmode(*STDIN); undef $/;
set HEXDUMP="%H1% print uc(unpack('H*',<STDIN>)), qq{\n}}"
set L1=my @l = PerlIO::get_layers(*STDOUT, output, 1);
set LAYERS="%L1% print {*STDERR} qq{IO-Layers = (@l)\n};"
set PROG="print chr(300) x 3, chr(301), qq{UVW\n};";
set TFILE=%TEMP%\tfile.txt
echo Redirected into a file:
echo -----------------------
perl -C6 -e%PROG% >%TFILE% && type %TFILE%
echo.
echo Printed directly:
echo -----------------
perl -C6 -e%PROG%
echo.
perl -e%LAYERS%
echo.
perl -e%HEXDUMP% <%TFILE%
echo.
pause
As I said, the characters themselves are printed correctly, but why is there this mysterious duplication? And why *only* when the output is not redirected into a file?
As I suspected, this has been reported as a failure in Windows software:
I wasn't aware of a work-around, but if the
:unix:encoding(utf8):crlf
PerlIO stack works for you then it seems you have found one.
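For reference, here is a minimal sketch of that workaround as a standalone script. It assumes the same test string as in the question and that the script is run without the -C6 switch, since the encoding layer already takes care of the UTF-8 conversion. The layer report at the end is only there to make the resulting stack visible; the exact list it prints depends on the platform.

```perl
use strict;
use warnings;

# Apply the layer stack from the workaround to STDOUT:
# :unix, then the UTF-8 encoding layer, then :crlf for Windows line endings.
binmode(STDOUT, ':unix:encoding(utf8):crlf')
    or die "Cannot set layers on STDOUT: $!";

# Same test string as in the question: three U+012C, one U+012D, then "UVW".
print chr(300) x 3, chr(301), "UVW\n";

# Report the layers now active on STDOUT (written to STDERR, as in the question).
my @layers = PerlIO::get_layers(*STDOUT, output => 1);
print {*STDERR} "IO-Layers = (@layers)\n";
```

Run it once directly in the console and once redirected into a file to compare the two cases from the question.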