I have the following block at the beginning of my script:
#!/usr/bin/perl5 -w
use strict;
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");
In some subroutines, data that arrives in a different encoding (from a distant subroutine) is not displayed correctly, for example when it contains Cyrillic or other non-ASCII characters. It is the "binmode" that causes the problem.
Can I "turn off" the binmode utf8 locally, for the subroutine only?
I can't remove the global binmode setting and I can't change the distant encoding.
One way to achieve this is to "dup" the STD handle, set the duplicated filehandle to use the :raw layer, and assign it to a local version of the STD handle. For example, the following code
binmode(STDOUT, ':utf8');
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
{
    # Duplicate STDOUT; fail loudly if the dup does not succeed.
    open(my $duped, '>&', STDOUT) or die "Cannot dup STDOUT: $!";
    # The ':raw' argument could also be omitted.
    binmode($duped, ':raw');
    local *STDOUT = $duped;
    print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
    close($duped);
}
print(join(', ', PerlIO::get_layers(STDOUT)), "\n");
prints
unix, perlio, utf8
unix, perlio
unix, perlio, utf8
on my system.
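If you need this in more than one subroutine, the same steps could be wrapped in a small helper. This is only a sketch: with_raw_stdout is a name I made up for illustration, and it assumes the code you pass in writes to STDOUT via plain print. It also flushes the real STDOUT first, since the duplicate has its own buffer and the output order could otherwise get mixed up.

```perl
use strict;
use warnings;
use IO::Handle;    # provides the flush() method on filehandles

# Hypothetical helper (not part of any module): run $code while STDOUT is
# temporarily replaced by a byte-oriented duplicate of itself.
sub with_raw_stdout {
    my ($code) = @_;

    # Push out anything already buffered on the real STDOUT.
    STDOUT->flush();

    open(my $duped, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";
    binmode($duped) or die "Cannot reset layers: $!";   # same as ':raw'

    # The localized glob is restored automatically when this sub returns.
    local *STDOUT = $duped;
    $code->();

    # Closing flushes the duplicate; the original handle stays open.
    close($duped) or die "Cannot close duplicated handle: $!";
    return;
}

binmode(STDOUT, ':utf8');

with_raw_stdout(sub {
    # Two KOI8-R bytes ("да"), written through unchanged.
    print "\xC4\xC1\n";
});

# Back on the :utf8 handle: characters are encoded as UTF-8 again.
print "\x{434}\x{430}\n";
```

Inside the callback the bytes go out as they are. Outside it, the global :utf8 setting is untouched.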
I like @nwellnhof's approach. Dealing only with Unicode and ASCII - a luxury few enjoy - my instinct would be to leave the bytes as is and selectively make use of Encode to decode()/encode() when needed. If you are able to determine which of your data sources are problematic, you could filter/insert decode when dealing with them.
% file koi8r.txt
koi8r.txt: ISO-8859 text
% cat koi8r.txt
������ �� ����� � ������� ���. ���
���� ����� ������ ����� �����.
% perl -CO -MEncode="encode,decode" -E 'print decode("koi8-r", <>);' koi8r.txt
Американские суда находятся в международных водах. Япония
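Inside a script, the same idea might look like the sketch below. decode_from is a hypothetical helper name, and the KOI8-R bytes are hard-coded here as a stand-in for whatever your distant subroutine returns. Once the bytes are decoded into Perl characters, the global :utf8 layer on STDOUT does the right thing by itself.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: convert bytes in a known legacy encoding into a
# Perl character string.
sub decode_from {
    my ($encoding, $bytes) = @_;
    # FB_CROAK makes malformed input die instead of being silently replaced.
    return decode($encoding, $bytes, Encode::FB_CROAK);
}

binmode(STDOUT, ':utf8');

# Stand-in for data from the problematic source: "да" encoded as KOI8-R.
my $koi8_bytes = "\xC4\xC1";

my $text = decode_from('koi8-r', $koi8_bytes);
print "$text\n";
```

This only works if you know (or can reliably guess) the encoding of each problematic source. If you don't, the dup approach above is the safer route.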