I am struggling to create a file whose name contains non-ASCII characters.
The following script works fine if it is called with 0 as its parameter, but dies when called with 1.
The error message is: open: Invalid argument at C:\temp\filename.pl line 15.
The script is started from within cmd.exe.
I expect it to write a file whose name is either äöü.txt or äöü☺.txt, depending on the parameter. But I fail to create the file whose name contains the smiley.
use warnings;
use strict;
use Encode 'encode';
# Text is stored in utf8 within *this* file.
use utf8;
my $with_smiley = $ARGV[0];
my $filename = 'äöü' .
($with_smiley ? '☺' : '' ).
'.txt';
open (my $fh, '>', encode('cp1252', $filename)) or die "open: $!";
print $fh "Filename: $filename\n";
close $fh;
I am probably missing something that is obvious to others, but I can't find it, so I'd appreciate any pointer towards solving this.
The following runs on Windows 7 with ActiveState Perl. It writes "hello there" to a file with Hebrew characters in its name:
There is no need to encode the filename (at least not on Linux). This code works on my Linux system:
HTH, Paul
First of all, saying "UTF-8 character" is weird. UTF-8 can encode any Unicode character, so the UTF-8 character set is the Unicode character set. That means you want to create a file whose name contains Unicode characters, and more specifically, Unicode characters that aren't in cp1252.
I've answered this on PerlMonks in the past. Answer copied below.
Perl treats file names as opaque strings of bytes. That means that file names need to be encoded as per your "locale"'s encoding (ANSI code page).
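To see what this means for the script in the question, here is a minimal sketch that encodes both file names with cp1252 (the encoding the question uses; your ANSI code page may differ) and prints the resulting bytes. The helper names hex_bytes and encodable are made up for this illustration.

```perl
use strict;
use warnings;
use utf8;                       # the file name literals below are UTF-8 in this source
use Encode qw(encode);

# Render a byte string as space-separated hex, e.g. "E4 F6 FC".
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# Return 1 if every character of $name exists in $encoding, else 0.
sub encodable {
    my ($encoding, $name) = @_;
    my $copy = $name;           # encode() may modify its input when a CHECK value is given
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
    return eval { encode($encoding, $copy, $check); 1 } ? 1 : 0;
}

for my $name ('äöü.txt', 'äöü☺.txt') {
    printf "%-10s %s\n",
        encodable('cp1252', $name) ? 'ok' : 'unmappable',
        hex_bytes(encode('cp1252', $name));
}
```

As far as I can tell, without a CHECK argument Encode silently replaces the unmappable ☺ with its substitution character "?" (byte 3F). Since "?" is not allowed in Windows file names, that would explain the "Invalid argument" error from open.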
In Windows, code page 1252 is commonly used, and thus the encoding is usually cp1252.* However, cp1252 doesn't support Tamil and Hindi characters [or "☺"].
Windows also provides a "Unicode" aka "Wide" interface, but Perl doesn't provide access to it using builtins**. You can use Win32API::File's CreateFileW, though. IIRC, you still need to encode the file name yourself. If so, you'd use UTF-16le as the encoding.
The aforementioned Win32::Unicode appears to handle some of the dirty work of using Win32API::File for you. I'd also recommend starting with that.
* The code page is returned (as a number) by the GetACP system call. Prepend "cp" to get the encoding.
** Perl's support for Windows sucks in some respects.