When trying to open a file whose name contains accents, Java states that it cannot be found, apparently due to a charset mismatch. I work with UTF-8 on a Linux system (/etc/locales sets UTF-8 as well), and I run JBoss with -Dfile.encoding=UTF-8 and the environment variable JBOSS_ENCODING="UTF-8".
In a JSP I get the name of the file:
String fileName = element.getChildText("FileName");
out.println("File to be opened : " + fileName);
Displays :
File to be opened : aaaaaà.txt
But new File(fileName) won't work: file.exists() returns false.
Trying to list the directory instead:
File[] files = dir.listFiles();
for (int i = 0; i < files.length; i++) {
    out.println(files[i].getName());
}
I get : aaaaaà .txt
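A quick way to confirm that the name coming from the XML and the names returned by listFiles() really are different strings, and not just rendered differently, is to dump their Unicode code points. The following is only a diagnostic sketch: the class name, the hard-coded expected name and the directory path are placeholders, not part of the original setup.
import java.io.File;

public class DumpNames {

    // Print every char of the string as a Unicode escape so that two names
    // that look alike on screen can be compared code point by code point.
    static void dump(String label, String s) {
        StringBuilder sb = new StringBuilder(label + " : ");
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X ", (int) s.charAt(i)));
        }
        System.out.println(sb);
    }

    public static void main(String[] args) {
        dump("expected", "aaaaaà.txt");                      // name taken from the XML
        File[] files = new File("/path/to/dir").listFiles(); // placeholder directory
        if (files != null) {
            for (File f : files) {
                dump("on disk ", f.getName());
            }
        }
    }
}
If the UTF-8 file name is being decoded as ISO-8859-1, the à shows up as the two code points \u00C3 and \u00A0 instead of the single \u00E0.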
Why is Java reading (and trying to open) the file name on disk as ISO-8859-1? Is it a JBoss config? A Java config? How can I force java.io.File to use UTF-8 as the charset of the file name?
I've used other tools and the name is always read fine as UTF-8.
(Note that I'm always talking about the name of the file, never its content; it could be an empty file.)
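One thing worth checking from inside the running JBoss instance is which encodings the JVM actually picked up at startup. On Sun/Oracle JVMs the charset used for file names is reflected in the internal sun.jnu.encoding property, which is derived from the locale of the process rather than from -Dfile.encoding. A small scriptlet in the same JSP (purely diagnostic; sun.jnu.encoding is not a supported API) could print them:
// Diagnostic only: show the encodings and locale the running JVM picked up.
out.println("file.encoding    = " + System.getProperty("file.encoding"));
out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
out.println("LANG             = " + System.getenv("LANG"));
If sun.jnu.encoding or LANG reports something like ANSI_X3.4-1968 or ISO-8859-1, JBoss was most likely started from an environment without a UTF-8 locale (an init script, for example), which would explain why -Dfile.encoding=UTF-8 alone does not help.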
I am trying to track down the problem. Here is what I already have:
There is a small test program, Exists.java.
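The Exists.java listing itself is not reproduced above; a hypothetical reconstruction, assuming the test does nothing more than report whether the names passed on the command line are visible to java.io.File, might look like this:
import java.io.File;

public class Exists {
    // Hypothetical reconstruction: for each name given on the command line,
    // report whether java.io.File can see a file of that name on disk.
    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + " exists: " + new File(name).exists());
        }
    }
}
Running it as java Exists aaaaaà.txt under different values of LANG is the kind of experiment described in the rest of this answer.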
And there is the output of java -version. Now to the interesting part: running the program under strace.
The nice thing is that strace works on the byte level, not on the character level like Java, so everything is OK in this case. I have the environment variable LANG set to en_US.UTF-8, and all of the LC_* variables are unset.
Now tracking down the problem to a minimal working example:
That still works. So let's try another encoding:
So this doesn't work. One possible reason might be that I selected a locale that is not in the list printed by locale -a, but that shouldn't be a reason for Java to convert the letters to question marks.
As soon as LANG points to a non-existent locale, setting the sun.jnu.encoding property no longer has any effect. So I'm out of ideas now.
Try this:
Java Can't Open a File with Surrogate Unicode Values in the Filename?