I have an input file in a defined encoding (UTF-8) from which I create different files whose names and content (again UTF-8) are taken from that input file.
My problem is that on one particular Windows system, the created files do not have the correct characters in their names. The content of these files is perfectly readable, but their names are not.
Instead of Ü.xml, the file gets a garbled name with two characters in place of the Ü (something like Ãœ.xml).
On other Windows systems everything works fine.
The file content's encoding can be set with the second argument of OutputStreamWriter, but the file name's encoding cannot be set in new File(name), it seems.
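For reference, the setup described above might look roughly like the following. This is only a minimal sketch: the class name WriteUtf8File and the method write are made up for illustration and are not taken from the actual program. The point is that the charset is passed explicitly for the content, while the name goes into File as a plain String with no charset parameter.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named here only for illustration.
class WriteUtf8File {
    // Creates dir/name and writes content into it as UTF-8.
    static File write(File dir, String name, String content) throws IOException {
        // The file name is an ordinary String; File offers no charset argument.
        File out = new File(dir, name);
        // The content encoding is chosen by OutputStreamWriter's second argument.
        try (Writer w = new OutputStreamWriter(new FileOutputStream(out), StandardCharsets.UTF_8)) {
            w.write(content);
        }
        return out;
    }
}
```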
Thanks.
Seeing two chars where there should be one UTF-8 multi-byte char such as ü suggests that this Windows system does not use UTF-8 as its file encoding, and that a UTF-8 file was copied onto the system, for example by unpacking a zip file. System.getProperty("file.encoding") should give the platform encoding. Maybe, remotely imaginable, it is some odd case not covered by Java or Windows, like a compressed directory or a second external disk formatted with a file system that is not UTF-8 capable.

Java uses the "platform's default charset" to convert file names to strings, and there's no way to change that behaviour through the standard API. You may, on some systems, be able to change the default encoding when you launch the JVM.
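To see what the JVM on the affected machine thinks its encodings are, a small check along these lines may help. It is a sketch, not a definitive diagnosis: the class name EncodingCheck and the method misdecode are made up for illustration, sun.jnu.encoding is an internal property that may be absent or behave differently across JDKs, and windows-1252 is only an assumption about what the "wrong" charset might be on that system. The misdecode method reproduces the one-character-becomes-two effect by decoding UTF-8 bytes with a different charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class, named here only for illustration.
class EncodingCheck {
    // Encodes text as UTF-8, then decodes those bytes with another charset.
    // This is the mix-up that turns one non-ASCII character into two.
    static String misdecode(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    public static void main(String[] args) {
        // Documented property for the default charset of the JVM.
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Internal, unsupported property; may be null on some JDKs.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("default charset  = " + Charset.defaultCharset());

        // Assumption: the mismatching charset is windows-1252.
        Charset assumedWrong = Charset.forName("windows-1252");
        String original = "\u00DC.xml"; // U+00DC is the letter U with diaeresis
        String garbled = misdecode(original, assumedWrong);
        System.out.println("original length = " + original.length());
        System.out.println("garbled length  = " + garbled.length());
    }
}
```

Running this once normally and once with -Dfile.encoding=UTF-8 on the java command line shows whether the override is picked up. Whether that override also changes how file names are handled depends on the platform and JDK version, so treat it as something to verify rather than rely on.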
On other systems the only way to affect the file name encoding is through the system locale settings. You can read more about that here: http://jonisalonen.com/2012/java-and-file-names-with-invalid-characters/
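A quick way to test whether a given name survives on a particular machine is to create the file and then read the directory listing back. The following is a rough sketch under the assumption that java.nio.file is available (Java 7 or later); the class name FileNameRoundTrip and the method roundTrips are made up for illustration. A result of false only says the name did not come back unchanged; it does not say why.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical check, named here only for illustration.
class FileNameRoundTrip {
    // Creates a file with the given name in a fresh temp directory and reports
    // whether the directory listing returns exactly the same name.
    static boolean roundTrips(String name) throws IOException {
        Path dir = Files.createTempDirectory("name-check");
        try {
            Path file;
            try {
                file = Files.createFile(dir.resolve(name));
            } catch (InvalidPathException e) {
                // The name cannot be represented as a path on this platform.
                return false;
            }
            try (Stream<Path> entries = Files.list(dir)) {
                return entries.anyMatch(p -> p.getFileName().toString().equals(name));
            } finally {
                Files.delete(file);
            }
        } finally {
            Files.delete(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        // U+00DC is the letter U with diaeresis from the question.
        System.out.println("round trip ok: " + roundTrips("\u00DC.xml"));
    }
}
```

If this prints false on the problem system but true elsewhere, the file system or locale settings there are the likely place to look. If it prints true, the name was probably already garbled before it reached new File(name), for instance when the input file was read.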