I have a .properties file with Arabic translations, which I use to replace strings in an HTML file. However, when I run the copy task, the characters are completely corrupted and I get something like this:
اÙÙزادات
Any idea what's causing this and how I can fix it?
build.xml
<target name="copyAndReplace">
<copy todir="..." overwrite="yes" encoding="UTF-8">
<fileset dir="..." includes="*.html"></fileset>
<filterset>
<filtersfile file="***.properties" />
</filterset>
</copy>
</target>
I see some possible problems:
In Java, .properties files are assumed to be ISO-8859-1 encoded. Even if you're not writing Java yourself, Ant is still reading a properties file. I ran into this when editing the same properties file in Vim and in the NetBeans editor: Vim saved it as UTF-8, NetBeans as ISO-8859-1.
You should use the outputencoding attribute of the copy task. On Windows, UTF-8 is not the default encoding.
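To make the first point concrete, here is a minimal Java sketch of the mismatch, not a description of what Ant does internally. It loads the same UTF-8 bytes once through Properties.load(InputStream), which decodes them as ISO-8859-1, and once through a Reader that decodes them as UTF-8. The class name and the key "greeting" are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; the key "greeting" is an assumption for illustration.
final class PropertiesEncodingDemo {

    // Loads the bytes through an InputStream, which Properties decodes as ISO-8859-1.
    static String loadFromStream(byte[] bytes) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(bytes));
            return props.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads the same bytes through a Reader that decodes them as UTF-8.
    static String loadFromUtf8Reader(byte[] bytes) {
        try {
            Properties props = new Properties();
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
            return props.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // An Arabic word written with compiler-level escapes, so the encoding
        // of this source file does not matter.
        String original = "\u0645\u0631\u062D\u0628\u0627";
        byte[] utf8 = ("greeting=" + original + "\n").getBytes(StandardCharsets.UTF_8);

        // The stream variant yields mojibake, the reader variant the original text.
        System.out.println(original.equals(loadFromStream(utf8)));     // false
        System.out.println(original.equals(loadFromUtf8Reader(utf8))); // true
    }
}
```

Each Arabic letter occupies two bytes in UTF-8, so the ISO-8859-1 reading turns every letter into two unrelated characters, which is the kind of garbage shown in the question.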
I encountered the same issue, but with images.
In the Ant manual I found the following remark:
Note: If you employ filters in your copy operation, you should limit the copy to text files. Binary files will be corrupted by the copy operation. This applies whether the filters are implicitly defined by the filter task or explicitly provided to the copy operation as filtersets. See encoding note.
Maybe that is the source of the problem. I will need to check on my own whether this solves my problem.
Kind regards,
Marc
As mentioned by @Jean Waghetti above, Ant expects properties files to be ISO-8859-1 encoded. I posted a similar Stack Overflow question about Chinese characters.
The only solution I've found is to make sure my .properties file is ISO-8859-1 and the non-ASCII characters are escaped.
For example, مرحبا بالعالم would be:
\u0645\u0631\u062D\u0628\u0627 \u0628\u0627\u0644\u0639\u0627\u0644\u0645
This is not ideal, as it's not terribly human-readable. I have noticed that Eclipse automatically shows the converted text on hover.
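If you'd rather not escape by hand, the conversion is easy to script. The following is a rough Java sketch, assuming every character above 127 should become a four-digit escape; the class name, method names and the key "greeting" are hypothetical. It also loads the escaped line back through Properties to check that the original text comes out again.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper; names are made up for illustration.
final class UnicodeEscaper {

    // Replaces every character above 127 with a backslash-u escape of its
    // four-digit hex code and leaves ASCII characters untouched.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 127) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Parses a single "key=value" line the way a properties file is read
    // from a byte stream and returns the value stored under the key.
    static String loadValue(String line, String key) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(
                    line.getBytes(StandardCharsets.ISO_8859_1)));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The Arabic example phrase, written with compiler-level escapes so
        // the encoding of this source file does not matter.
        String original = "\u0645\u0631\u062D\u0628\u0627 "
                + "\u0628\u0627\u0644\u0639\u0627\u0644\u0645";
        String escaped = escape(original);
        System.out.println(escaped);
        // Round trip: the escaped, pure-ASCII line loads back as the Arabic text.
        System.out.println(original.equals(loadValue("greeting=" + escaped, "greeting")));
    }
}
```

Run as is, this prints the escaped line from the example above, followed by true for the round trip.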
You can add some code that converts the UTF-8 properties files into escaped ISO-8859-1 properties files, and then use the converted files in the copy task:
<project name="xyz" default="copyAndReplace">
<property name="srcdir" value="src" />
<property name="propdir" value="src" />
<property name="tmpdir" value="tmp" />
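<target name="encodeProps">
<!-- The output directory must exist before the script writes into it. -->
<mkdir dir="${tmpdir}" />
<script language="javascript"><![CDATA[
// Copy every UTF-8 .properties file from propdir to tmpdir, writing each
// non-ASCII character as a four-digit Unicode escape.
// Fully qualified class names are used so the script does not rely on importPackage.
var files = new java.io.File(project.getProperty("propdir")).listFiles();
for (var i = 0; i < files.length; i++) {
var f = files[i];
if (!f.getName().endsWith(".properties")) continue;
var reader = new java.io.InputStreamReader(new java.io.FileInputStream(f), "UTF-8");
var out = new java.io.FileOutputStream(new java.io.File(project.getProperty("tmpdir"), f.getName()));
while (true) {
var c = reader.read();
if (c == -1) break;
if (c > 127) {
// Zero-pad the hex code to four digits, e.g. 645 becomes 0645.
var hex = Number(c).toString(16);
while (hex.length < 4) hex = "0" + hex;
var escaped = "\\u" + hex;
// The escape is pure ASCII, so it can be written one byte per character.
for (var j = 0; j < escaped.length; j++) out.write(escaped.charCodeAt(j));
} else {
out.write(c);
}
}
reader.close();
out.close();
}
]]></script>
</target>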
<target name="copyAndReplace" depends="encodeProps">
<copy todir="dst" overwrite="yes" encoding="UTF-8" filtering="true">
<fileset dir="${srcdir}" includes="*.html">
</fileset>
<filterset>
<filtersfile file="${tmpdir}/c.properties" />
</filterset>
</copy>
</target>
</project>