XSSFCell seems to interpret certain character sequences as escaped Unicode characters and replaces them. How can I prevent this? Do I need to apply some kind of character escaping?
For example:
    cell.setCellValue("LUS_BO_WP_x24B8_AI"); // The cell value is now "LUS_BO_WPⒸAI"
In Unicode, Ⓒ is U+24B8.
I've already tried setting an ANSI font and setting the cell type to string.
This character conversion is done in XSSFRichTextString.utfDecode()
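To make the behaviour easier to reason about without a workbook, here is a rough plain-Java approximation of that kind of decoding. It is a sketch, not POI's source: the class name `UtfDecodeSketch`, the method name `decode`, and the acceptance of lowercase hex digits are my assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the decoding step; not the POI implementation.
final class UtfDecodeSketch {
    // A sequence shaped like _xHHHH_, where HHHH is a 4-digit hex code unit.
    private static final Pattern ESCAPED = Pattern.compile("_x([0-9A-Fa-f]{4})_");

    static String decode(String value) {
        if (value == null) return null;
        Matcher m = ESCAPED.matcher(value);
        StringBuilder out = new StringBuilder();
        int idx = 0;
        while (m.find()) {
            // Copy the text before the match, then emit the decoded character.
            out.append(value, idx, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            idx = m.end();
        }
        // Copy whatever follows the last match.
        out.append(value, idx, value.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("LUS_BO_WP_x24B8_AI"));        // LUS_BO_WPⒸAI
        System.out.println(decode("LUS_BO_WP_x005F_x24B8_AI"));  // LUS_BO_WP_x24B8_AI
    }
}
```

The second call shows why `_x005F_` (an escaped underscore) is useful: once the leading underscore is written in escaped form, the rest of the sequence no longer matches and survives as literal text.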
I have now written a function that basically does the same thing in reverse.
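    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Matches sequences shaped like _xHHHH_, which would otherwise be decoded into a character.
    private static final Pattern utfPtrn = Pattern.compile("_(x[0-9A-F]{4}_)");
    // "_x005F_" is the escaped form of the underscore (U+005F, LOW LINE).
    private static final String UNICODE_CHARACTER_LOW_LINE = "_x005F_";

    public static String escape(final String value) {
        if (value == null) return null;
        StringBuilder buf = new StringBuilder();
        Matcher m = utfPtrn.matcher(value);
        int idx = 0;
        while (m.find()) {
            int pos = m.start();
            if (pos > idx) {
                // Copy the text between the previous match and this one unchanged.
                buf.append(value, idx, pos);
            }
            // Replace the leading underscore with its escaped form and keep the rest.
            buf.append(UNICODE_CHARACTER_LOW_LINE).append(m.group(1));
            idx = m.end();
        }
        // Copy whatever follows the last match.
        buf.append(value, idx, value.length());
        return buf.toString();
    }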
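One edge case that may be worth checking: when two escape-like sequences share an underscore, as in `_x0041_x0042_`, a pattern that consumes the trailing underscore only matches the first sequence, so the second one is left unescaped. A possible variant, sketched here in plain Java with no POI dependency, escapes every underscore that starts such a sequence by using a lookahead. The class name `XlsxEscapeSketch` and the decision to also accept lowercase hex digits are my assumptions, not something taken from POI.

```java
import java.util.regex.Pattern;

// Hypothetical helper; a sketch of the lookahead idea, not a POI API.
final class XlsxEscapeSketch {
    // An underscore that is followed by xHHHH_ . The lookahead does not consume
    // the following characters, so a shared underscore can start the next match.
    private static final Pattern ESCAPE_START = Pattern.compile("_(?=x[0-9A-Fa-f]{4}_)");

    // Escaped form of the underscore (U+005F, LOW LINE).
    private static final String ESCAPED_UNDERSCORE = "_x005F_";

    static String escape(String value) {
        if (value == null) return null;
        return ESCAPE_START.matcher(value).replaceAll(ESCAPED_UNDERSCORE);
    }

    public static void main(String[] args) {
        System.out.println(escape("LUS_BO_WP_x24B8_AI")); // LUS_BO_WP_x005F_x24B8_AI
        System.out.println(escape("_x0041_x0042_"));      // _x005F_x0041_x005F_x0042_
    }
}
```

The escaped string would then be passed to `cell.setCellValue(...)` in place of the raw value. Escaping a sequence that the decoder would have ignored anyway should be harmless, because the inserted `_x005F_` is itself decoded back to a plain underscore.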