I'm trying to get iText to output my UTF-8 encoded text correctly. The input file contains symbols like ° and Latin characters (é, è, à...).
I haven't found a solution yet. This is the code I'm using:
BufferedReader input = null;
Document output = null;
System.out.println("Convert text file to pdf");
System.out.println("input : " + args[0]);
System.out.println("output : " + args[1]);
try {
    // text file to convert to pdf as args[0]
    input = new BufferedReader(new FileReader(args[0]));
    // letter 8.5x11
    // see com.lowagie.text.PageSize for a complete list of page-size constants.
    output = new Document(PageSize.LETTER, 40, 40, 40, 40);
    // pdf file as args[1]
    PdfWriter.getInstance(output, new FileOutputStream(args[1]));
    output.open();
    output.addAuthor("RealHowTo");
    output.addSubject(args[0]);
    output.addTitle(args[0]);
    BaseFont courier = BaseFont.createFont(BaseFont.COURIER, BaseFont.CP1252, BaseFont.EMBEDDED);
    Font font = new Font(courier, 12, Font.NORMAL);
    Chunk chunk = new Chunk("", font);
    output.add(chunk);
    String line = "";
    while (null != (line = input.readLine())) {
        System.out.println(line);
        Paragraph p = new Paragraph(line);
        p.setAlignment(Element.ALIGN_JUSTIFIED);
        output.add(p);
    }
    System.out.println("Done.");
    output.close();
    input.close();
    System.exit(0);
}
catch (Exception e) {
    e.printStackTrace();
    System.exit(1);
}
}
Any ideas would be appreciated.
When I look at your code, I see a number of things that are odd.
- You say you require UTF-8, but you create a BaseFont object using BaseFont.CP1252 instead of BaseFont.IDENTITY_H (which is the "encoding" you need when you work with Unicode).
- You use the standard Type 1 font Courier, which is a font that doesn't know how to render é, è, à... and a font that is never embedded. As documented, the BaseFont.EMBEDDED parameter is ignored in this case!
- You don't use this font with an object that has actual content. The actual content is put into a Paragraph that is created using the default font "Helvetica", a font that doesn't know how to render é, è, à...
To solve this, you need to create the Paragraph with the appropriate font. That is NOT a standard Type 1 font, but something like courier.ttf. You also need to use the appropriate encoding: BaseFont.IDENTITY_H.
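Both the reader and the writer should be set to use the UTF-8 character encoding so that UTF-8 characters are read and written properly. For example, when reading the input file (note that InputStreamReader wraps an InputStream, not a file name):
input = new BufferedReader(new InputStreamReader(new FileInputStream(args[0]), "UTF-8"));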
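As a rough, standalone illustration of the reading side only (no iText involved), the sketch below decodes a file explicitly as UTF-8 instead of relying on the platform default that FileReader uses. The class name Utf8LineReader and the helper readLines are hypothetical names chosen for this example, and the temporary file merely stands in for args[0].

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original question or of iText.
final class Utf8LineReader {

    // Reads every line of the file, always decoding the bytes as UTF-8
    // rather than with the platform default charset.
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for args[0]: a temp file holding "25 °C, é è à" as UTF-8 bytes.
        // Unicode escapes keep this source file independent of the compiler's encoding.
        Path tmp = Files.createTempFile("utf8-demo", ".txt");
        try {
            Files.write(tmp, "25 \u00B0C, \u00E9 \u00E8 \u00E0".getBytes(StandardCharsets.UTF_8));
            for (String line : readLines(tmp)) {
                System.out.println(line);
            }
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Reading the text correctly is only half of the problem: the strings still have to be added to the PDF with a font and encoding that can render them, as described in the answer above. Whether the console shows the characters correctly also depends on the terminal's own encoding.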
@AmiraGL,
The solution proposed by Bruno Lowagie fixed my problem (p:dataExporter PDF export does not show Euro (€) sign). It may solve yours as well.
To solve this, you need to create the Paragraph with the appropriate font. That is NOT a standard type 1 font, but something like courier.ttf. You also need to use the appropriate encoding: BaseFont.IDENTITY_H. -by Bruno Lowagie
BaseFont courier = BaseFont.createFont(BaseFont.COURIER, BaseFont.CP1252, BaseFont.EMBEDDED);
Font cellFont = new Font(courier, 12, Font.NORMAL);
Solution: https://stackoverflow.com/a/21259711/3557631