JZOS and iText on z/OS

Posted 2019-07-27 02:35

Question:

I am trying to create a PDF on z/OS using JZOS and iText.

I have tried many combinations of font and DefaultPlatformEncoding, but I cannot get the Arabic characters to display in the PDF; they appear as Latin characters. When I turn PDF compression off and inspect the content in hex, I see the EBCDIC hex codes.

The input file on z/OS is IBM-420 and the output PDF should have Cp1256 or Windows-1256 for display on Windows.

Here is the snippet of the code:

// Open the input dataset
ZFile zFilein = new ZFile("//DD:INDS", "rb,type=record,noseek");
// Open the output PDF file
PdfWriter writer = PdfWriter.getInstance(document,
    FileFactory.newBufferedOutputStream("//DD:OUTPDF"));
document.open();
//  Font cf = new Font(Font.FontFamily.COURIER, Font.DEFAULTSIZE, Font.NORMAL);
//  Font cf = FontFactory.getFont("Courier","Cp1256", true);
Font cf = FontFactory.getFont("Arial", BaseFont.IDENTITY_H, true, Font.DEFAULTSIZE, Font.NORMAL);
Paragraph paragraph = new Paragraph();
paragraph.setFont(cf);
String encoding = ZUtil.getDefaultPlatformEncoding();
// String encoding = "Cp1256";
String line = new String(recBuf,1,nRead-1,encoding);
paragraph.add(line);

I tried the following options, but I am still unable to get the PDF to display correctly, and the PDF font information does not show the font as embedded. Is there anything else I missed?

Note: arial.ttf was uploaded from Windows.

Option 1

FontFactory.register("arial.ttf");
Font cf = FontFactory.getFont("Arial", 8);

paragraph = new Paragraph(line, cf);

The FONT information in the PDF displays the following:

ArialMT
            Type: TrueType
            Encoding: Ansi
            Actual Font: ArialMT
            Actual Font Type: TrueType

Option 2

BaseFont bf = BaseFont.createFont(font1, BaseFont.IDENTITY_H, true);     
Font cf = new Font(bf, 10);           
paragraph = new Paragraph(line, cf);   

Viewing the PDF displays the following error:
Cannot extract the embedded font 'ZQRNLC+ArialMT'. Some characters may not display or print correctly.

Viewing the source of the PDF in an editor I can see the following:
R/FontName/ZQRNLC+ArialMT/         

The FONT in the PDF displays the following information:

ArialMT
            Type: TrueType(CID)
            Encoding: Identity-H
            Actual Font: Unknown

Answer 1:

Your question is a duplicate of several other questions.

This line of code may be problematic:

Font cf = FontFactory.getFont("Arial", BaseFont.IDENTITY_H, true, Font.DEFAULTSIZE, Font.NORMAL);

This won't give you the font Arial unless you have registered an Arial font program as explained in my answer to the question "Why doesn't FontFactory.GetFont("Known Font Name", floatSize) work?"

I don't know z/OS (never heard of it), but if it uses EBCDIC, it's doing something wrong. Your strings need to be in UNICODE. It is very important that the String values you are using are in the right encoding as explained in my answer to the question "getBytes() doesn't work for Cyrillic letters". In your case, you are reading the content from a file, but there is no guarantee that the encoding you use to read the file matches the encoding that was used to store the file. When you say that the glyphs are shown in EBCDIC, I think that you're mistaken and that you experience the same problem as explained in "getBytes() doesn't work for Cyrillic letters". You really need Unicode.
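To make the encoding point concrete, here is a minimal sketch that uses only the Java standard library. It decodes a record with an explicitly named charset instead of the platform default, then checks whether the result actually contains Arabic code points. The class name RecordDecoder and both helper methods are hypothetical, and the charset name "IBM420" is an assumption based on the IBM-420 input described in the question; whether that charset is installed depends on the JVM, so the sketch guards for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JZOS or iText.
class RecordDecoder {

    // Decode a slice of a record buffer with an explicitly named charset.
    static String decode(byte[] buf, int offset, int length, String charsetName) {
        return new String(buf, offset, length, Charset.forName(charsetName));
    }

    // True if the text has at least one code point in an Arabic Unicode block.
    static boolean containsArabic(String text) {
        return text.codePoints().anyMatch(cp -> {
            Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
            return block == Character.UnicodeBlock.ARABIC
                || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B;
        });
    }

    public static void main(String[] args) {
        String sample = "\u0633\u0644\u0627\u0645"; // four Arabic letters
        byte[] utf8 = sample.getBytes(StandardCharsets.UTF_8);

        // Same bytes, matching charset versus mismatched charset.
        System.out.println(containsArabic(decode(utf8, 0, utf8.length, "UTF-8")));
        System.out.println(containsArabic(decode(utf8, 0, utf8.length, "ISO-8859-1")));

        // Assumption: the JVM ships an IBM420 charset that can also encode.
        if (Charset.isSupported("IBM420") && Charset.forName("IBM420").canEncode()) {
            byte[] ebcdic = sample.getBytes(Charset.forName("IBM420"));
            System.out.println(containsArabic(decode(ebcdic, 0, ebcdic.length, "IBM420")));
        }
    }
}
```

If containsArabic returns false for a line that should be Arabic, the bytes were most likely decoded with the wrong charset, and no font or PDF setting will repair that afterwards.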

You say that you want to create text in Arabic, but I don't see you using the RTL setting anywhere. This is explained in my answer to this question: "how to create persian content in pdf using eclipse".
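Before changing any layout setting, it can help to confirm that the decoded line really contains right-to-left characters. The sketch below uses java.text.Bidi from the standard library for that check; the class and method names are hypothetical. It only detects RTL text and does not set anything in iText itself, where the run direction is configured on the layout object (iText 5 exposes the constant PdfWriter.RUN_DIRECTION_RTL for this).

```java
import java.text.Bidi;

// Hypothetical helper, not part of iText.
class RunDirectionCheck {

    // True if the text contains characters that require bidirectional handling.
    static boolean needsRtl(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        System.out.println(needsRtl("\u0633\u0644\u0627\u0645")); // Arabic letters
        System.out.println(needsRtl("Hello"));                    // Latin letters
    }
}
```

A line for which needsRtl returns false after decoding is another hint that the input was read with the wrong encoding.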

Please download the free ebook "The Best iText Questions on StackOverflow" and you'll notice that all these problems have been asked and answered before. I would like to close your question as a duplicate of "how to create persian content in pdf using eclipse", but unfortunately that answer wasn't accepted and didn't receive an upvote, so the question can't be used as the original when marking another question as a duplicate.



Tags: itext