I saw in "How to fix iText's text wrapping for chinese characters" that another user had a problem similar to the one we're facing. A response by https://stackoverflow.com/users/1622493/bruno-lowagie indicated that DefaultSplitCharacter has taken Chinese characters into account since iText 5. We're using iText 5.5.6, but still see the problem.
As near as I can tell, DefaultSplitCharacter is working correctly, but the problem appears to be that the ColumnText class allows lines to begin with these punctuation marks.
Here's a screenshot of the PdfChunks in the BidiLine class being used to render the text:
However, in the rendered result the 3rd and 5th lines both begin with punctuation characters, as shown in this image of the PDF output:
I can simply insert line breaks in the proper places to make it look correct, but if the text is ever re-translated internally, that fix may no longer work. Does anyone know how to ensure that iText won't begin a line with these punctuation characters?
I'm using iTextSharp. I wrote an ISplitCharacter implementation following k.f.'s sample.
For breaking lines in Asian languages you need to write your own implementation of SplitCharacter. A good reference for line breaking is Unicode® Standard Annex #14, "Unicode Line Breaking Algorithm". Another one is https://msdn.microsoft.com/en-us/library/cc194864.aspx.
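To give a rough idea of the kind of rule those references describe, here is a minimal, library-independent sketch in Java. It is not the JapaneseSplitCharacter code and does not call iText at all. The names `KinsokuSketch`, `canBreakAfter` and `isCjk` are made up for illustration, and the punctuation lists and character ranges are small, incomplete samples chosen as assumptions rather than a faithful implementation of UAX #14. The idea is simply that a break after a character is refused when the next character is one that must not start a line, or when the character itself must not end a line.

```java
/** Illustrative only: a simplified "no punctuation at line start" check. */
final class KinsokuSketch {
    // Sample of characters that should not begin a line (incomplete, assumed list):
    // full-width comma, ideographic full stop and comma, ; : ? ! and closing brackets.
    private static final String NO_LINE_START =
        "\uFF0C\u3002\u3001\uFF1B\uFF1A\uFF1F\uFF01\uFF09\u300D\u300F\u3011\u300B";

    // Sample of characters that should not end a line (incomplete, assumed list):
    // opening brackets.
    private static final String NO_LINE_END = "\uFF08\u300C\u300E\u3010\u300A";

    private KinsokuSketch() {}

    /** Rough CJK test; these ranges are an assumption, not a complete list. */
    static boolean isCjk(char c) {
        return (c >= 0x2E80 && c < 0xD7A0)   // radicals, kana, Han ideographs, Hangul
            || (c >= 0xF900 && c < 0xFB00)   // CJK compatibility ideographs
            || (c >= 0xFF00 && c < 0xFFF0);  // full-width and half-width forms
    }

    /** Returns true if a line may be broken immediately after text[current]. */
    static boolean canBreakAfter(char[] text, int current) {
        char c = text[current];
        // Never leave an opening bracket dangling at the end of a line.
        if (NO_LINE_END.indexOf(c) >= 0) {
            return false;
        }
        // Never let the next line start with closing punctuation.
        if (current + 1 < text.length && NO_LINE_START.indexOf(text[current + 1]) >= 0) {
            return false;
        }
        // Otherwise break after whitespace, a hyphen, or any CJK character.
        return c <= ' ' || c == '-' || isCjk(c);
    }
}
```

In iText 5 a check like this would have to be wrapped in a class implementing the SplitCharacter interface (ISplitCharacter in iTextSharp) and attached to the chunks being rendered; that wiring is left out here so the sketch stays self-contained.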
Having suffered through implementing this for Japanese, I'm posting example code I wrote for Japanese text mixed with English text. This code could be modified for Chinese fairly easily using the references above.
Here is a snippet showing JapaneseSplitCharacter in use:
Here is the code for JapaneseSplitCharacter:
Hope this helps.