How does one prevent text from breaking across pag

Posted 2019-05-27 09:52

Question:

I am using itext7 version 7.1.2 and itext7.pdfhtml version 2.0.2 to produce a PDF from some HTML containing elements which must not break across pages (e.g. graphs and their accompanying text).

I first tried explicit page breaks, which worked in our legacy iTextSharp solution (page-break-before: always on any element containing content that should not be separated), but these have no effect at all. I then tried the preferable page-break-inside: avoid as a style on the element containing the content I do not want split across pages. Here is a simplified version of the code, which converts the inline HTML to a PDF in your "My Documents" folder:

using iText.Html2pdf;
using iText.Kernel.Geom;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;
using System;
using System.Linq;

namespace IText7Html2PdfPageBreakTester
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var html = @"<html>
    <head>
    </head>
    <body>
        <div style=""font-size: 60pt"">
            Some Initial Text.
        </div>
        <div style=""page-break-inside: avoid; font-size: 120pt"">
            This text should all be on the same page.
        </div>
    </body>
</html>";
            var pdfFilePath = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "Example PDF.pdf");

            Console.WriteLine($"Converting example HTML to PDF and writing the PDF to: \"{pdfFilePath}\".");

            using (var pdfWriter = new PdfWriter(pdfFilePath))
            {
                using (var pdfDocument = new PdfDocument(pdfWriter))
                {
                    var converterProperties = new ConverterProperties();

                    pdfDocument.SetDefaultPageSize(PageSize.A4);

                    using (var document = new Document(pdfDocument))
                    {
                        // NOTE: If this line is commented out, the "page-break-inside: avoid" style behaves as expected.
                        document.SetMargins(40, 40, 40, 40);

                        foreach (var element in HtmlConverter.ConvertToElements(html, converterProperties).OfType<IBlockElement>())
                            document.Add(element);
                    }
                }
            }

            Console.WriteLine($"PDF written to: \"{pdfFilePath}\".");
        }
    }
}

Note that I get the desired behaviour if no margins are set on the document. However, margins are a business requirement, so how can I set them and still keep the page-break-inside: avoid behaviour?

As a workaround, I have also tried creating a custom ITagWorker to interpret a custom <pageBreak/> element, but I had no luck getting ProcessorContext.GetPdfDocument().AddNewPage() to actually add a page.

Supplement: if you replace the html variable with the following, you can see that neither page-break-before: always nor page-break-after: always works as expected, regardless of whether margins have been set on the document.

var html = @"<html>
            <head>
            </head>
            <body>
                <div style=""page-break-after: always"">
                    Some Initial Text.
                </div>
                <div>
                    This text should be on a new page.
                </div>
                <div style=""page-break-before: always; font-size: 60pt"">
                    This text should be on a further new page.
                </div>
                <div style=""page-break-inside: avoid; font-size: 120pt"">
                    This text should all be on the same page.
                </div>
            </body>
        </html>";
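
A possible direction for the margin question above, offered only as an untested sketch: the page-break-* properties appear to be handled when pdfHTML converts a whole document, so one option is to declare the margins in CSS through an @page rule and call HtmlConverter.ConvertToPdf instead of adding converted elements to a Document by hand. The @page rule, the 40pt value, the class name and the output file name are assumptions made for illustration. Whether this resolves the behaviour on itext7 7.1.2 / pdfhtml 2.0.2 has not been confirmed.

```csharp
using iText.Html2pdf;
using System;
using System.IO;

namespace IText7Html2PdfPageBreakTester
{
    // Hypothetical variant of the program above; not part of the original code.
    internal class PageMarginSketch
    {
        private static void Main(string[] args)
        {
            // Assumption: page size and margins are declared in CSS (@page)
            // instead of calling Document.SetMargins(40, 40, 40, 40).
            var html = @"<html>
    <head>
        <style>
            @page { size: A4; margin: 40pt; }
        </style>
    </head>
    <body>
        <div style=""font-size: 60pt"">
            Some Initial Text.
        </div>
        <div style=""page-break-inside: avoid; font-size: 120pt"">
            This text should all be on the same page.
        </div>
    </body>
</html>";
            var pdfFilePath = Path.Combine(
                Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments),
                "Example PDF (page margins).pdf");

            // Convert the whole HTML document in one call, so pdfHTML performs
            // the page layout itself rather than returning loose elements.
            using (var pdfStream = new FileStream(pdfFilePath, FileMode.Create))
            {
                HtmlConverter.ConvertToPdf(html, pdfStream);
            }

            Console.WriteLine($"PDF written to: \"{pdfFilePath}\".");
        }
    }
}
```

If the margins have to be set from code rather than CSS, this sketch does not cover that case.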
Tags: c# html css itext7