Creating a PDF with iTextSharp with images from a database

Posted 2019-04-11 20:00

Question:

I have a process where HTML is stored in a database along with image links. The images themselves are also stored in the database. I've created a controller action that reads an image from the database; the path I generate looks like /File/Image?path=Root/test.jpg. This image path is embedded in the HTML in an img tag, like <img alt="logo" src="/File/Image?path=Root/001.jpg" />

I'm trying to use iTextSharp to read the HTML from the database and create a PDF document:

string _html = GenerateDocumentHelpers.CommissioningSheet(fleetId);
Document _document = new Document(PageSize.A4, 80, 50, 30, 65);
MemoryStream _memStream = new MemoryStream();
PdfWriter _writer = PdfWriter.GetInstance(_document, _memStream);
StringReader _reader = new StringReader(_html);            
HTMLWorker _worker = new HTMLWorker(_document);
_document.Open();            
_worker.Parse(_reader);
_document.Close();
Response.Clear();
Response.AddHeader("content-disposition", "attachment; filename=Commissioning.pdf");
Response.ContentType = "application/pdf";
Response.Buffer = true;
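// ToArray() copies only the bytes actually written; GetBuffer() also returns unused padding
byte[] _pdfBytes = _memStream.ToArray();
Response.OutputStream.Write(_pdfBytes, 0, _pdfBytes.Length);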
Response.OutputStream.Flush();
Response.End();
return new FileStreamResult(Response.OutputStream, "application/pdf");

This code gives me an illegal character error. It comes from the image tag: the ? and = characters are not recognized. Is there a way to render this HTML with the img tag so that the generated PDF includes both the HTML and the images from the database? Or, if iTextSharp can't do it, can you suggest any other third-party open source tools that can accomplish this task?

Answer 1:

If the image source isn't a fully qualified URL including protocol then iTextSharp assumes that it is a file-based URL. The solution is to just convert all image links to absolute in the form http://YOUR_DOMAIN/File/Image?path=Root/001.jpg.
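
As a rough sketch of that conversion, the helper below rewrites every img SRC against a base URI before the HTML is handed to the parser. It assumes double-quoted src attributes and markup simple enough for a regular expression (a real HTML parser would be more robust). ImageUrlRewriter, MakeImageSourcesAbsolute and the example.com domain are made-up names for illustration, not part of iTextSharp:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImageUrlRewriter
{
    // Captures: (1) everything up to the opening quote of src,
    // (2) the src value, (3) the closing quote.
    private static readonly Regex ImgSrc = new Regex(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*\")([^\"]*)(\")",
        RegexOptions.IgnoreCase);

    public static string MakeImageSourcesAbsolute(string html, Uri baseUri)
    {
        return ImgSrc.Replace(html, m =>
        {
            Uri absolute;
            // Uri.TryCreate resolves relative values against baseUri and
            // leaves values that are already absolute unchanged.
            if (!Uri.TryCreate(baseUri, m.Groups[2].Value, out absolute))
                return m.Value; // leave anything unparseable untouched
            return m.Groups[1].Value + absolute.AbsoluteUri + m.Groups[3].Value;
        });
    }
}
```

You would call it as ImageUrlRewriter.MakeImageSourcesAbsolute(_html, new Uri("http://example.com")) and pass the result to the StringReader instead of the raw HTML.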

You can also set a global property on the parser that works pretty much the same as the HTML <BASE> tag:

//Create a provider collection to set various processing properties
System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
//Set the image base. This will be prepended to the SRC so watch your forward slashes
providers.Add(HTMLWorker.IMG_BASEURL, "http://YOUR_DOMAIN");
//Bind the providers to the worker
worker.SetProviders(providers);
worker.Parse(reader);

Below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.2.0 that shows how to use a relative image and set its base using the global provider. Everything is pretty much the same as your code, although I threw in a bunch of using statements to ensure proper cleanup. Make sure to watch the leading and trailing forward slashes on everything: the base URL gets prepended directly onto the SRC attribute, and you might end up with double slashes if it's not done correctly. I'm hard-coding a domain in here, but you should be able to easily use the System.Web.HttpContext.Current.Request object.

using System;
using System.IO;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {

            string html = @"<img src=""/images/home_mississippi.jpg"" />";
            string outputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "HtmlTest.pdf");
            using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.TABLOID)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();

                        using (StringReader reader = new StringReader(html)) {
                            using (HTMLWorker worker = new HTMLWorker(doc)) {
                                //Create a provider collection to set various processing properties
                                System.Collections.Generic.Dictionary<string, object> providers = new System.Collections.Generic.Dictionary<string, object>();
                                //Set the image base. This will be prepended to the SRC so watch your forward slashes
                                providers.Add(HTMLWorker.IMG_BASEURL, "http://www.vendiadvertising.com");
                                //Bind the providers to the worker
                                worker.SetProviders(providers);
                                worker.Parse(reader);
                            }
                        }

                        doc.Close();
                    }
                }
            }

            this.Close();
        }
    }
}
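
For the slash caveat and the hard-coded domain mentioned above, a small helper along these lines might help. It is only a sketch: ImageBaseHelper, GetImageBase and Join are hypothetical names, and it assumes the image SRC values are root-relative (start with a forward slash), as in the question. In an ASP.NET request you could pass System.Web.HttpContext.Current.Request.Url as the request URL:

```csharp
using System;

public static class ImageBaseHelper
{
    // Returns scheme + host (+ non-default port) with no trailing slash,
    // i.e. the part of the request URL that comes before the path.
    public static string GetImageBase(Uri requestUrl)
    {
        return requestUrl.GetLeftPart(UriPartial.Authority).TrimEnd('/');
    }

    // Mirrors the plain "base + SRC" concatenation described above,
    // but guarantees exactly one slash at the join.
    public static string Join(string imageBase, string src)
    {
        return imageBase.TrimEnd('/') + "/" + src.TrimStart('/');
    }
}
```

The value from GetImageBase can then be supplied in place of the hard-coded domain, for example providers.Add(HTMLWorker.IMG_BASEURL, ImageBaseHelper.GetImageBase(requestUrl)). Join is there only to check what the final image URL will look like for a given SRC.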