I have an HTML (not XHTML) document that renders fine in Firefox 3 and IE 7. It uses fairly basic CSS for styling.
I'm now after a way of converting it to PDF. I have tried:
- DOMPDF: it had huge problems with tables. I factored out my large nested tables and it helped (before that it just consumed up to 128M of memory and then died; that's my memory limit in php.ini), but it still makes a complete mess of tables and doesn't seem to fetch images. The tables were just basic stuff with some border styles to add lines at various points;
- HTML2PDF and HTML2PS: I actually had better luck with this. It rendered some of the images (all the images are Google Chart URLs) and the table formatting was much better but it seemed to have some complexity problem I haven't figured out yet and kept dying with unknown node_type() errors. Not sure where to go from here; and
- Htmldoc: this seems to work fine on basic HTML but has almost no support for CSS whatsoever so you have to do everything in HTML (I didn't realize it was still 2001 in Htmldoc-land...) so it's useless to me.
I tried a Windows app called Html2Pdf Pilot that actually did a pretty decent job, but I need something that at a minimum runs on Linux and ideally runs on demand via PHP on the web server.
What am I missing, or how can I resolve this issue?
Have a look at wkhtmltopdf. It is open source, based on WebKit, and free. We wrote a small tutorial here.
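As a rough illustration of driving wkhtmltopdf from a script, here is a minimal Python sketch. It assumes the `wkhtmltopdf` binary is installed and on the PATH; the function names and the `report.html` / `report.pdf` file names are hypothetical and only there for the example. It uses the basic `wkhtmltopdf [options] <input> <output>` form with the `--quiet` and `--page-size` options.

```python
import shutil
import subprocess


def build_wkhtmltopdf_command(source, output_pdf, page_size="A4"):
    """Return the argument list for a basic wkhtmltopdf run.

    `source` may be a local HTML file path or a URL.
    """
    return ["wkhtmltopdf", "--quiet", "--page-size", page_size, source, output_pdf]


def html_to_pdf(source, output_pdf, page_size="A4"):
    """Convert `source` to a PDF, raising if the binary is missing or fails."""
    if shutil.which("wkhtmltopdf") is None:
        raise RuntimeError("wkhtmltopdf was not found on PATH")
    command = build_wkhtmltopdf_command(source, output_pdf, page_size)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True)
    return output_pdf


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(html_to_pdf("report.html", "report.pdf"))
```

The same command line could presumably be run on demand from PHP with `exec()`, escaping each argument with `escapeshellarg()`.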
EDIT (2017):
If I were to build something today, I wouldn't go that route anymore.
I would use http://pdfkit.org/ instead, probably stripping it of all its Node.js dependencies so that it runs in the browser.
Perhaps you might try running the file through Tidy before handing it to the converter. If one of the renderers chokes on an HTML problem (such as an unclosed tag), this might help.
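A minimal sketch of that pre-cleaning step, assuming the HTML Tidy command-line tool (`tidy`) is installed and on the PATH. The function names and file names are hypothetical. It relies on Tidy's usual exit statuses: 0 for success, 1 when only warnings were issued, 2 when errors were found.

```python
import subprocess


def build_tidy_command(input_html, cleaned_html):
    """Return the argument list for a quiet Tidy run that writes a cleaned copy."""
    return ["tidy", "-q", "-utf8", "-o", cleaned_html, input_html]


def clean_html(input_html, cleaned_html):
    """Write a tidied copy of `input_html` to `cleaned_html` and return its path."""
    result = subprocess.run(build_tidy_command(input_html, cleaned_html))
    # Warnings (exit status 1) are expected for sloppy markup, so only
    # treat an exit status of 2 or higher as a failure.
    if result.returncode >= 2:
        raise RuntimeError(f"tidy reported errors in {input_html}")
    return cleaned_html


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(clean_html("page.html", "page.clean.html"))
```

The cleaned file, rather than the original, would then be passed to whichever converter is in use.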
I don't think a PHP class is the best way to render an XHTML page with CSS.
What happens when a new CSS rule comes out? (CSS 3.0 is coming soon...)
The best way to render an HTML page is, obviously, a browser. Firefox 3.0 can natively 'print' to PDF, and torisugary developed an extension (command line print) to make use of that. Here you'll find it.
Anyway, there are still many problems with running Firefox just as a PDF converter...
At the moment, I think wkhtmltopdf is the best option (it uses WebKit, the rendering engine behind the Safari browser): it is fast and awesome. Yes, it's open source as well... Give it a look.
In terms of cost, using a web-service (API) may in many cases be the more sensible approach. Plus, by outsourcing this process you unburden your own infrastructure/backend and - provided you are using a reputable service - ensure compatibility with adjusting web standards, uptime, short processing times and quick content delivery.
I've done some research on most of the web services currently on the market, please find below the APIs that I feel are worth mentioning on this thread, in an order based on price/value ratio. All of them are offering pre-composed PHP classes and packages.
Quality:
Having the high-quality engine PrinceXML as a backbone, DocRaptor clearly offers the best PDF quality, returning highly polished and well-converted PDF documents. However, the pdflayer API service gets pretty close here. Pdfcrowd stands out less for quality than for processing speed.
Cost:
pdflayer.com - As indicated above, the most cost-effective option here is pdflayer.com, offering an entirely free subscription plan for 100 monthly PDFs and premium subscriptions ranging between $9.99-$119.99. The price for 10,000 monthly PDF documents is $39.99.
docraptor.com - Offering a 7-Day Free Trial period. Premium subscription plans range from $15-$2250. The price for 10,000 monthly PDF documents is ~ $300.00.
pdfcrowd.com - Offering 100 PDFs once for free. Premium subscription plans range from $9-$89. The price for 10,000 monthly PDF documents is ~ $49.00.
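To make the comparison at 10,000 PDFs per month easier to read, the quoted prices can be turned into a per-document cost. This is just arithmetic on the figures listed above (two of which are approximate), written as a short Python sketch; the variable and function names are made up for the example.

```python
# Monthly prices quoted above for 10,000 PDF documents (USD).
# The docraptor and pdfcrowd figures are approximate in the original text.
MONTHLY_PRICE_FOR_10K = {
    "pdflayer": 39.99,
    "docraptor": 300.00,
    "pdfcrowd": 49.00,
}
DOCUMENTS_PER_MONTH = 10_000


def cost_per_document(monthly_price, documents=DOCUMENTS_PER_MONTH):
    """Return the average price of a single PDF at the given monthly volume."""
    return monthly_price / documents


def services_by_cost(prices=MONTHLY_PRICE_FOR_10K):
    """Return service names ordered from cheapest to most expensive."""
    return sorted(prices, key=prices.get)


if __name__ == "__main__":
    for name in services_by_cost():
        price = cost_per_document(MONTHLY_PRICE_FOR_10K[name])
        print(f"{name}: ${price:.4f} per PDF")
```

At this volume every option costs a few cents or less per document, so the differences only matter in absolute terms as the monthly total grows.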
I've used all three of them and this text is supposed to help anyone decide without having to pay for all of them. This text has not been written to endorse any one product and I have no affiliation with any of the products.
Important: Please note that this answer was written in 2009 and it might not be the most cost-effective solution today in 2018. Online alternatives are better today at this than they were back then.
Here are some online services that you can use:
Have a look at PrinceXML.
It's definitely the best HTML/CSS to PDF converter out there, although it's not free. (But hey, your programming might not be free either, so if it saves you 10 hours of work, you're home free, since you also need to take into account that the alternative solutions will require you to set up a dedicated server with the right software.)
Oh yeah, did I mention that this is the first (and probably only) HTML2PDF solution that fully passes Acid2?
PrinceXML Samples
TCPDF works fine, has no dependencies, is free, and receives regular bug fixes. It has reasonable speed if the supplied HTML/CSS content is well formatted. I normally feed it 50-300 kB of HTML input (including CSS) and get PDF output of 10-15 pages within 1-3 seconds.
I strongly recommend running the HTML through the Tidy library to clean it up before sending anything to TCPDF.