I used the code below to download a web page whose CSS file references an image, but the image is scaled too large in the PDF file.
The picture is 120 pixels wide but is displayed 185 pixels wide.
For comparison, I added a 10-pixel line, which is displayed at 12 pixels.
Why does this happen, and how can I fix it?
I can't post the question without further details, and I'm not sure what would be useful for you to know, but maybe this helps:
I work in an archiving department and have been tasked with archiving information that is accessible on their web pages, but in the end the result just doesn't look the way the creators intended.
I tried using a local CSS file instead, but realised that this is not what I want.
I will have to run this workflow frequently, so I need to use the CSS files they provide and simply convert the HTML page to PDF correctly.
Thanks in advance for reading.
I tried to make the code contain everything that could be important to know, but nothing more.
The dependencies of the project are:
- com.itextpdf kernel 7.1.7
- com.itextpdf styled-xml-parser 7.1.7
- com.itextpdf svg 7.1.7
- com.itextpdf pdfa 7.1.7
- org.slf4j-simple 1.6.1
package ueberordnungen;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;

public class Worker3 {
    public static void main(String[] args) throws IOException {
        // pick one specific URL
        String kongressURL = "https://www.egms.de/dynamic/de/meetings/vnda2019/index.htm";
        Document doc = Jsoup.connect(kongressURL).get();
        System.out.println("-----Titel: " + doc.title());

        // strip navigation, scripts and one stylesheet, then move the header before the page
        Element content = doc.child(0);
        content.getElementById("navigation_language").remove();
        content.getElementById("navigation").remove();
        content.getElementsByAttributeValue("href", "/static/css/gms-framework.css").first().remove();
        content.getElementsByClass("hidden_navigation").first().remove();
        content.getElementById("page").before(content.getElementById("header"));
        content.getElementsByTag("script").remove();
        // 10px reference line for comparing sizes in the PDF
        content.getElementById("owner_links_container").attr("style", "border-top:10px solid #060");

        // resolve relative CSS and image URLs against the original page
        ConverterProperties properties = new ConverterProperties();
        properties.setBaseUri(kongressURL);
        PdfWriter writer = new PdfWriter("content.pdf");
        HtmlConverter.convertToPdf(content.html(), new PdfDocument(writer), properties);
    }
}
iText 7 adds background images at a scale of one image pixel per pt; see AbstractRenderer.drawBackground:
PdfXObject backgroundXObject = backgroundImage.getImage();
...
Rectangle imageRectangle = new Rectangle(backgroundArea.getX(), backgroundArea.getTop() - backgroundXObject.getHeight(),
backgroundXObject.getWidth(), backgroundXObject.getHeight());
...
drawContext.getCanvas().addXObject(backgroundXObject, imageRectangle);
As you can see in the code, the width and height values of the image (which contain the horizontal and vertical number of pixels of the bitmap image) are used as-is as the width and height of the rectangle into which the image is eventually scaled. As the units used in canvas drawing operations are user space units, which default to 1/72in, the image is displayed at 72 image pixels per inch, or 1 image pixel per pt.
Web browsers usually display images by default at 1 image pixel per px or 96 image pixels per inch.
Your example web page is mostly laid out using absolute positions given in px = 1/96in. Thus, the different scales at which the images are drawn by a web browser and by iText result in different appearances, and in the case at hand the appearance in iText is not a pleasing one:
In Chrome:
In iText:
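The scale difference described above comes down to simple unit arithmetic. The following is a minimal sketch of it, not iText code; the class and method names are made up for illustration, and it only assumes the two unit definitions already mentioned (1 pt = 1/72in, 1 px = 1/96in).

```java
// Illustration only: plain unit arithmetic, no iText classes involved.
// Class and method names are hypothetical.
class BackgroundScaleDemo {
    static final float PT_PER_INCH = 72f; // PDF default user space unit: 1/72 in
    static final float PX_PER_INCH = 96f; // CSS px: 1/96 in

    /** Converts a length in CSS px to PDF points. */
    static float pxToPt(float px) {
        return px * PT_PER_INCH / PX_PER_INCH;
    }

    /** Converts a length in PDF points to CSS px. */
    static float ptToPx(float pt) {
        return pt * PX_PER_INCH / PT_PER_INCH;
    }

    public static void main(String[] args) {
        float imagePixels = 120f;

        // Browser: 1 image pixel per px, so the image is 120 px wide.
        float browserWidthPt = pxToPt(imagePixels);

        // iText core as described above: 1 image pixel per pt.
        float itextWidthPt = imagePixels;

        System.out.println("browser width in pt: " + browserWidthPt);
        System.out.println("iText width in pt:   " + itextWidthPt);
        System.out.println("iText width expressed in px: " + ptToPx(itextWidthPt));
    }
}
```

For a 120-pixel image this gives 90 pt in a browser versus 120 pt in iText, i.e. the image comes out larger by a factor of 4/3 relative to the px-based layout around it.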
You can make iText draw background images more like browsers do by replacing the middle code line above, the one calculating imageRectangle, with
// subtract the scaled height so the image stays flush with the top of the background area
Rectangle imageRectangle = new Rectangle(backgroundArea.getX(), backgroundArea.getTop() - backgroundXObject.getHeight() * .75f,
backgroundXObject.getWidth() * .75f, backgroundXObject.getHeight() * .75f);
Actually, this spot in the code also appears to be the appropriate place to start adding support for background-size, which currently is not supported here.
Beware: I'm not really deep into iText 7 HTML to PDF conversion code, so I cannot really tell whether this patch has undesirable side effects.
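To make the geometry of such a patch easier to check, here is a standalone sketch of the rectangle arithmetic without any iText classes. The class name, method name and the float-array return type are hypothetical. It assumes the rectangle is given by its lower-left corner plus width and height, and that the image should stay flush with the top-left corner of the background area, which is why the scaled height (not the pixel height) is subtracted from the top edge.

```java
// Illustration only: mirrors the rectangle calculation with plain floats.
// Names are hypothetical; this is not part of iText.
class ScaledBackgroundRect {
    static final float PT_PER_PX = 0.75f; // 72 / 96

    /**
     * Returns {x, y, width, height} in pt for an image of the given pixel size,
     * where (x, y) is the lower-left corner and the image's top edge sits at areaTop.
     */
    static float[] imageRectangle(float areaX, float areaTop,
                                  float imageWidthPx, float imageHeightPx) {
        float width = imageWidthPx * PT_PER_PX;
        float height = imageHeightPx * PT_PER_PX;
        float y = areaTop - height; // lower edge, so that the top edge stays at areaTop
        return new float[] {areaX, y, width, height};
    }

    public static void main(String[] args) {
        // Assumed sample values: area starting at x = 10 with top edge at 800, image 120 x 40 pixels.
        float[] r = imageRectangle(10f, 800f, 120f, 40f);
        System.out.println("x=" + r[0] + " y=" + r[1] + " w=" + r[2] + " h=" + r[3]);
    }
}
```

With these sample values the rectangle is 90 pt wide and 30 pt high, and its lower edge lies at 770, so its top edge coincides with the top of the area.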
Is this a bug?
Strictly speaking it isn't, at least as far as I could tell from skimming the CSS specification:
The HTML page does not set a background-size here. Thus, the intrinsic size of the background image shall be used. Unfortunately, though, CSS does not define how the intrinsic dimensions are found in general. Thus, the web page essentially left the scale of the background image to the whim of the HTML client...
If iText 7 HTML to PDF aims at producing results in line with browser outputs, though, it had better change its default scale here to match that of those browsers.
I just realized that the AbstractRenderer I patched is not in the html2pdf project but in the core iText 7 layout project.
Thus, changing the size here is probably a bad idea, at least if one uses iText 7 not only for html2pdf but also directly.
Nonetheless, that code position is appropriate for introducing support for some background-size attribute. html2pdf could then extend the BackgroundApplierUtil so that it always sets that new core attribute to a value appropriate for creating an appearance in line with what browsers display.
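The split of responsibilities suggested above could look roughly like the following sketch. Everything in it is hypothetical: no such attribute or methods exist in iText as far as this answer is concerned, and the names are invented purely to illustrate the idea that the core layout keeps its current default while html2pdf always supplies an explicit, browser-like size.

```java
import java.util.Optional;

// Hypothetical sketch of the proposed design; none of these names exist in iText.
class BackgroundSizeSketch {

    /**
     * Core layout side: use an explicitly set size (in pt) if present,
     * otherwise keep the current behaviour of 1 image pixel per pt.
     */
    static float[] resolveSizePt(float widthPx, float heightPx, Optional<float[]> explicitSizePt) {
        return explicitSizePt.orElse(new float[] {widthPx, heightPx});
    }

    /**
     * html2pdf side: when the CSS sets no background-size, supply the size a
     * browser would use, i.e. 1 image pixel per px = 0.75 pt.
     */
    static float[] browserLikeSizePt(float widthPx, float heightPx) {
        return new float[] {widthPx * 0.75f, heightPx * 0.75f};
    }

    public static void main(String[] args) {
        // Direct use of the core layout: nothing set, default scale is kept.
        float[] direct = resolveSizePt(120f, 40f, Optional.empty());

        // Via html2pdf: the converter always passes an explicit size.
        float[] viaHtml = resolveSizePt(120f, 40f, Optional.of(browserLikeSizePt(120f, 40f)));

        System.out.println("direct:   " + direct[0] + " x " + direct[1]);
        System.out.println("html2pdf: " + viaHtml[0] + " x " + viaHtml[1]);
    }
}
```

The point of this arrangement is that code using the layout module directly would see no change in behaviour, while HTML conversion would get the browser-like scale.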