I'm trying to make a PDF of a web page that displays locations on Google Maps. The only problem is that the JavaScript isn't quite finished by the time ABCpdf renders the PDF, so the output is incomplete. How can I make ABCpdf wait until the JavaScript is 100% complete before the PDF is rendered? Here is what I've tried so far.
Doc theDoc = new Doc();
string theURL = url;
// Set HTML options
theDoc.HtmlOptions.AddLinks = true;
theDoc.HtmlOptions.UseScript = true;
theDoc.HtmlOptions.PageCacheEnabled = false;
//theDoc.HtmlOptions.Engine = EngineType.Gecko;
// JavaScript is used to extract all links from the page
theDoc.HtmlOptions.OnLoadScript = "var hrefCollection = document.all.tags(\"a\");" +
    "var allLinks = \"\";" +
    "for(i = 0; i < hrefCollection.length; ++i) {" +
    "if (i > 0)" +
    " allLinks += \",\";" +
    "allLinks += hrefCollection.item(i).href;" +
    "};" +
    "document.documentElement.abcpdf = allLinks;";
// Also set a completion flag after a short delay; append so the
// link-extraction script above is not overwritten
theDoc.HtmlOptions.OnLoadScript += "(function(){window.ABCpdf_go = false; setTimeout(function(){window.ABCpdf_go = true;}, 1000);})();";
// Array of links - start with base URL
ArrayList links = new ArrayList();
links.Add(theURL);
for (int i = 0; i < links.Count; i++)
{
    // Stop if we render more than 20 pages
    if (theDoc.PageCount > 20)
        break;
    // Add page
    theDoc.Page = theDoc.AddPage();
    int theID = theDoc.AddImageUrl(links[i] as string);
    // Links from the rendered page
    string allLinks = theDoc.HtmlOptions.GetScriptReturn(theID);
    string[] newLinks = allLinks.Split(new char[] { ',' });
    foreach (string link in newLinks)
    {
        // Check to see if we already rendered this page
        if (links.IndexOf(link) < 0)
        {
            // Skip links inside the page
            int pos = link.IndexOf("#");
            if (!(pos > 0 && links.IndexOf(link.Substring(0, pos)) >= 0))
            {
                if (link.StartsWith(theURL))
                {
                    links.Add(link);
                }
            }
        }
    }
    // Add other pages
    while (true)
    {
        theDoc.FrameRect();
        if (!theDoc.Chainable(theID))
            break;
        theDoc.Page = theDoc.AddPage();
        theID = theDoc.AddImageToChain(theID);
    }
}
// Link pages together
theDoc.HtmlOptions.LinkPages();
// Flatten all pages
for (int i = 1; i <= theDoc.PageCount; i++)
{
    theDoc.PageNumber = i;
    theDoc.Flatten();
}
byte[] theData = theDoc.GetData();
Response.Buffer = false; //new
Response.Clear();
//Response.ContentEncoding = Encoding.Default;
Response.ClearContent(); //new
Response.ClearHeaders(); //new
Response.ContentType = "application/pdf"; //new
Response.AddHeader("Content-Disposition", "attachment; filename=farts");
Response.AddHeader("content-length", theData.Length.ToString());
//Response.ContentType = "application/pdf";
Response.BinaryWrite(theData);
Response.End();
theDoc.Clear();
Try making your script block into a JavaScript function, and call that function from the jQuery $(document).ready() handler at the top of your file (I assume you're using jQuery). The ready() function will ensure all page elements have stabilized before it calls any functions in its body.
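For example (a minimal sketch, assuming jQuery is already loaded on the page; drawMarkers is just a placeholder name for your existing map code):
// Wrap the existing map/marker logic in a named function (drawMarkers is a placeholder)
function drawMarkers() {
    // ... your existing Google Maps / marker code ...
}

// ready() fires once the DOM has stabilized, so the map code runs at a safe point
$(document).ready(function () {
    drawMarkers();
});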
In my case, we were upgrading from v8 to v9 and generating a thumbnail image of a web page that also required extensive JavaScript/CSS manipulation for object positioning. When we switched to v9, we noticed the objects were duplicated (showing both in their original position and in the position they were supposed to end up in after the JavaScript ran).
The workaround that I applied was to use the RenderDelay and OneStageRender properties to change how the page is rendered to PDF. The delay value is in milliseconds, so 500 is half a second. The bigger culprit seemed to be OneStageRender; that had to be disabled for the rendering to come out properly.
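Roughly like this (a sketch based only on the two properties named above; exact property availability differs between ABCpdf versions, so check the HtmlOptions documentation for the version you are on):
Doc theDoc = new Doc();
theDoc.HtmlOptions.UseScript = true;
// Give the page's JavaScript an extra 500 ms (half a second) after load before rendering
theDoc.HtmlOptions.RenderDelay = 500;
// Disabling one-stage rendering was the bigger fix for the duplicated objects
theDoc.HtmlOptions.OneStageRender = false;
int theID = theDoc.AddImageUrl(url);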
I had a very similar problem (rendering a Google Visualization as PDF), and here's the trick I used to partially solve it:
First of all, your JavaScript needs to run on DOMContentLoaded rather than on load (you will understand why in a moment). Next, create an empty page that serves its content after a delay (you can just use System.Threading.Thread.Sleep to make the page "wait" for a certain amount of time). Then place a hidden image on the page that you want to render as PDF, i.e. the page containing the JavaScript that needs to run before the PDF can be produced. The "src" attribute of the image must point to your timer page (in the following example I specify the delay in milliseconds via the query string):
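Something along these lines (Timer.aspx and the 3000 ms delay are just placeholder values):
<!-- hidden image whose src points at the timer page; the page name and delay are placeholders -->
<img src="Timer.aspx?delay=3000" style="visibility: hidden;" width="1" height="1" alt="" />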
Notice that I use visibility: hidden instead of display: none to hide the image. The reason is that some browsers might not start loading the image until it is visible. Now what will happen is that ABCpdf will wait until the image is loaded, while your JavaScript is already executing (because DOMContentLoaded fires before load, which waits until all images are loaded). Of course, you cannot predict exactly how much time your JavaScript needs to execute. Another thing to keep in mind is that if ABCpdf is unable to load the page within 15 seconds (the default value, but I think you can change it), it will throw an exception, so be careful when choosing the delay.
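If it helps, the timer page itself can be as simple as a handler that sleeps before answering. A minimal sketch (TimerHandler and the "delay" parameter name are illustrative, not part of ABCpdf):
// Hypothetical timer handler: sleeps for ?delay=<ms> milliseconds, then returns a 1x1 GIF
public class TimerHandler : System.Web.IHttpHandler
{
    public void ProcessRequest(System.Web.HttpContext context)
    {
        int delay;
        if (!int.TryParse(context.Request.QueryString["delay"], out delay))
            delay = 1000; // default to one second if no delay is supplied
        System.Threading.Thread.Sleep(delay);

        // Serve a tiny transparent GIF so the <img> request completes normally
        context.Response.ContentType = "image/gif";
        context.Response.BinaryWrite(System.Convert.FromBase64String(
            "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"));
    }

    public bool IsReusable { get { return true; } }
}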
Hope this helps.