When PDF.js renders a PDF to HTML5, it lays a <canvas> over all the <div> elements containing the text. This canvas is a proper render of the PDF, while the text underneath is quite rough (but sufficient for certain purposes such as searching for words).
Using the PDF.js demo page, I can make the underlying text visible by:
- Deleting the <canvas> element.
- Disabling the color: transparent property on the .textLayer class, which acts upon the underlying text.
... However, the text remains low-opacity, and I can't find the CSS that's controlling this effect (see below):
Before - with canvas
After - having applied the aforementioned two steps
Is there a way to manually restore the text back to full opacity using JavaScript? Or better yet, is there a special way to invoke PDF.js so that it presents just the underlying text, and discards the canvas entirely (or disables the canvas for all text usages)?
Well, repeating your steps, I
- removed .textLayer > div { color: transparent; },
- added .pdfViewer .canvasWrapper { display: none; },
- and lastly changed the opacity of the text layer: .textLayer { opacity: 1.0; }.
The last one did the trick.
To do this programmatically via JS, you could use:
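// First stylesheet on the page (assumed here to be the viewer's own CSS).
var mainCSS = document.styleSheets[0];
// Append each rule at the end of the sheet so that it overrides earlier rules
// of equal specificity, and so that the index is always valid.
mainCSS.insertRule(".textLayer { opacity: 1.0; }", mainCSS.cssRules.length);
mainCSS.insertRule(".textLayer > div { color: initial !important; }", mainCSS.cssRules.length);
mainCSS.insertRule(".pdfViewer .canvasWrapper { display: none; }", mainCSS.cssRules.length);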
The !important after color: initial is used to prevent the original CSS definition (color: transparent) from being applied.
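If relying on the position of the viewer's stylesheet in document.styleSheets feels fragile, the same three overrides can be injected through a separate <style> element. The following is a minimal sketch, not part of PDF.js: showTextLayerOnly and TEXT_ONLY_RULES are hypothetical names, and the selectors are the ones listed above, which may differ between PDF.js versions.

```javascript
// CSS overrides taken from the steps listed above.
const TEXT_ONLY_RULES = [
  ".textLayer { opacity: 1.0; }",
  ".textLayer > div { color: initial !important; }",
  ".pdfViewer .canvasWrapper { display: none; }",
];

// Hypothetical helper: appends a <style> element holding the overrides.
// `doc` is any object exposing the DOM Document interface (normally `document`).
function showTextLayerOnly(doc) {
  const style = doc.createElement("style");
  style.textContent = TEXT_ONLY_RULES.join("\n");
  doc.head.appendChild(style);
  // Keep the returned reference: calling style.remove() undoes the change.
  return style;
}

// In the browser:
// const style = showTextLayerOnly(document);
```

Because the element is appended at the end of <head>, its rules come later in the cascade than those of a stylesheet linked earlier, so they should win over rules of equal specificity.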
Edit:
To prevent text from being drawn to the canvas at all, you could disable the functions that are used to draw text (namely fillText and strokeText).
CanvasRenderingContext2D.prototype.strokeText = function () { };
CanvasRenderingContext2D.prototype.fillText = function () { };
That way you will not have to modify the code in PDF.js itself.
If you want to preserve the functionality of strokeText and fillText, you might instead adjust the functions showText and paintChar (within pdf.js / pdf.worker.js).
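If editing PDF.js is not an option, a middle ground is to keep references to the original methods so they can be put back once rendering is done. This is only a sketch: disableCanvasText is a hypothetical helper, and it suppresses only text that is drawn through these two methods.

```javascript
// Hypothetical helper: replaces the text-drawing methods on `proto` with no-ops
// and returns a function that restores the original methods.
function disableCanvasText(proto) {
  const original = {
    fillText: proto.fillText,
    strokeText: proto.strokeText,
  };
  proto.fillText = function () {};
  proto.strokeText = function () {};
  return function restore() {
    proto.fillText = original.fillText;
    proto.strokeText = original.strokeText;
  };
}

// In the browser, before PDF.js renders a page:
// const restoreCanvasText = disableCanvasText(CanvasRenderingContext2D.prototype);
// ... let PDF.js render ...
// restoreCanvasText();
```

Passing the prototype in as an argument keeps the helper independent of the browser globals, so other canvas code on the page can get its text drawing back after the PDF has been rendered.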