As you all know, external resources such as images can be embedded directly into an HTML file as base64-encoded data URIs:
<img src="data:image/png;base64,iVBORw0KGgoAAAANS..." />
I'm looking for a pure browser-based JavaScript way to traverse an HTML page and embed all of its external resources into the document, so that when I call $("html").html() it returns the page's entire contents, external resources included.
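To make the request concrete, here is a minimal sketch of that traversal, limited to <img> elements. The names bytesToDataUri and inlineImages are made up for illustration, and the fetchFn parameter exists only so the function can be exercised without a network. It assumes every image can be read with fetch, which for cross-origin images means the remote server must send CORS headers; images that cannot be read keep their original URL.

```javascript
// Build a data: URI from raw bytes and a MIME type.
function bytesToDataUri(bytes, mimeType) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// Replace the src of every external <img> in `doc` with an inlined data: URI.
// `fetchFn` defaults to the global fetch (hypothetical parameter, for testing).
async function inlineImages(doc, fetchFn = fetch) {
  const images = Array.from(doc.querySelectorAll("img[src]"));
  await Promise.all(images.map(async (img) => {
    if (img.src.startsWith("data:")) return; // already embedded
    try {
      const response = await fetchFn(img.src);
      if (!response.ok) return; // leave the original URL in place
      const mimeType =
        response.headers.get("Content-Type") || "application/octet-stream";
      const bytes = new Uint8Array(await response.arrayBuffer());
      img.src = bytesToDataUri(bytes, mimeType);
    } catch (err) {
      // Network or CORS failure: keep the original URL.
    }
  }));
}
```

After awaiting inlineImages(document), $("html").html() should include the embedded image data. Stylesheets, scripts, fonts, and url(...) references inside CSS would each need analogous handling, which this sketch does not attempt.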
For context, I'm trying to download web pages into single files using a headless browser on my server.
There are tools out there to do that. Examples:
While there are benefits to this approach, remember that a page visited more than once, or a site whose pages share the same JS/CSS files, benefits from client-side (browser) caching, which resources embedded in the page itself cannot use.