While web scraping, I want to save the current page's HTML to a file for later debugging. browser.html helps in most cases, but when the page contains an iframe/frame, its content is not returned by browser.html; I have to get it separately with something like browser.iframe.html
There are also cases where an iframe contains another iframe. I can find every frame recursively and save its content, but separate files won't be very useful because they don't tell me the exact structure of the page.
For example I have the following page:
<!DOCTYPE html>
<html>
<head>
</head>
<frameset cols="50%,20%,30%">
<frame name="left" src="/html/left_frame.htm" />
<frame name="right" src="/html/right_frame.htm" />
<noframes>
<body>
Your browser does not support frames.
</body>
</noframes>
<frame src="http://example.com"/>
</frameset>
</html>
I want to save it to a file using Watir. Any ideas?
Frames act much like completely separate web pages. While you can see their content in the rendered document and in the DOM, the contents of a frame are not technically part of the page's HTML. You can see this in the browser: right-click the main document and view its source, then compare that to what you get when you right-click content inside a frame and view the frame's source.
To write all the HTML out to files, you will likely need a method that writes out the HTML of a frame, looks for other frames inside it, and calls itself recursively on any frames it finds.
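A minimal sketch of that recursive approach is below. It assumes each context (a Watir::Browser, or a frame/iframe element found inside it) responds to html, frames and iframes. The method name dump_html and the file naming scheme are made up for illustration. Each file name encodes the frame's position in the tree (frame_0_1.html is the second frame inside the first frame of the page), so the page structure can be recovered from the file names.

```ruby
require 'fileutils'

# Recursively write the HTML of a context and of every frame nested in it.
# `context` is assumed to respond to #html, #frames and #iframes
# (for example a Watir::Browser or a Watir frame element).
# `path` holds the child indexes leading from the top page to this frame.
# Returns the list of file names written, in depth-first order.
def dump_html(context, dir, path = [])
  FileUtils.mkdir_p(dir)

  # The top-level page has an empty path; frames are named by their position.
  name = path.empty? ? 'page.html' : "frame_#{path.join('_')}.html"
  File.write(File.join(dir, name), context.html)
  written = [name]

  # Collect both <frame> and <iframe> children, then recurse into each one.
  children = context.frames.to_a + context.iframes.to_a
  children.each_with_index do |child, index|
    written.concat(dump_html(child, dir, path + [index]))
  end

  written
end
```

With a real browser, this would be called as dump_html(browser, 'debug_dump') after browser.goto. I have not tested it against every page, so frames that reload or disappear while the dump is running may need extra handling.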
Alternatively, look at a gem like Nokogiri, which is designed to parse HTML. It might have better methods for this sort of thing, or existing examples of how to do what you want.