Is it possible to store the HTML source grabbed with Selenium (using Excel VBA) in an HTMLDocument
object?
This is an example using Microsoft Internet Controls
and Microsoft HTML Object Library
to automate Internet Explorer.
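Dim IE As InternetExplorer
Dim HTML As HTMLDocument
Set IE = New InternetExplorer
IE.navigate "https://www.google.com"
' Wait for the page to finish loading before reading the document
Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
    DoEvents
Loop
Set HTML = IE.document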
Can the same be done with Selenium? For example, something like this (not working!):
Dim selenium As SeleniumWrapper.WebDriver
Set selenium = New SeleniumWrapper.WebDriver
Dim html as HTMLDocument
selenium.Start "firefox", "about:blank"
selenium.Open "file:///D:/webpages/LE_1001.htm"
Set html = selenium.getHtmlSource 'this is not working since .getHtmlSource()
'returns a String object but is there a way to store
'this html source into a type of HTMLDocument-element
The proper way to get the DOM with SeleniumBasic:
Sub Get_DOM()
Dim driver As New FirefoxDriver
driver.Get "https://en.wikipedia.org/wiki/Main_Page"
Dim html As New HTMLDocument ' Requires Microsoft HTML Library
html.body.innerHTML = driver.ExecuteScript("return document.body.innerHTML;")
Debug.Print html.body.innerText
driver.Quit
End Sub
The latest release, which works with the example above, is available here:
https://github.com/florentbr/SeleniumBasic/releases/latest
This should work to use a string as the source for an HTML document:
Set html = New HTMLDocument
html.body.innerHTML = selenium.pageSource
Edit: changed the Selenium call from getHtmlSource to pageSource. Full working code follows, though I'm not sure we're using the same version of Selenium:
Option Explicit
Sub foo()
Dim sel As selenium.WebDriver
Set sel = New selenium.WebDriver
Dim html As HTMLDocument
sel.Start "firefox", "about:blank"
sel.Get "http://www.google.com/"
Set html = New HTMLDocument
html.body.innerHTML = sel.PageSource
Debug.Print html.body.innerText
End Sub
This needs references to the Microsoft HTML Object Library and the Selenium Type Library (Selenium32.tlb). I'm using SeleniumBasic version 2.0.6.0.
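Once the source is in an HTMLDocument, the usual MSHTML methods such as getElementsByTagName can be used on it. Here is a minimal sketch of that, assuming the page source has already been fetched as a String (for example from sel.PageSource above). PrintLinks is a hypothetical helper name of mine, not part of Selenium or MSHTML:

```vba
Option Explicit

' Requires a reference to the Microsoft HTML Object Library.
' PrintLinks is a hypothetical helper used only for illustration.
Sub PrintLinks(ByVal source As String)
    Dim html As New HTMLDocument
    Dim link As Object

    ' Parse the string into a DOM, the same way as in the code above
    html.body.innerHTML = source

    ' Walk every anchor element and print its text and href
    For Each link In html.getElementsByTagName("a")
        Debug.Print link.innerText, link.getAttribute("href")
    Next link
End Sub
```

It could be called as PrintLinks sel.PageSource before the driver is closed. Since the parsed document is detached from the original site, relative URLs will probably not resolve against the page's real address.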
I'm not quite sure why you prefer converting a Selenium element to an HTMLDocument. It would add one more early-bound dependency to your project.
Personally, I prefer assigning the DOM element to a WebElement. For instance:
If (Selenium.FindElementsByClass("qty").Count > 0) Then
Dim qtyElement as WebElement: Set qtyElement = Selenium.FindElementByClass("qty")
End If
If (Not qtyElement Is Nothing) Then
Dim qtyHtml As String: qtyHtml = qtyElement.Attribute("innerHTML")
End If
Debug.Print qtyHtml