Store Selenium HTML Source Code into element of ty

Posted 2019-09-11 03:02

Question:

Is it possible to store the HTML source grabbed with Selenium (using Excel VBA) in an HTMLDocument object? Here is an example that uses Microsoft Internet Controls and the Microsoft HTML Object Library to automate Internet Explorer:

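Dim IE As InternetExplorer
Dim HTML As HTMLDocument
Set IE = New InternetExplorer
IE.Navigate "www.google.com"
' Wait for the page to finish loading before reading the document
Do While IE.Busy Or IE.ReadyState <> READYSTATE_COMPLETE
    DoEvents
Loop
Set HTML = IE.Document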

Can the same be done with Selenium? For example, something like this (not working):

Dim selenium As SeleniumWrapper.WebDriver
Set selenium = New SeleniumWrapper.WebDriver
Dim html as HTMLDocument

selenium.Start "firefox", "about:blank"
selenium.Open "file:///D:/webpages/LE_1001.htm"
' Not working: .getHtmlSource() returns a String, not an object.
' Is there a way to load this HTML source into an HTMLDocument?
Set html = selenium.getHtmlSource

Answer 1:

The proper way to get the DOM with SeleniumBasic:

Sub Get_DOM()
  Dim driver As New FirefoxDriver
  driver.Get "https://en.wikipedia.org/wiki/Main_Page"

  Dim html As New HTMLDocument  ' Requires a reference to the Microsoft HTML Object Library
  html.body.innerHTML = driver.ExecuteScript("return document.body.innerHTML;")

  Debug.Print html.body.innerText

  driver.Quit
End Sub

The latest release, which works with the example above, is available at: https://github.com/florentbr/SeleniumBasic/releases/latest



Answer 2:

This should work to use a string as the source for an HTML document:

Set html = New HTMLDocument
html.body.innerHTML = selenium.pageSource

Edit: changed the Selenium call from getHtmlSource to pageSource. Full working code follows, though I'm not sure we're using the same version of Selenium:

Option Explicit

Sub foo()

    Dim sel As selenium.WebDriver
    Set sel = New selenium.WebDriver
    Dim html As HTMLDocument

    sel.Start "firefox", "about:blank"
    sel.Get "http://www.google.com/"

    ' Load the page source string into a new HTMLDocument
    Set html = New HTMLDocument
    html.body.innerHTML = sel.PageSource

    Debug.Print html.body.innerText

End Sub

This requires references to the Microsoft HTML Object Library and the Selenium Type Library (Selenium32.tlb), using SeleniumBasic version 2.0.6.0.



Answer 3:

I'm not quite sure why you'd want to convert a Selenium element to an HTMLDocument. It would add one more dependency to your project.

Personally, I prefer assigning the DOM element to a WebElement. For instance:

Dim qtyElement As WebElement
Dim qtyHtml As String

' Only look up the element if at least one match exists
If Selenium.FindElementsByClass("qty").Count > 0 Then
    Set qtyElement = Selenium.FindElementByClass("qty")
End If

If Not qtyElement Is Nothing Then
    qtyHtml = qtyElement.Attribute("innerHTML")
End If

Debug.Print qtyHtml