I am trying to scrape a site with this code to extract names and contacts ...
Sub Test()
Dim htmlDoc As Object
Dim htmlDoc2 As Object
Dim elem As Variant
Dim tag As Variant
Dim dns As String
Dim pageSource As String
Dim pageSource2 As String
Dim url As String
Dim row As Long
row = 2
dns = "https://www.zillow.com/detroit-mi/real-estate-agent-reviews/"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", dns, True
.send
While .readyState <> 4: DoEvents: Wend
If .statusText <> "OK" Then
MsgBox "ERROR " & .Status & " - " & .statusText, vbExclamation
Exit Sub
End If
pageSource = .responseText
End With
Set htmlDoc = CreateObject("htmlfile")
htmlDoc.body.innerHTML = pageSource
Dim xx
'Got error here
Set xx = htmlDoc.getElementsByClassName("ldb-contact-summary")
Set htmlDoc = Nothing
Set htmlDoc2 = Nothing
End Sub
When I try to use this line:
Set xx = htmlDoc.getElementsByClassName("ldb-contact-summary")
I get error 438, 'Object doesn't support this property or method'. Can you help me, please? I am not very experienced with scraping.
Since you mention in the comments above that you want all the pages, I use a class to hold the XMLHTTP object and give it methods to extract the data, including a method that finds the number of results pages so they can be looped over. Testing this gave me 251 rows of results.
Note: Debugging showed that, for you, keeping the SetRequestHeader call triggered requests for human verification. Removing it meant the XMLHTTP method worked. For me it worked both with and without the header.
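On the 438 error itself: a commonly reported cause is that a late-bound document created with CreateObject("htmlfile") does not expose getElementsByClassName, whereas an early-bound MSHTML.HTMLDocument does. The following is a minimal sketch of that idea, not a confirmed diagnosis for your machine. It assumes a reference to Microsoft HTML Object Library has been added (VBE > Tools > References), and it uses a stand-in HTML string instead of the real response. The procedure name is hypothetical.

```vba
Option Explicit

' Assumption: the project references "Microsoft HTML Object Library" (early binding).
Public Sub EarlyBoundClassLookup()
    Dim html As MSHTML.HTMLDocument
    Dim nodes As Object

    Set html = New MSHTML.HTMLDocument

    ' Stand-in markup; in the question this would be the XMLHTTP .responseText
    html.body.innerHTML = "<div class=""ldb-contact-summary"">Sample agent</div>"

    ' Early binding exposes getElementsByClassName on the document
    Set nodes = html.getElementsByClassName("ldb-contact-summary")

    Debug.Print "Matches found: " & nodes.Length
    If nodes.Length > 0 Then Debug.Print nodes.Item(0).innerText
End Sub
```

If this runs without error 438, swapping the stand-in string for pageSource in the question's code should behave the same way.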
Class clsHTTP
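A sketch of what such a class could look like, offered as an illustration rather than the exact code that produced the 251 rows. It assumes a reference to Microsoft HTML Object Library. The method names (GetString, GetPageCount, GetInfo) and the two selector constants are assumptions: only ldb-contact-summary appears in the question, so inspect the page and adjust the selectors before relying on them.

```vba
Option Explicit

' Class module: clsHTTP (illustrative sketch)
' Assumption: the project references "Microsoft HTML Object Library".

' Hypothetical selectors; verify them against the live page.
Private Const NAME_SELECTOR As String = ".ldb-contact-name"
Private Const PHONE_SELECTOR As String = ".ldb-phone-number"

Private http As Object

Private Sub Class_Initialize()
    Set http = CreateObject("MSXML2.XMLHTTP")
End Sub

' Synchronous GET; returns the response text or raises on a non-200 status.
Public Function GetString(ByVal url As String) As String
    With http
        .Open "GET", url, False
        .send
        If .Status <> 200 Then
            Err.Raise vbObjectError + 513, "clsHTTP.GetString", _
                      "HTTP " & .Status & " for " & url
        End If
        GetString = .responseText
    End With
End Function

' Largest number shown in the pager links; 1 if no numeric link is found.
Public Function GetPageCount(ByVal html As MSHTML.HTMLDocument, _
                             ByVal pagerSelector As String) As Long
    Dim links As Object, i As Long, n As Long, maxPage As Long

    maxPage = 1
    Set links = html.querySelectorAll(pagerSelector)
    For i = 0 To links.Length - 1
        n = CLng(Val(links.Item(i).innerText))   ' non-numeric text such as "Next" gives 0
        If n > maxPage Then maxPage = n
    Next i
    GetPageCount = maxPage
End Function

' Returns a 1-based 2D array (name, phone), or Empty if nothing matched.
' Names and phones are paired by position, so a card without a phone
' number would shift the pairing; treat this as a simplification.
Public Function GetInfo(ByVal html As MSHTML.HTMLDocument) As Variant
    Dim nameNodes As Object, phoneNodes As Object
    Dim results() As String
    Dim i As Long, n As Long

    Set nameNodes = html.querySelectorAll(NAME_SELECTOR)
    Set phoneNodes = html.querySelectorAll(PHONE_SELECTOR)
    n = nameNodes.Length
    If n = 0 Then Exit Function

    ReDim results(1 To n, 1 To 2)
    For i = 0 To n - 1
        results(i + 1, 1) = nameNodes.Item(i).innerText
        If i < phoneNodes.Length Then results(i + 1, 2) = phoneNodes.Item(i).innerText
    Next i
    GetInfo = results
End Function
```

Holding the XMLHTTP object in the class means one object is created and reused for every page request.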
Standard module 1
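A sketch of a calling procedure, assuming a class module named clsHTTP that exposes GetString, GetPageCount and GetInfo (hypothetical method names) and a reference to Microsoft HTML Object Library. The pager selector and the "?page=" query parameter are assumptions about the site, not something the question confirms. Results are written to the active sheet starting at row 2, as in the question.

```vba
Option Explicit

Public Sub ScrapeAgents()
    Const BASE_URL As String = "https://www.zillow.com/detroit-mi/real-estate-agent-reviews/"
    Const PAGER_SELECTOR As String = ".zsg-pagination a"   ' hypothetical; inspect the page

    Dim http As clsHTTP
    Dim html As MSHTML.HTMLDocument
    Dim results As Variant
    Dim pageCount As Long, pageNo As Long, nextRow As Long

    Set http = New clsHTTP
    Set html = New MSHTML.HTMLDocument
    nextRow = 2

    ' The first page also tells us how many pages to loop over
    html.body.innerHTML = http.GetString(BASE_URL)
    pageCount = http.GetPageCount(html, PAGER_SELECTOR)

    For pageNo = 1 To pageCount
        If pageNo > 1 Then
            ' Assumption: later pages are addressed with a "page" query parameter
            html.body.innerHTML = http.GetString(BASE_URL & "?page=" & pageNo)
        End If

        results = http.GetInfo(html)
        If Not IsEmpty(results) Then
            ' Write the whole page of results in one assignment
            ActiveSheet.Cells(nextRow, 1) _
                .Resize(UBound(results, 1), UBound(results, 2)).Value = results
            nextRow = nextRow + UBound(results, 1)
        End If
    Next pageNo
End Sub
```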
Selenium:
To get the names and their corresponding phone numbers, you can try the below snippet:
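A minimal sketch of a Selenium version, assuming SeleniumBasic is installed with a matching ChromeDriver and the project references Selenium Type Library. Only ldb-contact-summary comes from the question; the name and phone selectors are hypothetical and should be checked against the page. FindElementByCss raises an error if a card lacks the element, so add handling if some agents have no phone number listed.

```vba
Option Explicit

' Assumption: SeleniumBasic installed; project references "Selenium Type Library".
Public Sub ScrapeWithSelenium()
    Const TARGET_URL As String = "https://www.zillow.com/detroit-mi/real-estate-agent-reviews/"
    Const NAME_SELECTOR As String = ".ldb-contact-name"     ' hypothetical
    Const PHONE_SELECTOR As String = ".ldb-phone-number"    ' hypothetical

    Dim driver As New ChromeDriver
    Dim card As Object
    Dim r As Long

    r = 2
    driver.Get TARGET_URL

    ' One card per agent; look up name and phone inside each card
    For Each card In driver.FindElementsByCss(".ldb-contact-summary")
        ActiveSheet.Cells(r, 1).Value = card.FindElementByCss(NAME_SELECTOR).Text
        ActiveSheet.Cells(r, 2).Value = card.FindElementByCss(PHONE_SELECTOR).Text
        r = r + 1
    Next card

    driver.Quit
End Sub
```

Searching inside each card keeps a name and its phone number together, at the cost of driving a real browser.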
Reference to add to the library to execute the above script: