I am hoping someone can help....
I have 8000 values in a excel spreadsheet that I need to search in a website and then record a specific line of data from the website to be inputted back into the excel spreadsheet.
I have found a previous post which searches for the data I am looking at excel macro to search a website and extract results
with the code being;
Sub URL_Get_ABN_Query()
strSearch = Range("a1")
With ActiveSheet.QueryTables.Add( _
Connection:="URL;http://www.abr.business.gov.au/SearchByABN.aspx?SearchText=" & strSearch & "&safe=active", _
Destination:=Range("a5"))
.BackgroundQuery = True
.TablesOnlyFromHTML = True
.Refresh BackgroundQuery:=False
.SaveData = True
End With
`enter code here`
End Sub
However, when I run then Macro in Excel it collects all the data from the website like this.
I only want the 'entity type' data line to be inputted. I have searched everywhere and can not seem to find how to extend the code to only grab this line of information and input to the corresponding cell (i.e. ABN(b2)search, find input 'entity type' and paste into Company Type(c2).
Alternatively, I have also tried to find how to get the macro to fill the information vertically instead of horizontally as then I could delete the columns that are not needed, I thought this may be a simpler way to run this macro but alas again I could not find help to do it. I also tried to record the macro with developer but that did not work either.
I also need to loop it to run the next ABN and populate the corresponding field and so on (B3>C3, B4>C4, etc etc).
I would love some help figuring, I am a beginner in VBA and think what I want to do is beyond my skill level at this time. I am trying to understand through tutorials, google searches and help pages but can not seem to find how or if this can be done.
My alternative is to do this manually for each of the 8000 data points, copying each abn, searching in the website and then copying the entity type and pasting into excel, I did try this first but after a while started searching for a better way. Can you help????
You do not want a load of connections (queryTables) set up in this way. It will be so slow if even possible. At 8000 requests, provided xmlhttp is not blocked or throttled, the below method will be significantly faster. If there does appear to be slowing/blocking then add in a small wait every x requests.
If possible use xmlhttp to gather data. Use css selectors to specifically target the entity type. Store values in an array and write out with loop at end. Use a class to hold the xmlhttp object for greater efficiency. Provide your class with methods including how to handle not found (example given). Add some further optimizations e.g. given is switching off screen-updating. This assumes your search numbers are in column B from B2. The code below also does some basic checks that there is something present in column B and handles the case of there being 1 or more numbers.
Good code is modular and you want a function to return something and a sub to perform actions. A single sub/function shouldn't complete lots of tasks. You want to easily debug with code that follows the principle of single responsibility (or close to it).
class clsHTTP
Standard module:
References (VBE> Tools > References):
CSS selectors:
I use the fact the entity description is a hyperlink (
a
tag) and that its value contains the stringEntityTypeDescription
to use a css attribute = value with contains (*) operator to target.This is absolutely possible. You've got what I often find the hardest part, sourcing the information from another platform. To make this work I would separate it out a little bit and for simplicity use 2 sheets (Sheet1 with your known data and Sheet2 for the web data).
Loop through your table of ~8000 businesses. We can identify this from the UsedRange number of Rows. We know that the ABN is in column 2 (also known as B) so we copy that into the variable to pass to the function. The function will return the "Entity type:" to column 3 (C) of the same row.
Change the subroutine you created to a Function so it returns the entity type you are after. The function will save the data into Sheet2 and then return just the Entity data that we are after.