I'm trying to scrape data from the website http://uk.investing.com/rates-bonds/financial-futures via VBA, such as real-time prices (e.g. German 5 YR Bobl, US 30Y T-Bond). I have tried an Excel web query, but it only scrapes the whole website, and I would like to scrape the rate only. Is there a way of doing this?
This question was asked long ago, but I thought the following information would be useful for newbies. Actually, you can easily get the values by class name like this.
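A minimal sketch of reading a value by class name, under stated assumptions: `doc` is a document object that has already been loaded (for example the `document` of an Internet Explorer instance), the helper name `TextByClassName` is hypothetical, and the class name has to be looked up in the page source because none is given in this thread. `getElementsByClassName` is only available when the page renders in IE9 mode or later.

```vba
' Hypothetical helper: returns the text of the first element that carries
' the given CSS class, or an empty string when no element matches.
Function TextByClassName(ByVal doc As Object, ByVal className As String) As String
    Dim matches As Object

    ' Collect every element whose class attribute contains className.
    Set matches = doc.getElementsByClassName(className)

    If matches.Length > 0 Then
        ' Index 0 is the first match in document order.
        TextByClassName = matches.Item(0).innerText
    End If
End Function
```

It could be called as `Range("A1").Value = TextByClassName(appIE.document, "your-class-name")`, where both `appIE` and the class name are placeholders.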
And if you are new to web scraping please read this blog post.
Web Scraping - Basics
There are also various techniques for extracting data from web pages. This article explains a few of them with examples.
Web Scraping - Collecting Data From a Webpage
You can use a WinHttpRequest object instead of Internet Explorer. It fetches only the page's HTML, without downloading the pictures and advertisements that come with the full webpage, which makes it much lighter than the Internet Explorer object.
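A minimal sketch of such a request, assuming late binding to `WinHttp.WinHttpRequest.5.1`; the function name `FetchHtml` is hypothetical, and the site may require extra headers that are not shown here.

```vba
' Hypothetical helper: downloads the raw HTML of a page without a browser.
Function FetchHtml(ByVal url As String) As String
    Dim http As Object

    ' Late binding, so no library reference has to be ticked in the VBA editor.
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    http.Open "GET", url, False   ' False = synchronous call
    http.Send

    ' Only hand back the body when the server answered 200 OK.
    If http.Status = 200 Then
        FetchHtml = http.ResponseText
    End If
End Function

Sub FetchHtmlDemo()
    Dim html As String
    html = FetchHtml("http://uk.investing.com/rates-bonds/financial-futures")
    Debug.Print Len(html) & " characters received"
End Sub
```

The returned string still has to be parsed, for example by loading it into an HTML document object.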
Other methods were mentioned so let us please acknowledge that, at the time of writing, we are in the 21st century. Let's park the local bus browser opening, and fly with an XMLHTTP GET request (XHR GET for short).
Wiki moment:
It's a fast method for retrieving data that doesn't require opening a browser. The server response can be read into an HTMLDocument and the process of grabbing the table continued from there.
In the code below, the table is grabbed by its id, `cr1`. In the helper sub, `WriteTable`, we loop the columns (`td` tags) and then the table rows (`tr` tags), and finally traverse the length of each table row, table cell by table cell. As we only want data from columns 1 and 8, a `Select Case` statement is used to specify what is written out to the sheet.
[Figure: sample webpage view]
[Figure: sample code output]
VBA:
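What follows is a minimal sketch of the approach described above, not necessarily the exact listing from this answer. Assumptions: late binding to `MSXML2.XMLHTTP` and `htmlfile`, output to the first worksheet of the workbook, columns counted from 1, and a simplified loop order (rows first, then cells). The sub name `GetRates` is hypothetical, while `cr1` and `WriteTable` come from the text.

```vba
Option Explicit

Public Sub GetRates()
    Dim html As Object, hTable As Object

    Set html = CreateObject("htmlfile")

    ' XHR GET: fetch the page source without opening a browser.
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "http://uk.investing.com/rates-bonds/financial-futures", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' The table is located by its id, as described in the text.
    Set hTable = html.getElementById("cr1")
    If hTable Is Nothing Then
        MsgBox "Table cr1 was not found; the page layout may have changed."
        Exit Sub
    End If

    WriteTable hTable, 1, ThisWorkbook.Worksheets(1)
End Sub

Public Sub WriteTable(ByVal hTable As Object, ByVal startRow As Long, ByVal ws As Worksheet)
    Dim tRow As Object, tCell As Object
    Dim r As Long, c As Long

    r = startRow
    For Each tRow In hTable.getElementsByTagName("tr")
        c = 0
        For Each tCell In tRow.getElementsByTagName("td")
            c = c + 1
            ' Keep only columns 1 and 8 of the web table.
            Select Case c
                Case 1: ws.Cells(r, 1).Value = tCell.innerText
                Case 8: ws.Cells(r, 2).Value = tCell.innerText
            End Select
        Next tCell
        ' Rows without td cells (e.g. the header row) do not consume a sheet row.
        If c > 0 Then r = r + 1
    Next tRow
End Sub
```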
I modified some things that were throwing errors for me and ended up with this, which worked great to extract the data I needed:
There are several ways of doing this. I am writing this answer in the hope that all the basics of Internet Explorer automation can be found by anyone searching for the keywords "scraping data from website", but remember that nothing is worth as much as your own research (if you don't want to stick to pre-written code that you're not able to customize).
Please note that this is one way, which I don't prefer in terms of performance (since it depends on the browser's speed), but it is good for understanding the rationale behind Internet automation.
1) If I need to browse the web, I need a browser! So I create an Internet Explorer browser:
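A minimal sketch of this step, assuming late binding so that no library reference is needed; the sub name `CreateBrowser` is hypothetical.

```vba
Sub CreateBrowser()
    Dim appIE As Object

    ' Start a new Internet Explorer instance that VBA can control.
    Set appIE = CreateObject("InternetExplorer.Application")

    ' ... steps 2 to 5 use appIE from here on ...
End Sub
```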
2) I ask the browser to browse the target webpage. Through the `.Visible` property, I decide whether I want to see the browser doing its job or not. While building the code it is nice to have `Visible = True`, but once the code is working and scraping data it is nicer not to see the browser every time, so `Visible = False`.
3) The webpage will need some time to load. So, I will wait while it's busy...
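Steps 2 and 3 could be sketched as follows. The sub name `LoadPage` and its parameters are hypothetical, and the extra `readyState` test is my own addition on top of the `Busy` check described in the text.

```vba
' Hypothetical helper: navigates an existing IE instance and waits for the page.
Sub LoadPage(ByVal appIE As Object, ByVal url As String, _
             Optional ByVal showWindow As Boolean = True)
    With appIE
        .Navigate url
        .Visible = showWindow   ' True while developing, False in normal use
    End With

    ' Wait until the browser has finished loading.
    Do While appIE.Busy Or appIE.readyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents   ' keep Excel responsive while waiting
    Loop
End Sub
```

It would be called as `LoadPage appIE, "http://uk.investing.com/rates-bonds/financial-futures", False`.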
4) Well, now the page is loaded. Let's say that I want to scrape the change of the US 30Y T-Bond. I just press F12 in Internet Explorer to see the webpage's code, and then, using the pointer (in red circle), I click on the element that I want to scrape to see how I can reach it.
5) What I should do is straightforward. First of all, I will get, by its ID property, the `tr` element that contains the value:
Here I will get a collection of `td` elements (specifically, `tr` is a row of data, and the `td` are its cells). We are looking for the 8th, so I will write:
Why did I write 7 instead of 8? Because the collection of cells starts from 0, so the index of the 8th element is 7 (8-1). Briefly analysing this line of code: `.Cells()` gives me access to the `td` elements; `innerHTML` is the property of the cell containing the value we are looking for.
Once we have our value, which is now stored in the `myValue` variable, we can just close the IE browser and release the memory by setting it to Nothing:
Well, now you have your value and you can do whatever you want with it: put it into a cell (`Range("A1").Value = myValue`), or into a label of a form (`Me.label1.Caption = myValue`).
I'd just like to point out that this is not how StackOverflow works: here you post questions about specific coding problems, but you should do your own search first. The only reason I'm answering a question that doesn't show much research effort is that I have seen it asked several times and, back when I learned how to do this, I remember that I would have liked some better support to get started with. So I hope that this answer, which is just a "study input" and not at all the best or most complete solution, can be a support for the next user who has the same problem. I learned how to program thanks to this community, and I like to think that you and other beginners might use my input to discover the beautiful world of programming.
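For reference, step 5 and the clean-up could be sketched like this. The names `ReadRowCell` and `CloseBrowser` are hypothetical, `appIE` is assumed to be an Internet Explorer instance that has already loaded the page, and the row ID must be read from the page source with F12 because it is not given in the text.

```vba
' Hypothetical helper: returns the innerHTML of one cell of the row with the given id.
Function ReadRowCell(ByVal appIE As Object, ByVal rowId As String, _
                     ByVal cellIndex As Long) As String
    Dim rowOfData As Object

    ' Get the tr element by its ID property.
    Set rowOfData = appIE.document.getElementById(rowId)

    If Not rowOfData Is Nothing Then
        ' Cells are zero-based: index 7 is the 8th td of the row.
        ReadRowCell = rowOfData.Cells(cellIndex).innerHTML
    End If
End Function

' Hypothetical helper: closes the browser and releases the object.
Sub CloseBrowser(ByRef appIE As Object)
    appIE.Quit
    Set appIE = Nothing   ' ByRef, so the caller's variable is cleared too
End Sub
```

A caller would then write `myValue = ReadRowCell(appIE, "row-id-from-F12", 7)`, followed by `CloseBrowser appIE` and `Range("A1").Value = myValue`.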
Enjoy your practice ;)