Using =importXML in Google Docs

2020-03-30 05:39发布

问题:

From the website https://www.google.co.uk/finance?cid=704531 I'm trying to import the fields "Range" and "52 week" and the others in that group in the middle. I've looked at XPath and am very confused by it.

In cell A1 I have https://www.google.co.uk/finance?cid=704531.

In cell B1 I have

=importxml(A1,//*[@id="market-data-div"]/div[2]/div[1]/table[1]/tbody/tr[4]/td[1])

But the result is an error.

What's the correct XPath expression for this HTML structure?

回答1:

Dom and the <tbody/>

It seems you created that XPath expression within Firebug or similar developer tools. While HTML does not require <tbody/> tags, DOM does; and those developer tools work on the DOM and thus wrap the table rows in such an element. You can easily see the difference when looking at the page source (fetched by wget or other tools if necessary) and the structure presented by Firebug.

Just omit the /tbody axis step in your expression:

//*[@id='market-data-div']/div[2]/div[1]/table[1]/tr[4]/td[1]

Anyway, I'd rather go for fetching the cells by the description, which is a little weaker regarding changes on the wording, but much more robust to changes on the structure:

//*[@id='market-data-div']//tr[td = 'Vol / Avg.']/td[2]

Quoting the Expression

A second problem is you have to quote the XPath expression in double quotes, so for example use:

=importxml(A1,"//*[@id='market-data-div']//tr[td = 'Vol / Avg.']/td[2]")