
Question:

I am new to XPath, but I can see how powerful it is. I am looking at the source code of this link and simply want to extract the content and username from the following two pieces of the page, which for simplicity's sake are located near the top of the source code.

content="[Archive] Simburgur's Live Stream [Offline] Gears of War 3"

<div class="username">Simburgur</div>

Here is my code within R:

doc <- htmlParse("http://forums.epicgames.com/archive/index.php/t-672775.html")
xpathSApply(doc, "//head/meta[@name=\"description\"]")

which returns

[[1]]
<meta name="description" content="[Archive]  Simburgur's Live Stream [Offline] Gears of War 3" />

Obviously, in this example, all I want is what is inside the quotes of content=, but I am stuck and cannot get my expression to return the string I want.

I repeat. I am new to XPath. :)

Answer 1:

Use:

/*/head/meta[@name='description']/@content

This still selects an attribute node, but there is probably an easy way in your programming language to get the string value of the attribute.

To get just the string value, use:

string(/*/head/meta[@name='description']/@content)
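In R, one way to get at the string is to select the meta element and let the XML package read the attribute. The following is a minimal sketch, not a definitive solution: it assumes the XML package is installed, and it parses an inline copy of the two fragments quoted in the question (the variable names html, description, and username are made up for illustration) so that it runs without fetching the forum page.

```r
library(XML)

# Inline copy of the two fragments quoted in the question (an assumption made
# so the sketch runs offline instead of downloading the forum page).
html <- '<html><head>
<meta name="description" content="[Archive] Simburgur\'s Live Stream [Offline] Gears of War 3" />
</head><body><div class="username">Simburgur</div></body></html>'

doc <- htmlParse(html, asText = TRUE)

# Select the <meta> element, then apply xmlGetAttr() to each match to read
# its "content" attribute as a plain character string.
description <- xpathSApply(doc, "/*/head/meta[@name='description']",
                           xmlGetAttr, "content")

# xmlValue() returns the text content of each matched element.
username <- xpathSApply(doc, "//div[@class='username']", xmlValue)

print(description)
print(username)
```

Passing a function such as xmlGetAttr or xmlValue as the third argument of xpathSApply converts each matched node to a string, so no separate post-processing step is needed.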

Do note: Using the // abbreviation may result in very slow evaluation of the XPath expression, because it may cause a linear traversal of a whole (sub)tree.

Always avoid using // if the structure of the XML document is statically known.

Answer 2:

You're close. This should do it.

//head/meta[@name=\"description\"]/@content

The brackets are constraining the choice of meta tags, but you still have to specify the attribute you want.
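Plugged back into the R code from the question, the expression could be used roughly as follows. This is a sketch under assumptions: it parses an inline copy of the meta tag rather than the live page, and the names html and content are hypothetical. As far as I know, the XML package returns attribute nodes as a named character vector, so unname() is used to leave just the string.

```r
library(XML)

# Inline stand-in for the page head (an assumption, so the sketch runs offline).
html <- '<html><head>
<meta name="description" content="[Archive] Simburgur\'s Live Stream [Offline] Gears of War 3" />
</head><body></body></html>'

doc <- htmlParse(html, asText = TRUE)

# The predicate in brackets picks the right <meta> tag; /@content then
# selects that tag's attribute rather than the whole node.
content <- xpathSApply(doc, "//head/meta[@name=\"description\"]/@content")

# Drop the attribute name so only the character value remains.
content <- unname(content)

print(content)
```

The backslash-escaped double quotes are only needed because the XPath expression sits inside a double-quoted R string; single quotes inside the expression work equally well.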

XPath within R using XML package
