tbody tag in XPath produced by Firebug

I'm trying to extract some data from online HTML pages using the Ruby Hpricot library. I use the Firefox extension Firebug to get the XPath of a selected item.

There is always an extra tbody step in the XPath expression it produces. In some cases I must remove the tbody step from the expression to get results, while in other cases I must keep it.

I just can't figure out when to keep the tbody tag and when not to.

Tags: ruby xpath firebug hpricot

3 Answers

男人必须洒脱

Reply #2 · 2020-07-28 07:33

With HTML 4, or with XHTML served as text/html, the parser always infers a tbody element to wrap tr elements that are direct children of a table element in the parsed markup. That is why, inside the browser DOM, an HTML table always has a tbody containing its tr elements, and why a tool like Firebug gives you a path that works against the Firefox/Mozilla DOM. I don't know what kind of parser your Ruby library uses; perhaps it uses an XML parser for XHTML documents, and an XML parser does not infer tbody elements for table elements.
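The difference can be reproduced with an XML parser. The sketch below uses REXML purely as a stand-in for a parser that does not infer tbody; whether Hpricot behaves the same way on a given page is an assumption, and the markup and the helper name `count_rows` are hypothetical.

```ruby
require "rexml/document"

# Hypothetical page source: the table has no explicit tbody.
PAGE_WITHOUT_TBODY = <<~XML
  <html><body>
    <table>
      <tr><td>a</td></tr>
      <tr><td>b</td></tr>
    </table>
  </body></html>
XML

# Hypothetical helper: number of nodes an XPath selects in an XML string.
def count_rows(source, xpath)
  doc = REXML::Document.new(source)
  REXML::XPath.match(doc, xpath).size
end

# Path as a browser tool reports it, including the inferred tbody step.
puts count_rows(PAGE_WITHOUT_TBODY, "/html/body/table/tbody/tr")
# Same path with the tbody step removed.
puts count_rows(PAGE_WITHOUT_TBODY, "/html/body/table/tr")
```

Under these assumptions the first path selects nothing, because the XML tree contains no tbody element, while the second path selects both rows.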


啃猪蹄的小仙女

Reply #3 · 2020-07-28 07:50

An HTML5 parser always adds the tbody element if it's not there explicitly; this is part of how the parsing algorithm normalizes table markup. If you want to cope with a variety of environments, using table//tr might make sense.
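A minimal sketch of the table//tr idea, again using REXML as an assumed stand-in for the actual parser. The two markup strings and the helper `row_texts` are hypothetical.

```ruby
require "rexml/document"

# Hypothetical markup: the same table with and without an explicit tbody.
WITH_TBODY = "<table><tbody><tr><td>a</td></tr><tr><td>b</td></tr></tbody></table>"
WITHOUT_TBODY = "<table><tr><td>a</td></tr><tr><td>b</td></tr></table>"

# Hypothetical helper: text of the first cell of every tr below a table.
def row_texts(source)
  doc = REXML::Document.new(source)
  # The descendant step (//) skips over tbody whether or not it is present.
  REXML::XPath.match(doc, "//table//tr").map { |tr| tr.elements["td"].text }
end

p row_texts(WITH_TBODY)
p row_texts(WITHOUT_TBODY)
```

Both calls should return the same cell texts. Note that the descendant step also matches rows of any table nested inside the outer one, so this form fits best when the page has no nested tables.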


干净又极端

Reply #4 · 2020-07-28 07:52

To account for this problem, use XPath expressions of the following kind:

 /locStep1/locStep2/.../table/YourSubExpression
|
 /locStep1/locStep2/.../table/tbody/YourSubExpression

If the table doesn't have a tbody child, then the second argument of the union operator (|) selects no nodes and the first argument of the union selects the wanted nodes.

Alternatively, if the table has a tbody child, then the first argument of the union operator selects no nodes and the second argument of the union selects the wanted nodes.

The end result: in both cases the wanted nodes are selected.
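The union approach can be tried out as follows. This is only a sketch: REXML is assumed here because it accepts the XPath 1.0 union operator, and whether Hpricot's XPath support does the same has not been verified. The documents, the concrete location steps, and the helper `select_rows` are hypothetical.

```ruby
require "rexml/document"

# One expression that covers both shapes of the table.
UNION_XPATH = "/html/body/table/tr | /html/body/table/tbody/tr"

# Hypothetical documents: identical rows, with and without tbody.
DOC_PLAIN = "<html><body><table>" \
            "<tr><td>a</td></tr><tr><td>b</td></tr>" \
            "</table></body></html>"
DOC_TBODY = "<html><body><table><tbody>" \
            "<tr><td>a</td></tr><tr><td>b</td></tr>" \
            "</tbody></table></body></html>"

# Hypothetical helper: nodes selected by the union expression.
def select_rows(source)
  doc = REXML::Document.new(source)
  REXML::XPath.match(doc, UNION_XPATH)
end

puts select_rows(DOC_PLAIN).size
puts select_rows(DOC_TBODY).size
```

In each document only one side of the union matches, so the row count should be the same in both cases.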

