How do I get Nokogiri to understand my namespaces?

I have the following XML document:

<samlp:LogoutRequest ID="123456789" Version="2.0" IssueInstant="200904051217">
  <saml:NameID>@NOT_USED@</saml:NameID>
  <samlp:SessionIndex>abcdefg</samlp:SessionIndex>
</samlp:LogoutRequest>

I'd like to get the content of the SessionIndex (that is, 'abcdefg') out of it. I've tried this:

XPATH_QUERY = "LogoutRequest[@ID][@Version='2.0'][IssueInstant]/SessionIndex"
SAML_XMLNS  = 'urn:oasis:names:tc:SAML:2.0:assertion'
SAMLP_XMLNS = 'urn:oasis:names:tc:SAML:2.0:protocol'

require 'nokogiri'
doc = Nokogiri::XML(xml)
doc.xpath(XPATH_QUERY, 'saml' => SAML_XMLNS, 'samlp' => SAMLP_XMLNS)

but I get the following errors:

Nokogiri::XML::SyntaxError: Namespace prefix samlp on LogoutRequest is not defined
Nokogiri::XML::SyntaxError: Namespace prefix saml on NameID is not defined
Nokogiri::XML::SyntaxError: Namespace prefix samlp on SessionIndex is not defined

I've tried adding the namespaces to the XPath query, but that doesn't change anything.

Why can't I convince Nokogiri that the namespaces are valid?

Tags: xml ruby xpath nokogiri

2 answers

你好瞎i

#2 · 2019-01-24 16:30

I see two different options for you:

1. Remove all the namespaces

   http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Document#remove_namespaces%21-instance_method

   The brute-force way of doing it. It can cause problems when there are namespace collisions, because elements that share a local name but belong to different namespaces can no longer be told apart.

2. Use collect_namespaces

   http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Document#collect_namespaces-instance_method

   A much better solution. You could use this once to identify the namespaces (say, in irb) and hard-code them, or use it at runtime and supply the result as the second argument to http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Node#xpath-instance_method

成全新的幸福

#3 · 2019-01-24 16:33

It doesn't look like the namespaces in this document are correctly declared: there should be xmlns:samlp and xmlns:saml attributes on the root node. In cases like this, Nokogiri essentially ignores the namespaces (as it can't map the prefixes to URIs or URNs), so your XPath works if you remove the namespace arguments, i.e.

doc.xpath(XPATH_QUERY)
