Simple XPath query in Scala

Posted 2020-06-17 07:37

Question:

I'm trying to run an XPath query with Scala and it doesn't seem to work. My XML looks like this (simplified):

<application>
  <process type="output" size ="23"> 
     <channel offset="0"/>
      ....
     <channel offset="4"/>
  </process>
  <process type="input" size ="16"> 
     <channel offset="20"/>
      ....
     <channel offset="24"/>
  </process>
</application>

I want to retrieve the process whose type attribute is "input", and for that I use this XPath query:

//process[@type='input']

This should work; I verified it with xpathtester. Now, my Scala code looks like:

import scala.xml._
val x = XML.loadFile("file.xml")

val process = (x \\ "process[@type='input']")  // will return empty NodeSeq() !!!

The process value ends up empty; it doesn't capture what I want. I worked around it like this:

val process = (x \\ "process" filter( _ \"@type" contains Text("input")))

which is much uglier. Any known reason why my original query shouldn't work?

Answer 1:

"XPath" should not be used to describe what the Scala standard library supports. XPath is a full-fledged expression language, with so far two final versions and a third in the works:

  • XPath 1.0 from 1999
  • XPath 2.0 from 2007 (2nd edition 2010)
  • XPath 3.0 from 2013 (candidate recommendation)

At best you could say that Scala has a very small subset of XPath-inspired operations. So you can't expect to take XPath expressions and paste them directly into Scala without doing a bit more work.
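
To make that subset concrete, here is a minimal sketch, assuming the scala-xml module is on the classpath. The `\` and `\\` operators only match an element name (or `@attr`), so the whole string "process[@type='input']" is compared against element labels and matches nothing. The predicate has to be written as an ordinary Scala filter instead. The object and value names below are made up for illustration.

```scala
import scala.xml._

object XPathLikeDemo {
  // Same shape as the XML in the question, with the elided channels left out.
  val doc: Elem = XML.loadString(
    """<application>
      |  <process type="output" size="23">
      |    <channel offset="0"/>
      |    <channel offset="4"/>
      |  </process>
      |  <process type="input" size="16">
      |    <channel offset="20"/>
      |    <channel offset="24"/>
      |  </process>
      |</application>""".stripMargin)

  // The predicate is not parsed: this looks for an element whose label is
  // literally "process[@type='input']", so the result is empty.
  val pasted: NodeSeq = doc \\ "process[@type='input']"

  // Supported: select descendants by element name only.
  val allProcesses: NodeSeq = doc \\ "process"

  // The XPath predicate, expressed as a plain filter on the attribute text.
  val inputs: NodeSeq =
    allProcesses.filter(p => (p \ "@type").text == "input")

  // Child step plus attribute access, converted to Int by hand.
  val offsets: Seq[Int] =
    (inputs \ "channel").map(c => (c \ "@offset").text.toInt)

  def main(args: Array[String]): Unit = {
    println(pasted.isEmpty)
    println(inputs.length)
    println(offsets)
  }
}
```

Comparing `.text` rather than using `contains Text("input")` avoids depending on how the attribute value happens to be represented as nodes.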

Third-party libraries can give you better support for actual XPath expressions, including:

  • Scales Xml
    • Scala library
    • "provides a far more XPath like experience than the normal Scala XML, Paths look like XPaths and work like them too (with many of the same functions and axes)"
    • it's still not actual XPath, if I understand correctly
    • designed to integrate well with Scala
  • Saxon
    • Java library
    • open source
    • complete and conformant support for XPath 2 (and XSLT 2)
    • has an XPath API which works on DOM and other data models, but no specific Scala support at this time
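
If real XPath over DOM is enough, neither library is strictly required: the JDK ships an XPath 1.0 engine behind the standard `javax.xml.xpath` API, and it can be called from Scala as-is. The sketch below uses that built-in engine, not Saxon or Scales Xml, and the object and value names are only illustrative.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import javax.xml.xpath.{XPathConstants, XPathFactory}
import org.w3c.dom.{Element, NodeList}
import org.xml.sax.InputSource

object JaxpXPathDemo {
  // Same shape as the XML in the question, with the elided channels left out.
  val xml: String =
    """<application>
      |  <process type="output" size="23">
      |    <channel offset="0"/>
      |    <channel offset="4"/>
      |  </process>
      |  <process type="input" size="16">
      |    <channel offset="20"/>
      |    <channel offset="24"/>
      |  </process>
      |</application>""".stripMargin

  // Parse into a W3C DOM document (not a scala.xml tree).
  private val doc = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder()
    .parse(new InputSource(new StringReader(xml)))

  private val xpath = XPathFactory.newInstance().newXPath()

  // The expression from the question, evaluated unchanged.
  val matches: NodeList =
    xpath.evaluate("//process[@type='input']", doc, XPathConstants.NODESET)
      .asInstanceOf[NodeList]

  // NodeList is not a Scala collection, so index into it explicitly.
  val sizes: Seq[String] =
    (0 until matches.getLength).map { i =>
      matches.item(i).asInstanceOf[Element].getAttribute("size")
    }

  def main(args: Array[String]): Unit =
    println(sizes)
}
```

The trade-off is that the result is a DOM `NodeList`, so none of the `scala.xml` operators apply to it.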


Answer 2:

One way to do that would be to use kantan.xpath:

import kantan.xpath._
import kantan.xpath.implicits._

val input = """
     | <application>
     |   <process type="output" size ="23">
     |      <channel offset="0"/>
     |      <channel offset="4"/>
     |   </process>
     |   <process type="input" size ="16">
     |      <channel offset="20"/>
     |      <channel offset="24"/>
     |   </process>
     | </application>
     | """.stripMargin

val inputs = input.evalXPath[List[Node]](xp"//process[@type='input']")

This yields a List[Node], but you could retrieve values with more interesting types - the list of channel offsets, for example:

input.evalXPath[List[Int]](xp"//process[@type='input']/channel/@offset")
// This yields Success(List(20, 24))


Answer 3:

I believe the Scala XML implementation is not able to process XPath queries with predicates like this one. However, it's not difficult to create a small rich wrapper to reduce clutter, e.g. take a look at this thread. With the suggested wrapper you can solve your problem like so:

x \\ "process" \@ ("type", _ == "input")
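
The wrapper from that thread is not reproduced here, so the following is only a guess at what such a helper could look like, assuming scala-xml is on the classpath. `AttrSyntax`, `RichNodeSeq` and `filterAttr` are hypothetical names, not part of scala-xml. A plain method name is used instead of `\@` because recent scala-xml versions already define a one-argument `\@` on NodeSeq.

```scala
import scala.xml._

object AttrSyntax {
  // Hypothetical helper, not part of scala-xml: keep the nodes that carry
  // the named attribute and whose attribute text satisfies the predicate.
  implicit class RichNodeSeq(nodes: NodeSeq) {
    def filterAttr(name: String, p: String => Boolean): NodeSeq =
      nodes.filter { n =>
        val value = n \ ("@" + name)
        value.nonEmpty && p(value.text)
      }
  }
}

object AttrSyntaxDemo {
  import AttrSyntax._

  val x: Elem = XML.loadString(
    """<application>
      |  <process type="output" size="23"/>
      |  <process type="input" size="16"/>
      |</application>""".stripMargin)

  // Reads close to the one-liner in the answer.
  val process: NodeSeq = (x \\ "process").filterAttr("type", _ == "input")

  def main(args: Array[String]): Unit =
    println(process)
}
```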


Tags: xml scala xpath