Related to This topic
Problem is as follows, imagine an XML without any particular schema
<persons>
<total>2</total>
<someguy>
<firstname>john</firstname>
<name>Snow</name>
</someguy>
<otherperson>
<sex>female</sex>
</otherperson>
</persons>
For processing I want to have this in a Key Value Map:
"Persons/total" -> 2
"Persons/someguy/firstname" -> john
"Persons/someguy/name" -> Snow
"Persons/otherperson/sex" -> female
Preferably I have some nice recursive function where I traverse the XML code depth-first and simply stack all labels until I find a value and return that value together with the stack of labels. Unfortunately I am struggling to connect the return type with the input type as I return a Sequence of my input.. Let me show you what I have so far, clearly the foreach is a problem as this returns Unit, but the map would also not work as it returns a Seq.
def dfs(n: NodeSeq, keyStack: String, map: Map[String,String])
:(NodeSeq, String, Map[String,String]) = {
n.foreach(x => {
if (x.child.isEmpty) {
dfs(x.child, keyStack, map + (keyStack+ x.label + " " -> x.text))
}
else {
dfs(x.child, keyStack+ x.label + "/", map)
}
}
)
}
Would greatly appreciate the help!
One scenario to consider, is when an element has a prefix:
There are other scenarios to consider (including entities, comments, declarations), but this is a good place to start:
Another scenario to consider is when text isn't at a leaf element:
After some playing around, this is the most elegant way in which I could do it. What I don't like is:
Please improve if you have ideas!
We still need to flatten the Map afterwards but at least the recursive part is rather short. Code above should work in your Scala worksheet.
Inspired by Sparky's answer, but suitable even for more generalized case:
For XML like
it generates a Map[String,List[String]] below (after .toString()):