Groovy XmlSlurper parse mixed text and nodes

I'm trying to parse a node in Groovy that contains mixed content (plain text interleaved with child elements that carry their own text), and I need to get the text back in the right order. For example:

<?xml version="1.0" encoding="UTF-8"?>
<root>
   <p>
      The text has
      <strong>nodes</strong>
      which need to get parsed
   </p>
</root>

Now I want to parse it so that I get the whole text but can still edit the nodes. In this example I want the result:

The text has <b>nodes</b> which need to get parsed

If I could just get a list of all children of the p element and test whether each one is a node or text, I would be happy, but I can't find any way to get that.

Tags: xml groovy xml-parsing xmlslurper

3 Answers

狗以群分

Reply #2 · 2019-07-23 14:03

Here is a working example:

def txt = '''
<root>
   <p>
      <![CDATA[The text has <strong>nodes</strong> which need to get parsed]]>
   </p>
</root>
'''
def parsed = new XmlSlurper(false,false).parseText(txt)
assert parsed.p[0].text().trim() == 'The text has <strong>nodes</strong> which need to get parsed'

I guess it's impossible to do without a CDATA section.


太酷不给撩

Reply #3 · 2019-07-23 14:21

You can use XmlUtil and XmlParser like so:

import groovy.xml.*

def xml = '''<?xml version="1.0" encoding="UTF-8"?>
<root>
   <p>
      The text has
      <strong>nodes</strong>
      which need to get parsed
   </p>
</root>'''

println XmlUtil.serialize(new XmlParser().parseText(xml).p[0])
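If the serialized markup is more than you need, the same XmlParser result can probably also be walked child by child. This is only a sketch: it assumes that XmlParser keeps text chunks as plain String entries in children(), next to the element Nodes, and labels is a name made up for illustration.

```groovy
import groovy.xml.*

def xml = '''<root>
   <p>
      The text has
      <strong>nodes</strong>
      which need to get parsed
   </p>
</root>'''

def p = new XmlParser().parseText(xml).p[0]

// children() keeps document order; text is assumed to arrive as String,
// elements as Node. trim() drops the indentation around each text chunk.
def labels = p.children().collect { child ->
    child instanceof String ? 'Text: ' + child.trim() : 'Node: ' + child.text()
}

labels.each { println it }
```

Checking for String instead of a Node class keeps the sketch independent of whether the parser classes come from groovy.util or groovy.xml.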


Deceive 欺骗

Reply #4 · 2019-07-23 14:23

OK, I found a solution I can use without any (tricky) workarounds. The thing is, a NodeChild doesn't have a method that gives you both the child text and the child nodes, but a Node does. To get Nodes, simply call childNodes() on what the slurper returns (the slurper gives you a NodeChild):

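// Assumes xml holds the document as a String; use parse(...) for a File, InputStream or URI.
def root = new XmlSlurper().parseText(xml)

// childNodes() iterates over the underlying Node objects (here: <p>)
root.childNodes().each { target ->
    // Node.children() returns text (String) and child Nodes in document order
    for (s in target.children()) {
        // With groovy.xml.XmlSlurper (Groovy 3+) the class is groovy.xml.slurpersupport.Node
        if (s instanceof groovy.util.slurpersupport.Node) {
            println "Node: " + s.text()
        } else {
            // trim() drops the indentation and line breaks around the text chunk
            println "Text: " + s.trim()
        }
    }
}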

This gives me the result:

Text: The text has
Node: nodes
Text: which need to get parsed

This means I can easily do whatever I want with my nodes while they stay in the right order relative to the text.
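To get from that loop to the single line asked for in the question, something along these lines might work. It is a rough sketch rather than a finished solution: render is a hypothetical helper, the strong-to-b renaming is hard-coded for this example, and nested elements or attributes are not handled.

```groovy
import groovy.xml.*

def xml = '''<root>
   <p>
      The text has
      <strong>nodes</strong>
      which need to get parsed
   </p>
</root>'''

def root = new XmlSlurper().parseText(xml)

// childNodes() yields the underlying Node objects; the first one is <p>
def p = root.childNodes().next()

// Hypothetical helper: renders one child of <p>, renaming <strong> to <b>
def render = { child ->
    if (child instanceof String) {
        return child.trim()          // plain text chunk
    }
    def tag = child.name() == 'strong' ? 'b' : child.name()
    return '<' + tag + '>' + child.text() + '</' + tag + '>'
}

// children() keeps text and elements in document order
def result = p.children().collect(render).join(' ')
println result
```

Joining with a single space is a simplification; it replaces the original whitespace between the chunks.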

