pugixml - get all text nodes (PCDATA), not just th

Posted 2019-07-07 09:21

Question:

Currently, if I try to parse

<parent>
    First bit of text
    <child>
    </child>
    Second bit of text
</parent>

I only get First bit of text with

parent.text().get()

What's the correct way to grab all text nodes in parent?

  1. Is there a nice utility function for this?
  2. How could it be done by iterating through all children?

Answer 1:

There is no function that concatenates all text; if you want to get a list of text node children, you have two options:

  1. XPath query:

    pugi::xpath_node_set ns = parent.select_nodes("text()");
    
    for (size_t i = 0; i < ns.size(); ++i)
        std::cout << ns[i].node().value() << std::endl;
    
  2. Manual iteration with type checking:

    for (pugi::xml_node child = parent.first_child(); child; child = child.next_sibling())
        if (child.type() == pugi::node_pcdata)
            std::cout << child.value() << std::endl;
    

Note that if you can use C++11 then the second option can be much more concise:

for (pugi::xml_node child: parent.children())
    if (child.type() == pugi::node_pcdata)
        std::cout << child.value() << std::endl;

(Of course, you can also use a range-based for loop to iterate through an xpath_node_set.)



Answer 2:

In the version of pugixml I have, I can use the print method to write a node's XML into a stream, e.g.:

    std::stringstream ss;
    node.print(ss);
    return ss.str();

Note that this yields the serialized XML of the node, markup included, rather than only its text.



Tags: c++ xml pugixml