How to parse XML for <![CDATA[]]>

2019-01-15 02:19发布

How to parse a XML having data included in <![CDATA[---]... how can we parse the xml and get the data included in CDATA ???

4条回答
虎瘦雄心在
2楼-- · 2019-01-15 02:39

here r.get().getResponseBody() is the response body

Document doc = getDomElement(r.get().getResponseBody());            
    NodeList nodes = doc.getElementsByTagName("Title");
    for (int i = 0; i < nodes.getLength(); i++) {
    Element element = (Element) nodes.item(i);
    NodeList title = element.getElementsByTagName("Child tag where cdata present");
    Element line = (Element) title.item(0);
    System.out.println("Title: "+ getCharacterDataFromElement(line));


    public static Document getDomElement(String xml) {
        Document doc = null;
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setCoalescing(true);
        dbf.setNamespaceAware(true);
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(xml));
            doc = db.parse(is);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return doc;
    }

    public static String getCharacterDataFromElement(Element e) {
        Node child = e.getFirstChild();
        if (child instanceof CharacterData) {
            CharacterData cd = (CharacterData) child;
            return cd.getData();
        }
        return "";
    }
查看更多
疯言疯语
3楼-- · 2019-01-15 02:53
public static void main(String[] args) throws Exception {
  File file = new File("data.xml");
  DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
 //if you are using this code for blackberry xml parsing
  builder.setCoalescing(true);
  Document doc = builder.parse(file);

  NodeList nodes = doc.getElementsByTagName("topic");
  for (int i = 0; i < nodes.getLength(); i++) {
    Element element = (Element) nodes.item(i);
    NodeList title = element.getElementsByTagName("title");
    Element line = (Element) title.item(0);
    System.out.println("Title: " + getCharacterDataFromElement(line));
  }
}
public static String getCharacterDataFromElement(Element e) {
  Node child = e.getFirstChild();
  if (child instanceof CharacterData) {
    CharacterData cd = (CharacterData) child;
    return cd.getData();
  }
  return "";
}

( http://www.java2s.com/Code/Java/XML/GetcharacterdataCDATAfromxmldocument.htm )

查看更多
萌系小妹纸
4楼-- · 2019-01-15 02:55

CDATA just says that the included data should not be escaped. So, just take the tag text. XML parser should return the clear data without CDATA.

查看更多
甜甜的少女心
5楼-- · 2019-01-15 02:55

Since all previous answers are using a DOM based approach. This is how to parse CDATA with a stream based approach using STAX.

Use the following pattern:

  switch (EventType) {
        case XMLStreamConstants.CHARACTERS:
        case XMLStreamConstants.CDATA:
            System.out.println(r.getText());
            break;
        default:
            break;
        }

Complete sample:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public void readCDATAFromXMLUsingStax() {
    String yourSampleFile = "/path/toYour/sample/file.xml";
    XMLStreamReader r = null;
    try (InputStream in =
            new BufferedInputStream(new FileInputStream(yourSampleFile));) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        r = factory.createXMLStreamReader(in);
        while (r.hasNext()) {
            switch (r.getEventType()) {
            case XMLStreamConstants.CHARACTERS:
            case XMLStreamConstants.CDATA:
                System.out.println(r.getText());
                break;
            default:
                break;
            }
            r.next();
        }
    } catch (Exception e) {
        throw new RuntimeException(e);
    } finally {
        if (r != null) {
            try {
                r.close();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}

With /path/toYour/sample/file.xml

 <data>
    <![CDATA[ Sat Nov 19 18:50:15 2016 (1672822)]]>
    <![CDATA[Sat, 19 Nov 2016 18:50:14 -0800 (PST)]]>
 </data>

Gives:

 Sat Nov 19 18:50:15 2016 (1672822)                             
 Sat, 19 Nov 2016 18:50:14 -0800 (PST)       
查看更多
登录 后发表回答