Reading CDATA XML in Java

Posted 2019-09-19 18:18

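I'm trying to parse an XML file whose values are wrapped in CDATA sections. The code runs fine and prints "Links:" in the console (about 50 times, because that's how many links I have), but the links themselves don't appear... it's just blank console space. What could I be missing?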

package Parse;

import java.io.File;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CharacterData;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XMLParse {
  public static void main(String[] args) throws Exception {
    File file = new File("c:test/returnfeed.xml");
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(file);

    NodeList nodes = doc.getElementsByTagName("video");
    for (int i = 0; i < nodes.getLength(); i++) {
      Element element = (Element) nodes.item(i);
      NodeList title = element.getElementsByTagName("videoURL");
      Element line = (Element) title.item(0);
      System.out.println("Links: " + getCharacterDataFromElement(line));
    }
  }
  public static String getCharacterDataFromElement(Element e) {
    Node child = e.getFirstChild();
    if (child instanceof CharacterData) {
      CharacterData cd = (CharacterData) child;
      return cd.getData();
    }
    return "";
  }
}

Result:

Links: 

Links: 

Links: 

Links: 

Links: 

Links: 

Links: 

Sample XML (incomplete document):

<?xml version="1.0" ?> 
<response xmlns:uma="http://websiteremoved.com/" version="1.0">

    <timestamp>
        <![CDATA[  July 18, 2012 5:52:33 PM PDT 
          ]]> 
    </timestamp>
    <resultsOffset>
        <![CDATA[  0 
          ]]> 
    </resultsOffset>
    <status>
        <![CDATA[  success 
        ]]> 
    </status>
    <resultsLimit>
        <![CDATA[  207 
        ]]> 
    </resultsLimit>
    <resultsCount>
        <![CDATA[  207 
        ]]> 
    </resultsCount>
    <videoCollection>
        <name>
            <![CDATA[  Video API 
            ]]> 
        </name>
        <count>
            <![CDATA[  207 
            ]]> 
        </count>
        <description>
            <![CDATA[  
            ]]> 
        </description>
        <videos>
            <video>
                <id>
                    <![CDATA[  8177840 
                    ]]> 
                </id>
                <headline>
                    <![CDATA[  Test1
                    ]]> 
                </headline>
                <shortHeadline>
                    <![CDATA[  Test2
                    ]]> 
                </shortHeadline>
                <description>
                    <![CDATA[ Test3

                    ]]> 
                </description>
                <shortDescription>
                    <![CDATA[ Test4

                    ]]> 
                </shortDescription>
                <posterImage>
                    <![CDATA[ http://a.com.com/media/motion/2012/0718/los_120718_los_bucher_on_howard.jpg

                    ]]> 
                </posterImage>
                <videoURL>
                    <![CDATA[ http://com/removed/2012/0718/los_120718_los_bucher_on_howard.mp4

                    ]]> 
                </videoURL>
            </video>
        </videos>
    </videoCollection>
</response>

Answer 1:

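Instead of checking only the first child, it would be prudent to check whether the node has other children. In your case (and I guess you would already know this if you had debugged the node), the node passed to getCharacterDataFromElement has multiple children: in the sample XML, the first child of videoURL is the whitespace text node that precedes the CDATA section, which is why only blanks are printed. I updated the code; this version may point you in the right direction: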

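public static String getCharacterDataFromElement(Element e) {
    // The element may have several children (whitespace text, CDATA, ...),
    // so scan all of them instead of looking only at the first one.
    NodeList list = e.getChildNodes();

    for (int index = 0; index < list.getLength(); index++) {
        if (list.item(index) instanceof CharacterData) {
            CharacterData child = (CharacterData) list.item(index);
            String data = child.getData();

            // Return the first child whose data is not just whitespace.
            if (data != null && data.trim().length() > 0) {
                return data;
            }
        }
    }
    return "";
}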
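
To see the multiple children for yourself, here is a minimal self-contained sketch that lists the node types under a videoURL element. The class name ChildNodeDump, the helper childNodeTypes, and the inline SAMPLE string (with an example.com URL) are made up for illustration; the XML only mirrors the layout of the sample in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ChildNodeDump {
  // Hypothetical stand-in for one <video> entry from the question's feed.
  static final String SAMPLE =
      "<video>\n"
          + "  <videoURL>\n"
          + "    <![CDATA[ http://example.com/clip.mp4\n"
          + "    ]]> \n"
          + "  </videoURL>\n"
          + "</video>";

  // Returns the DOM node type of every direct child of the first <tag> element.
  static short[] childNodeTypes(String xml, String tag) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    NodeList children = doc.getElementsByTagName(tag).item(0).getChildNodes();
    short[] types = new short[children.getLength()];
    for (int i = 0; i < children.getLength(); i++) {
      types[i] = children.item(i).getNodeType();
    }
    return types;
  }

  public static void main(String[] args) throws Exception {
    for (short type : childNodeTypes(SAMPLE, "videoURL")) {
      if (type == Node.TEXT_NODE) {
        System.out.println("text node");
      } else if (type == Node.CDATA_SECTION_NODE) {
        System.out.println("CDATA section");
      } else {
        System.out.println("other node type " + type);
      }
    }
  }
}
```

With the JDK's default parser settings (no coalescing), this should report a text node, then a CDATA section, then another text node. The first text node holds only the line break and indentation, which is what getFirstChild() was returning in the question.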


Answer 2:

I would consider using getTextContent():

String string = cdataNode.getTextContent();
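
As a rough sketch of how that could look for the feed in the question: getTextContent() on an element concatenates the text of all its descendants, CDATA included, so the surrounding whitespace still has to be trimmed. The class name TextContentDemo, the helper readLinks, and the inline SAMPLE document (with example.com URLs) are assumptions made for illustration, not part of the original feed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class TextContentDemo {
  // Hypothetical two-video document laid out like the sample XML.
  static final String SAMPLE =
      "<videos>\n"
          + "  <video>\n"
          + "    <videoURL>\n"
          + "      <![CDATA[ http://example.com/a.mp4\n\n      ]]> \n"
          + "    </videoURL>\n"
          + "  </video>\n"
          + "  <video>\n"
          + "    <videoURL>\n"
          + "      <![CDATA[ http://example.com/b.mp4\n\n      ]]> \n"
          + "    </videoURL>\n"
          + "  </video>\n"
          + "</videos>";

  // Collects the trimmed text of every <videoURL> element in the document.
  static List<String> readLinks(String xml) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    NodeList urls = doc.getElementsByTagName("videoURL");
    List<String> links = new ArrayList<>();
    for (int i = 0; i < urls.getLength(); i++) {
      // getTextContent() includes the whitespace around the CDATA section, so trim it.
      links.add(urls.item(i).getTextContent().trim());
    }
    return links;
  }

  public static void main(String[] args) throws Exception {
    for (String link : readLinks(SAMPLE)) {
      System.out.println("Links: " + link);
    }
  }
}
```

This avoids walking the child nodes by hand, at the cost of also picking up text from any nested elements, which is fine here because videoURL contains only the CDATA section and whitespace.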


Source: Reading CDATA XML in Java
Tags: java xml parsing