Get all the tags and values from XML using SAX par

Posted 2019-08-17 06:52

Question:

I am trying to parse XML using SAX. I want to get all the tags and their values from the XML in a nested way. Is that possible with a SAX parser? Can anyone provide an example? (I think SAX is more efficient than the W3C DocumentBuilder, so I chose it, and I want to know whether I'm on the right path.) I'm attaching my Java program:

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class MySAXApp extends DefaultHandler
{
    public MySAXApp ()
    {
        super();
    }
    public void startDocument ()
    {
        System.out.println("Start document");
    }
    public void endDocument ()
    {
        System.out.println("End document");
    }


    public void startElement (String uri, String name,
            String qName, Attributes atts)
    {
        System.out.println(atts.getLength());
        if ("".equals (uri))
            System.out.println("Start element: " + qName);
        else
            System.out.println("Start element: {" + uri + "}" + name);
    }

}

Here is my XML. Is this valid XML? Are there any errors in writing XML like this?

<?xml version="1.0" encoding="utf-8"?>
<CustomerReport xsi:schemaLocation="Customer.xsd">
    <Customer>
        <CustomerName>str1234</CustomerName>
        <CustomerStatus>str1234</CustomerStatus>
        <PurchaceOrders>
            <PurchaceOrder>
                <PurchaceOrderName>str1234</PurchaceOrderName>
            </PurchaceOrder>
        </PurchaceOrders>
    </Customer>
</CustomerReport>

I'm new to XML. Can someone help me with this?

Answer 1:

When you say SAX is "more efficient", what you actually mean is that a SAX parser does the minimum amount of work, leaving most of the work to the application. That means you (the application writer) have more code to write, and it's quite tricky code as you are discovering. Because the people who write XML parsers are much more experienced Java coders than you are, it's likely that the more work you do in your code, and the less you do within the parser, the less efficient your overall application will be. So given your level of experience, my advice would be to use a parsing approach where the library does as much as possible of the work. I would suggest using JDOM2.



Answer 2:

The only attribute in the XML you posted is xsi:schemaLocation on the root element. For every other element, atts.getLength() should return 0.

Attributes are key-value pairs inside a tag. Most of your XML content is inside elements, not attributes.
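
To make the distinction concrete, here is a minimal sketch that records each element name together with its attribute count and attribute values. The class name AttributeLister and its list helper are made up for illustration. It assumes the default JAXP parser, which is not namespace-aware, so the undeclared xsi prefix is passed through as part of the attribute's qualified name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists every element with its attributes.
class AttributeLister extends DefaultHandler {
    final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // One line per element: name, attribute count, then each name=value pair.
        StringBuilder line = new StringBuilder(qName)
                .append(" attributes=").append(atts.getLength());
        for (int i = 0; i < atts.getLength(); i++) {
            line.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
        }
        lines.add(line.toString());
    }

    // Parses the given XML text and returns the collected lines.
    static List<String> list(String xml) {
        try {
            AttributeLister handler = new AttributeLister();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Run against the XML in the question, only the CustomerReport line should show an attribute; every other element reports attributes=0, and its text arrives through characters() instead.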

The efficiency advantage of SAX (or StAX) over something like JDOM is that the SAX parser does not keep all the data it reads in memory. If you use the ContentHandler to retrieve data and save only what you need as it is read, your program doesn't have to consume much memory.

Read this tutorial or this JavaWorld article. You need to implement a characters method in order to get any element text. Both linked articles have good examples of how to implement your characters method so that you can retrieve element text.

There are a lot of bad examples of this that you are likely to find if you google around or search on Stack Overflow, but the example implementations in the linked articles are correct, because they buffer the output from the characters method until all characters have been found:

Parsers are not required to return any particular number of characters at one time. A parser can return anything from a single character at a time up to several thousand and still be a standard-conforming implementation. So if your application needs to process the characters it sees, it is wise to have the characters() method accumulate the characters in a java.lang.StringBuffer and operate on them only when you are sure that all of them have been found.
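
Applied to the original question (all tags and values, nested), the buffering advice could look roughly like the sketch below. This is an illustration rather than code from either article: the class name NestedPrinter, the render helper, and the "= value" output format are all invented here. Text is accumulated in a StringBuilder and only emitted when the next start or end tag shows that the run of characters is complete.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: renders element names and text, indented by depth.
class NestedPrinter extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private final StringBuilder text = new StringBuilder();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();                      // emit any text seen before this child
        indent(depth).append(qName).append('\n');
        depth++;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one text run, so only accumulate here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();                      // the element's text is now complete
        depth--;
    }

    // Emits the buffered text (if it is not just whitespace) and clears the buffer.
    private void flushText() {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            indent(depth).append("= ").append(value).append('\n');
        }
        text.setLength(0);
    }

    private StringBuilder indent(int levels) {
        for (int i = 0; i < levels; i++) {
            out.append("  ");
        }
        return out;
    }

    // Parses the given XML text and returns the indented rendering.
    static String render(String xml) {
        try {
            NestedPrinter handler = new NestedPrinter();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Because the handler only keeps the current text run and a depth counter, memory use stays small regardless of document size. Whitespace between tags is trimmed away, which is a simplification that would need revisiting for documents where whitespace is significant.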

Here is the ContentHandler from the JavaWorld article's hello world example changed to use your xml:

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
public class Example2 extends DefaultHandler {
   // Local variables to store data
   // found in the XML document
   public String name = "";
   public String status = "";
   public String orderName = "";
   // Buffer for collecting data from
   // the "characters" SAX event.
   private CharArrayWriter contents = new CharArrayWriter();
   // Override methods of the DefaultHandler class
   // to gain notification of SAX Events.
   //
   // See org.xml.sax.ContentHandler for all available events.
   //
   @Override
   public void startElement( String namespaceURI,
              String localName,
              String qName,
              Attributes attr ) throws SAXException {
      // A new element starts: discard whatever text was buffered before it
      contents.reset();
   }
   @Override
   public void endElement( String namespaceURI,
              String localName,
              String qName ) throws SAXException {
      // Match on qName: localName can be empty when the parser is not
      // namespace-aware, and a namespace-aware parser would reject this
      // XML because the xsi prefix is never declared.
      if ( qName.equals( "CustomerName" ) ) {
         name = contents.toString();
      }
      if ( qName.equals( "CustomerStatus" ) ) {
         status = contents.toString();
      }
      if ( qName.equals( "PurchaceOrderName" ) ) {
         orderName = contents.toString();
      }
   }
   @Override
   public void characters( char[] ch, int start, int length )
                  throws SAXException {
      // May be called more than once per element, so only append here
      contents.write( ch, start, length );
   }
}
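
A handler like this does nothing until it is handed to a parser. The following is one possible way to drive it, sketched with JAXP's SAXParserFactory. To keep the block compilable on its own it uses a cut-down stand-in handler (OrderNameHandler, a hypothetical name) that captures only PurchaceOrderName; the OrderNameDemo class and readOrderName helper are likewise invented for illustration. A parser obtained this way is not namespace-aware unless setNamespaceAware(true) is called, and in that mode localName may be empty, so the stand-in matches on qName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class OrderNameDemo {
    // Hypothetical stand-in for the handler above, reduced to a single field.
    static class OrderNameHandler extends DefaultHandler {
        String orderName = "";
        private final StringBuilder contents = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            contents.setLength(0);        // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            contents.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("PurchaceOrderName".equals(qName)) {
                orderName = contents.toString();
            }
        }
    }

    // Parses the XML text with a default (non-namespace-aware) SAX parser.
    static String readOrderName(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            OrderNameHandler handler = new OrderNameHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.orderName;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<CustomerReport xsi:schemaLocation=\"Customer.xsd\">"
                + "<Customer><PurchaceOrders><PurchaceOrder>"
                + "<PurchaceOrderName>str1234</PurchaceOrderName>"
                + "</PurchaceOrder></PurchaceOrders></Customer></CustomerReport>";
        System.out.println(readOrderName(xml));
    }
}
```

To read from a file instead of a string, SAXParser also has a parse overload that takes a java.io.File together with the handler.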


Tags: java xml sax