I want to SAX-parse with Nokogiri, but when I try to parse an XML element that has a long, unusual element name or an attribute on it, everything goes wrong.
For instance, if I want to parse this XML file and grab all the title elements, how do I do that with Nokogiri's SAX parser?
<titles>
<title xml:lang="sv">Arkivvetenskap</title>
<title xml:lang="en">Archival science</title>
</titles>
In your example, title is the name of the element and xml:lang="sv" is an attribute on it. A SAX handler for this document can stay simple if it assumes there are no elements nested inside the title elements.
SAX parsing is usually more complex than the task needs. Because of that, I recommend Nokogiri's standard in-memory parser or, if you really need speed and memory efficiency, Nokogiri's Reader parser.
For comparison, here is a standard Nokogiri parser for the same document
And here is a reader parser for the same document