I would like to know the best way to parse an XML file from a shell script.
- Should one do it by hand?
- Does a third-party library exist?
If you have already done this, could you let me know how you managed it?
Here's a full working example.
If you only need to extract email addresses, you could do something like this:
1) Suppose the XML file spam.xml looks like this:
2) You can get the emails and process them with this short bash code:
The result of this example is:
Important note:
Don't use this for anything serious. This is fine for playing around, getting quick results, learning grep, and so on, but for production you should definitely find, learn, and use a real XML parser (see Micha's comment below).
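As a rough sketch of the grep approach described above: the contents of spam.xml here are invented for illustration (the `<entry>`, `<name>` and `<email>` element names are assumptions, not taken from the original listing), and it only works while each address sits on one line inside a plain `<email>` element.

```shell
# Hypothetical sample file; element names are assumptions for illustration.
cat > spam.xml <<'EOF'
<?xml version="1.0"?>
<spam>
  <entry><name>Alice</name><email>alice@example.com</email></entry>
  <entry><name>Bob</name><email>bob@example.org</email></entry>
</spam>
EOF

# grep -o prints only the matching part of each line;
# sed then strips the surrounding <email> tags.
emails=$(grep -o '<email>[^<]*</email>' spam.xml \
  | sed -e 's/<email>//' -e 's/<\/email>//')

# Process the addresses one per line.
printf '%s\n' "$emails" | while IFS= read -r address; do
  echo "Found: $address"
done
```

This breaks as soon as the XML uses attributes on the element, CDATA sections, entities, or line breaks inside the tag, which is exactly why the note above recommends a real parser.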
Try using xpath. You can use it to pull elements out of an XML tree.
http://www.ibm.com/developerworks/xml/library/x-tipclp/index.html
You could try xmllint
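It lets you select elements in the XML document by XPath. Recent versions provide the --xpath option for this; the --pattern option accepts only a subset of XPath and is meant for use with the streaming reader.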
On Mac OS X (Yosemite), it is installed by default.
On Ubuntu, if it is not already installed, you can run
apt-get install libxml2-utils
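A minimal sketch of querying with xmllint, assuming a version recent enough to support `--xpath`. The sample spam.xml and its element names are made up for this example.

```shell
# Hypothetical sample file; element names are assumptions for illustration.
cat > spam.xml <<'EOF'
<?xml version="1.0"?>
<spam>
  <entry><name>Alice</name><email>alice@example.com</email></entry>
  <entry><name>Bob</name><email>bob@example.org</email></entry>
</spam>
EOF

if command -v xmllint >/dev/null 2>&1; then
  # string(...) yields the text content of the first matching node.
  first_email=$(xmllint --xpath 'string(//entry[1]/email)' spam.xml)
  echo "First address: $first_email"

  # count(...) evaluates to a number.
  total=$(xmllint --xpath 'count(//email)' spam.xml)
  echo "Number of addresses: $total"
else
  echo "xmllint not found; install libxml2-utils first" >&2
fi
```

Wrapping the expression in `string(...)` or `count(...)` returns plain text; a bare node query such as `//email` prints the matching elements with their tags.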
A rather new project is the xml-coreutils package featuring xml-cat, xml-cp, xml-cut, xml-grep, ...
http://xml-coreutils.sourceforge.net/contents.html
There's also xmlstarlet (which is available for Windows as well).
http://xmlstar.sourceforge.net/doc/xmlstarlet.txt
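For what it's worth, here is a small sketch of the `sel` subcommand, assuming the binary is installed as `xmlstarlet` (some distributions install it as `xml` instead). The sample file and its element names are invented for illustration.

```shell
# Hypothetical sample file; element names are assumptions for illustration.
cat > spam.xml <<'EOF'
<?xml version="1.0"?>
<spam>
  <entry><name>Alice</name><email>alice@example.com</email></entry>
  <entry><name>Bob</name><email>bob@example.org</email></entry>
</spam>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
  # sel -t starts a template; -m iterates over nodes matching an XPath;
  # -v prints the value of an expression relative to the current node;
  # -o prints a literal string; -n emits a newline.
  pairs=$(xmlstarlet sel -t -m '//entry' -v 'name' -o ': ' -v 'email' -n spam.xml)
  printf '%s\n' "$pairs"
else
  echo "xmlstarlet not found" >&2
fi
```

The template options are applied in order for every node matched by `-m`, so each `<entry>` becomes one output line.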
Here's a function which will convert XML name-value pairs and attributes into bash variables.
http://www.humbug.in/2010/parse-simple-xml-files-using-bash-extract-name-value-pairs-and-attributes/
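To give an idea of the technique, here is a much simpler sketch. It is not the function from the link above: `xml_to_vars` and config.xml are hypothetical names, it handles only one `<name>value</name>` pair per line, and it ignores attributes entirely.

```shell
# Hypothetical sample file for this sketch.
cat > config.xml <<'EOF'
<config>
  <host>example.com</host>
  <port>8080</port>
</config>
EOF

# Hypothetical helper: prints one shell assignment per simple element,
# e.g. host='example.com'. Values containing a single quote or nested
# tags are skipped, so the output is safe to pass to eval.
xml_to_vars() {
  sed -n "s/^[[:space:]]*<\([A-Za-z_][A-Za-z0-9_]*\)>\([^<']*\)<\/\1>[[:space:]]*\$/\1='\2'/p" "$1"
}

# Turn the generated assignments into variables of the current shell.
eval "$(xml_to_vars config.xml)"
echo "$host:$port"
```

Since this relies on `eval`, only run it on files you trust, and keep in mind that element names can overwrite existing shell variables.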