How to parse XML using a shell script? [duplicate]

This question already has an answer here:

How to parse XML in Bash? 15 answers

I would like to know the best way to parse an XML file from a shell script.

Should one do it by hand?
Does a third-party library exist?

If you have already done this, could you let me know how you managed it?

Tags: linux bash shell

11 answers

一夜七次

#2 · 2019-01-03 14:07

Here's a full working example.
If you only need to extract email addresses, you could do something like this:
1) Suppose the XML file spam.xml looks like this:

<spam>
<victims>
  <victim>
    <name>The Pope</name>
    <email>pope@vatican.gob.va</email>
    <is_satan>0</is_satan>
  </victim>
  <victim>
    <name>George Bush</name>
    <email>father@nwo.com</email>
    <is_satan>1</is_satan>
  </victim>
  <victim>
    <name>George Bush Jr</name>
    <email>son@nwo.com</email>
    <is_satan>0</is_satan>
  </victim>
</victims>
</spam>

2) You can get the emails and process them with this short bash code:

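#!/bin/bash
# Read every <email> value into an array, one element per matched line.
# Needs bash 4+ for mapfile and a grep with -P (PCRE) support, e.g. GNU grep.
mapfile -t emails < <(grep -oP '(?<=email>)[^<]+' "/my_path/spam.xml")

for i in "${!emails[@]}"
do
  echo "$i" "${emails[$i]}"
  # instead of echo use the values to send emails, etc
done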

The result of this example is:

0 pope@vatican.gob.va
1 father@nwo.com
2 son@nwo.com

Important note:
Don't use this for anything serious. It is fine for playing around, getting quick results, learning grep, etc., but the regex only matches text and knows nothing about XML structure (comments, CDATA, attributes, nesting), so for production you should definitely find, learn, and use a real XML parser (see Micha's comment below).
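As one possible parser-based equivalent of the loop above, here is a minimal sketch. It assumes that xmllint (from libxml2) with the --xpath option is installed, which the answer above does not state; the temporary file is only a stand-in for /my_path/spam.xml so the script is self-contained.

```shell
#!/bin/bash
# Sketch: collect the <email> values with an XML parser instead of grep.
# Assumption: xmllint (libxml2) with --xpath support is available.

xml_file=$(mktemp)   # stand-in for /my_path/spam.xml
cat > "$xml_file" <<'EOF'
<spam>
<victims>
  <victim>
    <name>The Pope</name>
    <email>pope@vatican.gob.va</email>
    <is_satan>0</is_satan>
  </victim>
  <victim>
    <name>George Bush</name>
    <email>father@nwo.com</email>
    <is_satan>1</is_satan>
  </victim>
  <victim>
    <name>George Bush Jr</name>
    <email>son@nwo.com</email>
    <is_satan>0</is_satan>
  </victim>
</victims>
</spam>
EOF

emails=()
if command -v xmllint >/dev/null 2>&1; then
  # Ask the parser how many <victim> elements exist, then read each
  # email by position (XPath positions start at 1).
  count=$(xmllint --xpath 'count(/spam/victims/victim)' "$xml_file")
  for ((i = 1; i <= count; i++)); do
    emails+=("$(xmllint --xpath "string(/spam/victims/victim[$i]/email)" "$xml_file")")
  done
else
  echo "xmllint not found, skipping" >&2
fi

for i in "${!emails[@]}"; do
  echo "$i" "${emails[$i]}"
done

rm -f "$xml_file"
```

Querying one element at a time with string() is slower than a single grep, but it keeps each value separate regardless of how the installed xmllint version prints multi-node results.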


放我归山

#3 · 2019-01-03 14:08

Try using XPath. You can use it to select elements from an XML tree.

http://www.ibm.com/developerworks/xml/library/x-tipclp/index.html
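To give a feel for what XPath expressions look like from the shell, here is a small sketch. The answer does not name a tool, so using xmllint --xpath as the evaluator is an assumption, xpath_string is a hypothetical helper, and the XML is a shortened copy of the spam.xml sample from the earlier answer.

```shell
#!/bin/bash
# Sketch: a few XPath expressions evaluated from a shell script.
# Assumptions: xmllint (libxml2) with --xpath is the evaluator;
# xpath_string is a hypothetical helper, not part of any tool.

xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<spam>
<victims>
  <victim>
    <name>The Pope</name>
    <email>pope@vatican.gob.va</email>
    <is_satan>0</is_satan>
  </victim>
  <victim>
    <name>George Bush</name>
    <email>father@nwo.com</email>
    <is_satan>1</is_satan>
  </victim>
</victims>
</spam>
EOF

# Print the string value of XPath expression $1 evaluated against file $2.
xpath_string() {
  xmllint --xpath "string($1)" "$2"
}

if command -v xmllint >/dev/null 2>&1; then
  # Positional step: the name of the first victim
  first_name=$(xpath_string '/spam/victims/victim[1]/name' "$xml_file")
  # Predicate on a child element: the email of the victim with is_satan = 1
  flagged_email=$(xpath_string '/spam/victims/victim[is_satan=1]/email' "$xml_file")
  # XPath function: the number of <victim> elements
  total=$(xmllint --xpath 'count(/spam/victims/victim)' "$xml_file")
  echo "$first_name | $flagged_email | $total"
else
  echo "xmllint not found, skipping" >&2
fi

rm -f "$xml_file"
```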


Rolldiameter

#4 · 2019-01-03 14:12

You could try xmllint

The xmllint program parses one or more XML files, specified on the command line as xmlfile. It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.

It lets you select elements in the XML document with an XPath expression, using the --xpath option. (The --pattern option accepts only a subset of XPath and is mainly intended for debugging.)

On Mac OS X (Yosemite), it is installed by default.
On Ubuntu, if it is not already installed, you can run apt-get install libxml2-utils
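Since xmllint is described above as useful for detecting errors in XML code, a script can rely on its exit status to check a file before processing it. This is a minimal sketch: is_well_formed is a hypothetical helper name, and the two temporary files are made-up inputs for illustration.

```shell
#!/bin/bash
# Sketch: use xmllint's exit status to test whether a file is well-formed XML.
# is_well_formed is a hypothetical helper; the sample files are invented.

# --noout suppresses the re-serialized document, so only the exit status matters.
is_well_formed() {
  xmllint --noout "$1" 2>/dev/null
}

good=$(mktemp)
bad=$(mktemp)
printf '<spam><victim/></spam>\n' > "$good"
printf '<spam><victim></spam>\n' > "$bad"   # <victim> is never closed

if command -v xmllint >/dev/null 2>&1; then
  for f in "$good" "$bad"; do
    if is_well_formed "$f"; then
      echo "well-formed"
    else
      echo "malformed"
    fi
  done
else
  echo "xmllint not found; on Ubuntu try: apt-get install libxml2-utils" >&2
fi

rm -f "$good" "$bad"
```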
