Extract value from xml in shell

Posted 2019-07-26 21:21

Question:

I have the following XML structure:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DOC SYSTEM "ts.dtd">
<?xml-stylesheet type="text/css" href="ts.css"?>
<DOC LOCALE="en-US"> 
   <PTXT ID="some.first.id" CONTEXT="">Some text 1</PTXT> 
   <PTXT ID="some.second.id" CONTEXT="">Some text 2</PTXT> 
</DOC>

Now my challenge is to loop over every PTXT tag and do something with its ID and inner text. For the sake of example, let's say I need to echo something like

some.first.id Some text 1
some.second.id Some text 2

How can I have that in a shell script?

Answer 1:

A complete solution using the xmlstarlet tool:

xmlstarlet sel -t -m "//PTXT" -v "concat(./@ID,' ',text())" -n input.xml 2>/dev/null

The output:

some.first.id Some text 1
some.second.id Some text 2
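
If each entry needs real processing rather than just printing, one possible approach is to have xmlstarlet emit the ID and the text separated by a tab and read the pairs in a shell loop. The following is a minimal sketch, not part of the original answer: the function names handle_ptxt and process_ptxt_lines are hypothetical, and it assumes that neither the IDs nor the texts contain tabs or newlines and that no ID is empty.

```shell
#!/bin/sh
# A literal tab character, used as the field separator between ID and text.
tab=$(printf '\t')

# Hypothetical per-entry handler: $1 is the ID attribute, $2 the inner text.
# Replace the printf with whatever work each entry actually needs.
handle_ptxt() {
    printf '%s %s\n' "$1" "$2"
}

# Read "ID<TAB>text" lines from stdin and call handle_ptxt once per line.
process_ptxt_lines() {
    while IFS="$tab" read -r id text; do
        handle_ptxt "$id" "$text"
    done
}

# Assumption: input.xml is the file from the question. The pipeline only
# runs when xmlstarlet is installed and the file exists.
if command -v xmlstarlet >/dev/null 2>&1 && [ -f input.xml ]; then
    xmlstarlet sel -t -m "//PTXT" -v "@ID" -o "$tab" -v "." -n input.xml 2>/dev/null |
        process_ptxt_lines
fi
```

With the sample file from the question, this should print the same two lines as above. Because the loop sits on the right-hand side of a pipe, it runs in a subshell in most shells, so variables set inside it are not visible after the loop ends.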