Outputting multiple files using XPath in bash

Posted 2019-08-09 21:42

Question:

I have a directory of XML files. Each file has its own unique identifier. Each file also contains one or more references to other files (in a separate directory), which also have unique IDs.

For example, I have a file named example01.xml:

<file>
    <fileId>xyz123</fileId>
    <fileContents>Blah blah Blah</fileContents>
    <relatedFiles>
        <otherFile href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&amp;id=1234'>
            <title>Some resource</title>
        </otherFile>
        <otherFile href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&amp;id=4321'>
            <title>Some other resource</title>
        </otherFile>
    </relatedFiles>
</file>

If a file has multiple relatedFiles/otherFile elements, I need to create a copy of the file for each @href and rename it, concatenating the unique ID from the @href with the value of fileId. So, for example, I need to create two copies of example01.xml, one named 1234_xyz123.xml and another named 4321_xyz123.xml. This should scale up to create as many copies as there are otherFile elements.
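In other words, for a single href the target name would be built roughly like this (a minimal sketch using the values from the example above, and assuming the unique ID is always whatever follows "id=" in the query string):

href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&id=1234'
fileId='xyz123'
uid="${href##*id=}"           # strip everything up to and including "id=" -> 1234
echo "${uid}_${fileId}.xml"   # -> 1234_xyz123.xml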

Right now I have a bash script that does this when there is only a single otherFile element (the string() XPath function only returns the first match), but my scripting skills are limited and I am having trouble figuring out how to process multiple otherFile elements.

#!/bin/bash
for f in *.xml; do
    # grab the (first) otherFile href and build the new name from it
    name=$(xpath -e 'string(//otherFile/@href)' "$f" 2>/dev/null)
    echo "Moving $f to ${name:3}.xml"
    echo "$name"
    mv "$f" "${name:3}.xml"
done

Thanks in advance.

Answer 1:

Something like this might work:

#!/bin/bash

for f in *.xml; do
  # the file's own ID, e.g. xyz123
  fid=$(xpath -e '//fileId/text()' "$f" 2>/dev/null)
  # list every otherFile href, strip the quotes, and keep the last '='-separated
  # field, which is the id at the end of the query string (e.g. 1234)
  for uid in $(xpath -e '//otherFile/@href' "$f" 2>/dev/null | awk -F= '{gsub(/"/,"",$0); print $NF}'); do
    echo "Moving $f to ${uid}_${fid}.xml"
    cp "$f" "${uid}_${fid}.xml"
  done
  # remove the original once the copies have been made
  rm "$f"
done
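
If xmlstarlet happens to be installed (not assumed above), each matching href can be printed on its own line, which avoids counting '='-separated fields entirely. A rough sketch:

#!/bin/bash
# Sketch using xmlstarlet (assumes it is available): -m matches each otherFile
# node and -v prints its href value, one per line.
for f in *.xml; do
  fid=$(xmlstarlet sel -t -v '//fileId' -n "$f")
  xmlstarlet sel -t -m '//otherFile' -v '@href' -n "$f" | while read -r href; do
    uid="${href##*id=}"    # everything after the last "id=" in the URL
    echo "Copying $f to ${uid}_${fid}.xml"
    cp "$f" "${uid}_${fid}.xml"
  done
done

The ${href##*id=} expansion keeps working even if more query parameters are added before id=, since it always takes whatever follows the last "id=".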