I have a directory of XML files. Each file has its own unique identifier. Each file also contains one or more references to other files (in a separate directory), which also have unique IDs.
For example, I have a file named example01.xml:
<file>
  <fileId>xyz123</fileId>
  <fileContents>Blah blah Blah</fileContents>
  <relatedFiles>
    <otherFile href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&id=1234'>
      <title>Some resource</title>
    </otherFile>
    <otherFile href='http://sub.domain.abc.edu/directory/index.php?p=collections/pageview&id=4321'>
      <title>Some other resource</title>
    </otherFile>
  </relatedFiles>
</file>
If a file has multiple relatedFiles/otherFile elements, I need to create a copy of the file for each @href and rename it, concatenating the value of the unique ID in @href with the value of fileId. So, for example, I need to create two copies of example01.xml, one named abc01_xyz123.xml and another named abc0002_xyz123.xml. This should scale up to create as many copies as there are otherFile elements.
Right now, I have a bash script that does this if there is only a single otherFile element, but my scripting skills are limited and I am having trouble figuring out how to process multiple otherFile elements.
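#!/bin/bash
# Works only when a file has a single otherFile element.
for f in *.xml; do
    # string() returns the @href of the first otherFile only
    name=$(xpath -e 'string(//otherFile/@href)' "$f" 2> /dev/null)
    echo "Moving $f to ${name:3}.xml"
    echo "$name"
    mv "$f" "${name:3}.xml"
done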
Thanks in advance.
Something like this might work:
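As a minimal sketch of the idea, the loop below makes one copy per otherFile element instead of renaming the file once. It rests on several assumptions that are not in the question: it pulls the values out with grep and sed rather than a real XML parser, so each href and the fileId must sit on a single line; the related ID is taken to be the value of the id= query parameter; copies are named <id>_<fileId>.xml; and they are written to a hypothetical copies/ directory so that a second run does not pick them up again. The function name copy_per_href is made up for illustration.

```shell
#!/bin/bash
# Sketch: copy one XML file once per otherFile/@href it contains.
# Assumptions: text-based extraction (no XML parser), one href per line,
# related ID = value of the "id=" query parameter in the href.

copy_per_href() {
    local src=$1 outdir=$2
    local file_id ref_id

    # Text of the first <fileId> element in the file.
    file_id=$(sed -n 's:.*<fileId>\([^<]*\)</fileId>.*:\1:p' "$src" | head -n 1)
    [ -n "$file_id" ] || return 0   # nothing to name the copies after

    # Emit every otherFile href, reduce each to its id= value,
    # then make one copy per value.
    grep -oE "<otherFile[^>]*href=['\"][^'\"]*['\"]" "$src" |
        sed -n 's/.*[?&;]id=\([A-Za-z0-9]\{1,\}\).*/\1/p' |
        while IFS= read -r ref_id; do
            cp -- "$src" "$outdir/${ref_id}_${file_id}.xml"
        done
}

for f in *.xml; do
    [ -e "$f" ] || continue   # no .xml files: the glob stays literal
    mkdir -p copies           # hypothetical output directory
    copy_per_href "$f" copies
done
```

With the example file above, this should leave example01.xml in place and create copies/1234_xyz123.xml and copies/4321_xyz123.xml. If your real IDs come from a different part of the href, only the sed expression needs to change; if the files are well-formed XML, a parser-based tool such as the xpath command you already use is the more robust way to list the href values.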