I have already extracted the tag from the source document using grep, but now I can't figure out how to easily extract the attributes from the string. I also want to avoid using any programs that would not usually be present on a standard installation.
tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'"'"'t we all." alt="Barrel - Part 1" />'
I need to end up with the following variables:
src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg"
title="Don't we all."
alt="Barrel - Part 1"
You can use xmlstarlet. Then, you don't even have to extract the element yourself:
$ echo "$tag" | xmlstarlet sel -t --value-of '//img/@src'
http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg
You can even turn this into a function:
$ get_attribute() {
    # $1 is the XML text, $2 the XPath expression; the value is printed in double quotes.
    echo "$1" | xmlstarlet sel -t -o '"' -v "$2" -o '"'
}
$ src=$(get_attribute "$tag" '//img/@src')
If you don't want to reparse the document several times, you can also do:
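$ get_values() {
    # The last argument is the file; every other argument is a var=xpath pair.
    local file=${!#} arg
    local -a cmd=(xmlstarlet sel)
    for arg in "${@:1:$#-1}"
    do
        if [ -n "$arg" ]
        then
            # Emit one line of the form var="value" per pair.
            cmd+=(-t -o "${arg%%=*}=\"" -v "${arg#*=}" -o '"' -n)
        fi
    done
    "${cmd[@]}" "$file"
}
# Note: eval trusts the document; a value containing $, ` or " is interpreted by the shell.
$ eval "$(get_values src='//img/@src' title='//img/@title' your_file.xml)"
$ echo "$src"
http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg
$ echo "$title"
Don't we all.
The file name is read from the last argument with ${!#}, and "${@:1:$#-1}" loops over every argument before it, so the last argument never has to be removed.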
I went with dacracot's suggestion of using sed, although I would have preferred it if he had given me some sample code:
src=$(echo "$tag" | sed 's/.*src="\([^"]*\)" title="\([^"]*\)" alt="\([^"]*\)".*/\1/')
title=$(echo "$tag" | sed 's/.*src="\([^"]*\)" title="\([^"]*\)" alt="\([^"]*\)".*/\2/')
alt=$(echo "$tag" | sed 's/.*src="\([^"]*\)" title="\([^"]*\)" alt="\([^"]*\)".*/\3/')
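The sed expressions above depend on the attributes appearing in the order src, title, alt. If that order is not guaranteed, a similar extraction can be done with bash's built-in regex matching and no external program. This is only a minimal sketch, not what I actually used: `get_attr` is a hypothetical helper name, and it assumes double-quoted attribute values that contain no double quotes themselves.

```shell
#!/usr/bin/env bash
tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'"'"'t we all." alt="Barrel - Part 1" />'

# get_attr TAG NAME: print the value of attribute NAME found in TAG.
# Assumes NAME is a plain word and the value is double-quoted without embedded '"'.
get_attr() {
    local re="[[:space:]]$2=\"([^\"]*)\""
    [[ $1 =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

src=$(get_attr "$tag" src)
title=$(get_attr "$tag" title)
alt=$(get_attr "$tag" alt)
printf '%s\n' "src: $src" "title: $title" "alt: $alt"
```

Each attribute is looked up independently, so the order in the tag does not matter; `get_attr` returns a non-zero status when the attribute is missing.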
If xmlstarlet is available on your installation and the order of the attributes (src, title, alt) does not change, you can use the following code as well:
tag='<img src="http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg" title="Don'"'"'t we all." alt="Barrel - Part 1" />'
xmlstarlet sel -T -t -m "/img" -m "@*" -v '.' -n <<< "$tag"
# Read one attribute value per line into an array without changing IFS (mapfile needs bash 4+).
mapfile -t array < <(xmlstarlet sel -T -t -m "/img" -m "@*" -v '.' -n <<< "$tag")
src="${array[0]}"
title="${array[1]}"
alt="${array[2]}"
printf "%s\n" "src: $src" "title: $title" "alt: $alt"
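If the attribute order might change, one variant is to print each attribute as a name=value line and collect the lines in an associative array. The following is a sketch under assumptions: `read_attrs` and `attrs` are hypothetical names, bash 4+ is assumed, and printf stands in for the xmlstarlet call shown in the comment so that the example runs without xmlstarlet.

```shell
#!/usr/bin/env bash
declare -A attrs

# read_attrs: read name=value lines from stdin into the global array attrs.
read_attrs() {
    local line
    while IFS= read -r line; do
        [[ $line == *=* ]] || continue       # skip lines without '='
        attrs["${line%%=*}"]=${line#*=}      # split at the first '='
    done
}

# With xmlstarlet, the input would come from something like:
#   xmlstarlet sel -T -t -m '/img/@*' -v 'name()' -o '=' -v '.' -n <<< "$tag"
read_attrs < <(printf '%s\n' \
    'src=http://imgs.xkcd.com/comics/barrel_cropped_(1).jpg' \
    "title=Don't we all." \
    'alt=Barrel - Part 1')

printf '%s\n' "src: ${attrs[src]}" "title: ${attrs[title]}" "alt: ${attrs[alt]}"
```

Process substitution is used instead of a pipe so that `read_attrs` runs in the current shell and the array survives the call.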
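Since this bubbled up again: there is now my Xidel, which has two features that make this task trivial. It can match the input against a template pattern, and it can print the captured variables as bash assignments.
So it becomes a single line: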
eval "$(xidel "$tag" -e '<img src="{$src}" title="{$title}" alt="{$alt}"/>' --output-format bash)"