This is quite an annoying, though seemingly simple, task. Following this guide, I wrote this:
#!/bin/bash
content=$(wget "https://example.com/" -O -)
ampersand=$(echo '\&')
xmllint --html --xpath '//*[@id="table"]/tbody' - <<<"$content" 2>/dev/null |
xmlstarlet sel -t \
-m "/tbody/tr/td" \
-o "https://example.com" \
-v "a//@href" \
-o "/?A=1" \
-o "$ampersand" \
-o "B=2" -n
Each link is extracted from the table and everything is concatenated correctly. However, instead of a plain & I get this at the end of each link:
https://example.com/hello-world/?A=1\&amp;B=2
But actually, I was looking for something like:
https://example.com/hello-world/?A=1&B=2
The idea was to escape the character with a backslash (\&) so that it gets ignored. Initially, I tried placing it directly in -o "\&" \ instead of -o "$ampersand" \ and removing ampersand=$(echo '\&'). Still the same result.
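To check what the shell actually passes along here, this is a minimal sketch (the variable names are made up for illustration, they are not part of my script). In bash, a backslash in front of & inside single or double quotes is kept as-is, so it reaches the program as an ordinary character rather than acting as an escape:

```shell
#!/bin/bash
# Hypothetical variable names, used only to inspect the quoting behaviour.
double_quoted="\&"   # inside double quotes, \ is only special before $ ` " \ and newline
single_quoted='\&'   # inside single quotes, nothing is special
plain='&'            # a quoted & needs no escaping at all

# printf '%s\n' prints each value exactly as the shell stored it.
printf '%s\n' "$double_quoted"
printf '%s\n' "$single_quoted"
printf '%s\n' "$plain"

# Length of the double-quoted value: backslash plus ampersand.
printf '%s\n' "${#double_quoted}"
```

So as far as the shell is concerned, a quoted & is already safe, and the backslash only adds a second character to the output.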
If I remove the backslash, the output is still not what I want:
https://example.com/hello-world/?A=1&amp;B=2
The only difference is that the \ before the & is gone.
Why?
I'm sure I am missing something basic.
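One possible explanation, offered as an assumption rather than a confirmed diagnosis: xmlstarlet sel produces XML output by default, and in XML a literal & has to be serialized as an entity. The sel command documents a -T (--text) option that switches the output to plain text. The sketch below tries that option on an inline sample; the sample markup and the variable names are invented for illustration and stand in for the real page.

```shell
#!/bin/bash
# Hypothetical stand-in for the <tbody> that xmllint would extract from the page.
sample='<tbody><tr><td><a href="/hello-world">Hello</a></td></tr></tbody>'

link=""
if command -v xmlstarlet >/dev/null 2>&1; then
  # -T asks sel for text output instead of the default XML output.
  # The ampersand is passed as a plain quoted "&", with no backslash.
  link=$(xmlstarlet sel -T -t \
    -m "/tbody/tr/td" \
    -o "https://example.com" \
    -v "a//@href" \
    -o "/?A=1" \
    -o "&" \
    -o "B=2" -n <<<"$sample")
  printf '%s\n' "$link"
else
  echo "xmlstarlet is not installed; nothing to demonstrate" >&2
fi
```

If this is indeed the cause, the same -T option could be added to the sel call in the script above without changing anything else.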