-->

xpath html get all columns 1 and 2 together and co

Posted 2019-09-21 15:38

Question:

I have the following command that gets the data from column 2:

Table example:

<table>
    <tr>
        <td>a</td>
        <td>b</td>
        <td>c</td>
        <td>d</td>
        <td>e</td>
    </tr>
    <tr>
        <td>1</td>
        <td>2</td>
        <td>3</td>
        <td>4</td>
        <td>5</td>
    </tr>
</table>



wget -q -O - http://www.example.com | xmllint --html --xpath "//table[@id=\"tableID\"]//tr//td[position() = 2]//text()" - 2>/dev/null

That outputs something like:

12345

How can I get both column 1 and column 2, joined by a ":" symbol on each line?

Desired output:

a:1
b:2
c:3
d:4
e:5

Thanks in advance!


I also asked a similar question, with example output and desired output, here: xpath html combine columns

Answer 1:

With xmlstarlet and awk:

wget -q -O - "http://www.example.com" | xmlstarlet sel -t -v "//tr/td" -n \
| awk -F'\n' -v RS= '{ n=NF/2; for(i=1;i<=n;i++) print $i ":" $(i+n) }'

The output:

a:1
b:2
c:3
d:4
e:5
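
The awk step can be tried on its own, without a network connection or xmlstarlet, by feeding it the cell values one per line. The sketch below assumes the values arrive in document order (all cells of the first row, then all cells of the second row) with no blank lines in between. The function name pair_halves is only an illustrative wrapper and is not part of the original answer. With RS set to the empty string, awk reads the whole input as one record, and -F'\n' makes every line a field, so field i is paired with field i + NF/2.

```shell
# pair_halves (hypothetical helper name): read one value per line, then
# print value i of the first half next to value i of the second half,
# separated by ":".
pair_halves() {
    awk -F'\n' -v RS= '{ n = NF / 2; for (i = 1; i <= n; i++) print $i ":" $(i + n) }'
}

# The ten cell values of the example table, one per line, in document order.
printf '%s\n' a b c d e 1 2 3 4 5 | pair_halves
```

Because the record is split in the middle, this approach fits a table with exactly two rows of equal length, as in the example above.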


Answer 2:

Resolved via: xpath html combine columns

Solution:

wget -q -O - "https://socks-proxy.net" \
| xmllint --html --xpath "//table[@id='proxylisttable']//tr//td[position() < 3]" - 2>/dev/null \
| tidy -cq -omit -f /dev/null | xmllint --html --xpath "//td/text()" - | paste - - -d':'

Many thanks to @RomanPerekhrest.
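
The final paste step in this pipeline can be checked in isolation. Here is a minimal sketch, assuming the preceding xmllint call emits one cell value per line in row order (column 1, then column 2, of each row). The sample values and the function name join_pairs are made up for illustration. Each "-" tells paste to read the next line from standard input, so two dashes consume two consecutive lines per output line.

```shell
# join_pairs (hypothetical helper name): join every two consecutive
# input lines with ":".
join_pairs() {
    paste -d ':' - -
}

# Made-up cell values: column 1 then column 2 of each row.
printf '%s\n' a 1 b 2 c 3 | join_pairs
```

Unlike the awk approach in Answer 1, this pairs neighbouring lines rather than the first and second halves of the input, so it suits tables where each row holds one pair.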