AWK based on row name and export the following 3 a

Posted 2019-09-16 02:07

I have tab delimited .txt files looking like this:

""
"5 um"
"Lipid droplet number"
"Lipid droplet diameter"
"Mito"
22
0
5 um
64 255 0 0
2
1615 2022
2037 2021
1
Lipid droplet number
64 255 0 0
1
583 1945
0
Lipid droplet diameter
64 255 0 0
2
1406 849
1364 882
0
Lipid droplet diameter
64 255 0 0
2
1105 1333
1082 1369
0
Lipid droplet diameter
64 255 0 0
2
619 1932
580 1953
0

I want to create new .txt files that contain only the two coordinate rows (the 3rd and 4th rows) following every row named "Lipid droplet diameter".

The result should look something like this (every row that does not hold diameter information should be dropped):

1406 849
1364 882
1105 1333
1082 1369
619 1932
580 1953  

Two columns by two rows per measurement is OK. Four columns in one row per measurement is also OK, and is probably the best layout for Excel.

Tags: awk grep
3 answers
Animai°情兽
#2 · 2019-09-16 02:21

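An ugly awk version using getline (the pattern is anchored at the start of the line so that the quoted "Lipid droplet diameter" header line near the top of the file is not matched):

awk '/^Lipid droplet diameter/ {getline;getline;getline;a=a?a" "$0:$0;getline;b=b?b" "$0:$0} END {print a"\n"b}' file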
1406 849 1105 1333 619 1932
1364 882 1082 1369 580 1953

A better version that remembers the line number of the match instead of calling getline:

awk '/^Lipid droplet diameter/ {f=NR} f && f+3==NR {a=a?a" "$0:$0} f && f+4==NR {b=b?b" "$0:$0} END {print a"\n"b}' file
1406 849 1105 1333 619 1932
1364 882 1082 1369 580 1953

Better table view:

awk '/^Lipid droplet diameter/ {f=NR} f && f+3==NR {a=a?a"\t"$0:$0} f && f+4==NR {b=b?b"\t"$0:$0} END {print "Column1\t\tColumn2\t\tColumn3\n" a"\n"b}' file
Column1         Column2         Column3
1406 849        1105 1333       619 1932
1364 882        1082 1369       580 1953
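The same line-number idea can also produce the "4 columns and 1 row" layout that the question mentions as best for Excel. This is only a sketch: the file names sample.txt and diameters.tsv are made up for illustration, and it assumes that each unquoted "Lipid droplet diameter" row is followed by two other rows and then exactly two coordinate rows, as in the sample data.

```shell
# Small sample in the format shown in the question (hypothetical file name).
cat > sample.txt <<'EOF'
"Lipid droplet diameter"
"Mito"
22
0
Lipid droplet number
64 255 0 0
1
583 1945
0
Lipid droplet diameter
64 255 0 0
2
1406 849
1364 882
0
Lipid droplet diameter
64 255 0 0
2
1105 1333
1082 1369
0
EOF

awk '
    BEGIN { OFS = "\t" }
    # Remember the line number of each unquoted name row.
    /^Lipid droplet diameter/ { start = NR }
    # Third line after the match: first endpoint.
    start && NR == start + 3 { x1 = $1; y1 = $2 }
    # Fourth line after the match: second endpoint, so emit one 4-column row.
    start && NR == start + 4 { print x1, y1, $1, $2 }
' sample.txt > diameters.tsv

cat diameters.tsv
```

Each output row holds both endpoints of one measurement, separated by tabs, so the file should open as four columns in a spreadsheet.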
我想做一个坏孩纸
#3 · 2019-09-16 02:24

Using sed:

sed -n '/^Lipid droplet diameter/{n;n;n;N;p}' input

Gives:

1406 849
1364 882
1105 1333
1082 1369
619 1932
580 1953
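For readers less used to sed, here is the same kind of command spread over several lines with comments. It is a sketch under a few assumptions: the file names sample.txt and coords.txt are made up, the address is anchored with ^ so the quoted header line is skipped, and the sed in use accepts comment lines inside a script (GNU sed does).

```shell
# Small sample in the format shown in the question (hypothetical file name).
cat > sample.txt <<'EOF'
"Lipid droplet diameter"
"Mito"
22
0
Lipid droplet diameter
64 255 0 0
2
1406 849
1364 882
0
Lipid droplet diameter
64 255 0 0
2
1105 1333
1082 1369
0
EOF

sed -n '/^Lipid droplet diameter/{
# Three n commands: step past the two rows after the name row
# and load the first coordinate row into the pattern space.
n
n
n
# N appends the second coordinate row to the pattern space.
N
# p prints both rows; -n suppresses all other output.
p
}' sample.txt > coords.txt

cat coords.txt
```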

Another method:

grep -A 4 '^Lipid droplet diameter' input | sed -n '/^--$/!p' |
    awk '(NR-1)%5>2 { print }'
Viruses.
#4 · 2019-09-16 02:43

For variable rows and columns, you could try this:

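BEGIN {
    OFS="\t"; SUBSEP="@"; MAXROWS=1000
}

# Exact match, so the quoted header line is not picked up.
/^Lipid droplet diameter$/ {
    cols++; rows=0
    # Skip ahead to the first two-field (coordinate) line; stop at end of input.
    while (NF != 2) {
        if ((getline) <= 0) exit
    }
    # Collect consecutive two-field lines as the rows of this column.
    while (NF == 2 && rows < MAXROWS) {
        vectors[cols, ++rows] = $0
        if ((getline) <= 0) break
    }
    # Track the longest column so that no rows are lost in END.
    if (rows > maxrows) maxrows = rows
}

END {
    for (c = 1; c <= cols; c++) printf("Column%i%c", c, c<cols ? OFS : "\n")
    for (r = 1; r <= maxrows; r++) {
        for (c = 1; c <= cols; c++) printf("%s%c", vectors[c, r], c<cols ? OFS : "\n")
    }
}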

Example assuming the above is saved as lipid.awk:

awk -f lipid.awk input