Relocating strings with awk/sed based on an index file

Posted 2019-09-06 14:12

I always appreciate the help from this website. I would like to relocate strings based on the index numbers in an index file.

Index numbers are shown in the first column of the index file (index.txt), and I would like to relocate each "path" based on its index number. Paths with the same index number are placed in the same row. For example, index 0 appears twice, so path_sparc_ifu_dec_in_3826 is placed on the first row and path_sparc_ifu_dec_in_4349 is placed next to it on the same row.

index.txt:

 0        path_sparc_ifu_dec_in_3826  str    DR     -         -
 0        path_sparc_ifu_dec_in_4349  stf    DR     -         -
 1        path_sparc_ifu_dec_in_2374  stf    DR     -         -
 1        path_sparc_ifu_dec_in_4011  stf    DR     -         -
 2        path_sparc_ifu_dec_in_3078  stf    DR     -         -

However, the strings are stored in another file (source.txt), where each "path" record consists of four lines (the path name followed by three more strings).

source.txt:

path_sparc_ifu_dec_in_3826
dtu_inst_d[14]
dec_fcl_rdsr_sel_pc_d
0.8664
path_sparc_ifu_dec_in_4349
dtu_inst_d[18]
dec_swl_rdsr_sel_thr_d
0.795429
path_sparc_ifu_dec_in_2374
dtu_inst_d[13]
dec_dcl_cctype_d[2]
0.938914
path_sparc_ifu_dec_in_4011
dtu_inst_d[13]
ifu_exu_useimm_d
0.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818

The desired output is:

path_sparc_ifu_dec_in_3826      path_sparc_ifu_dec_in_4349
dtu_inst_d[14]      dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_d       dec_swl_rdsr_sel_thr_d
0.8664  0.795429
path_sparc_ifu_dec_in_2374      path_sparc_ifu_dec_in_4011
dtu_inst_d[13]      dtu_inst_d[13]
dec_dcl_cctype_d[2]     ifu_exu_useimm_d
0.938914    0.843643
path_sparc_ifu_dec_in_3078  
dtu_inst_d[12]  
ifu_exu_shiftop_d[2]    
0.915818    

My idea is to (1) combine the two files first and (2) relocate the path info using the index number, but I don't know how to do this. sed/awk is probably the appropriate tool.

Any help is appreciated.

Best,

Jaeyoung

2 answers
爷的心禁止访问
#2 · 2019-09-06 15:11

Here is another script that works for me.

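awk '
NR==FNR         {T[$2] = $1                     # index.txt: map path name -> index number
                 MX = $1                        # last index seen (index.txt is assumed sorted)
                 next
                }
$1 in T         {IX = T[$1]                     # source.txt: a path name starts a new 4-line record
                }
                {K = (FNR+3)%4                  # position 0..3 of this line within its record
                 P[IX, K] = ((IX, K) in P) ? P[IX, K] "\t" $0 : $0   # join with tabs, no leading tab
                }
END             {for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j]
                }
' index.txt source.txt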
霸刀☆藐视天下
#3 · 2019-09-06 15:17

A one-line awk solution could be:

awk -F'\t' 'FNR==NR{ind[$2]=$1;next} { if($1 in ind) { l=4*ind[$1]} else {l=l+1}; text[l]=text[l]"\t"$1 } END { for (i = 0; i < length(text); i++) {print substr(text[i],2)} }' index.txt source.txt

Explanation:

-F'\t' 

This sets tab as the field separator, so index.txt is assumed to be tab-delimited.

FNR==NR

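This condition is true only while the first file (index.txt) is being read, which lets the two files be processed differently, one after the other.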
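
To illustrate the FNR==NR idiom in isolation, here is a minimal sketch. The files first.txt and second.txt are hypothetical and exist only for this demonstration; they are not part of the question.

```shell
# Hypothetical demo files (not from the question), created in a temp directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'c\nd\ne\n' > "$tmp/second.txt"

# NR counts lines across all input files; FNR restarts at 1 for every file.
# The two are equal only while the first file is being read.
result=$(awk '
FNR==NR { print "first:  NR=" NR " FNR=" FNR " " $0; next }
        { print "second: NR=" NR " FNR=" FNR " " $0 }
' "$tmp/first.txt" "$tmp/second.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that the idiom relies on the first file being non-empty: if it has no lines, FNR==NR is also true for the second file.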

{ind[$2]=$1;next}

Use the first file to build a lookup table from path name to index number; "next" then skips the remaining rules for that line.

if($1 in ind) { l=4*ind[$1]} else {l=l+1}

"l" is the (zero-based) line number in the output. If the string is a path name found in the index, the line number is index*4, because each record takes four lines. Otherwise it is the previous line number + 1.
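
The line-number rule can be traced with a small sketch that prints "l" next to each input string instead of building the output. It uses only the first three records from the question, with index.txt cut down to its first two columns and written tab-delimited; the temp directory and the variable name "trace" are assumptions made for this illustration.

```shell
tmp=$(mktemp -d)

# Trimmed, tab-delimited index.txt: index number, then path name.
printf '%s\t%s\n' \
    0 path_sparc_ifu_dec_in_3826 \
    0 path_sparc_ifu_dec_in_4349 \
    1 path_sparc_ifu_dec_in_2374 > "$tmp/index.txt"

cat > "$tmp/source.txt" <<'EOF'
path_sparc_ifu_dec_in_3826
dtu_inst_d[14]
dec_fcl_rdsr_sel_pc_d
0.8664
path_sparc_ifu_dec_in_4349
dtu_inst_d[18]
dec_swl_rdsr_sel_thr_d
0.795429
path_sparc_ifu_dec_in_2374
dtu_inst_d[13]
dec_dcl_cctype_d[2]
0.938914
EOF

# Same rule as in the answer, but print the computed output line number
# in front of every string rather than collecting the strings.
trace=$(awk -F'\t' '
FNR==NR { ind[$2] = $1; next }
{
    if ($1 in ind) { l = 4 * ind[$1] } else { l = l + 1 }
    print l "\t" $1
}' "$tmp/index.txt" "$tmp/source.txt")

printf '%s\n' "$trace"
rm -rf "$tmp"
```

Both records with index 0 map to output lines 0 to 3, so their strings end up side by side, while the record with index 1 maps to lines 4 to 7.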

text[l]=text[l]"\t"$1

Add the current string to the correct line.

END { for (i = 0; i < length(text); i++) {print substr(text[i],2)} }

At the end, print everything. The substr call is only there to delete the useless leading tab (the first character) of each line.

My output from your data:

path_sparc_ifu_dec_in_3826  path_sparc_ifu_dec_in_4349
dtu_inst_d[14]  dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_d   dec_swl_rdsr_sel_thr_d
0.8664  0.795429
path_sparc_ifu_dec_in_2374  path_sparc_ifu_dec_in_4011
dtu_inst_d[13]  dtu_inst_d[13]
dec_dcl_cctype_d[2] ifu_exu_useimm_d
0.938914    0.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818