I want to check that the numbers in the 1st column are equal to the 2nd column

Posted 2019-08-22 02:19

I want to check that the number in the 1st column is equal to the 2nd column. The 1st column should start with "ABC" and end with "DEF", but sometimes it ends with "DEFZ#". The digits between "ABC" and "DEF" (or "DEFZ#") should match the 2nd column. Can anyone help me here, please?

My input:

>ABC12345DEF | 12345  |23132331331|
>ABC12345DEFZ1 | 12345  |23132331331|
>ABC12345DEFZ2 | 12345  |23132331331|
>ABC95678DEF | 45678  |23132331331| 
>ABC87887DEF | 86187  |23132331331|
>ABC89043DEF | 89043  |23132331331|
>ABC89043DEFZ1 | 89043  |23132331331|
>ABC89043DEFZ2 | 89043  |23132331331|
>ABC89043DEFZ3 | 89043  |23132331331|

Output Should be:

>ABC12345DEF |12345 |23132331331|

>ABC12345DEFZ1 |12345 |23132331331|

>ABC12345DEFZ2 |12345 |23132331331|

>ABC89043DEFZ1 |89043 |23132331331|

>ABC89043DEFZ2 |89043 |23132331331|

>ABC89043DEFZ3 |89043 |23132331331|

I'm trying to use the following command, but it's not working:

awk -F '|' '"ABC" $2 "DEF" != $1 { print }' WHTFile.txt > QC2Valid.txt

Tags: linux awk
2 Answers
爷、活的狠高调
#2 · 2019-08-22 02:49

Could you please try the following and let me know if it helps?

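awk -F"|" '
# Column 1 must start with ABC<digits>DEF; any suffix such as Z1 is allowed.
$1 ~ /^ABC[0-9]+DEF/{
   gsub(/^ +| +$/,"",$2)                  # trim spaces on both sides of column 2
   match($1,/[0-9]+/)                     # locate the digits in column 1
   if(substr($1,RSTART,RLENGTH)==$2){
     print
   }
}
' OFS="|"  Input_file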
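
In case match() is unfamiliar: it sets RSTART and RLENGTH to the position and length of the first match, and substr() then cuts that piece out of the string. A quick way to see it (the sample string is simply taken from the question's data):

```shell
# Print where the first run of digits starts, how long it is, and the digits themselves.
echo 'ABC12345DEFZ1' |
  awk '{ match($0, /[0-9]+/); print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }'
```

Here the digits start at character 4 and are 5 characters long, so the extracted piece is 12345, and the trailing 1 in Z1 is ignored because only the first match counts.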
女痞
#3 · 2019-08-22 02:56

awk solution:

awk -F' *\\| *' '{ match($1,/[0-9]+/) }substr($1,RSTART,RLENGTH)==$2' OFS='|' WHTFile.txt

The output:

ABC12345DEF |12345 |23132331331|
ABC12345DEFZ1 |12345 |23132331331|
ABC12345DEFZ2 |12345 |23132331331|
ABC89043DEF |89043 |23132331331|
ABC89043DEFZ1 |89043 |23132331331|
ABC89043DEFZ2 |89043 |23132331331|
ABC89043DEFZ3 |89043 |23132331331|
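
The awk one-liner above only compares the first run of digits with the 2nd column; it does not check the "ABC" ... "DEF" / "DEFZ#" shape of the 1st column. If that matters, here is a rough sketch that checks both. It assumes the lines have no leading ">" and that the optional suffix is always "Z" followed by digits (my reading of the sample, not something the question states). check_cols is just a made-up helper name.

```shell
# check_cols: hypothetical helper; reads the named files, or stdin if none are given.
check_cols() {
  awk -F' *[|] *' '
    # Column 1 must be ABC<digits>DEF, optionally followed by Z<digits> (assumed suffix form).
    $1 ~ /^ABC[0-9]+DEF(Z[0-9]+)?$/ {
      num = $1
      sub(/^ABC/, "", num)       # drop the prefix
      sub(/DEF.*$/, "", num)     # drop DEF and any suffix
      if (num == $2) print       # string comparison, so 00123 and 123 differ
    }
  ' "$@"
}

# Small demo using three lines from the question (without the leading ">").
printf '%s\n' \
  'ABC12345DEF | 12345  |23132331331|' \
  'ABC95678DEF | 45678  |23132331331|' \
  'ABC89043DEFZ3 | 89043  |23132331331|' | check_cols
```

With this demo input, the first and third lines are printed unchanged and the second is dropped because 95678 is not 45678. The field separator swallows the spaces around each "|", so no separate trimming step is needed.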

Bonus solution using sed expression:

sed -E '/^ABC([0-9]+)DEF[^[:space:]|]*\s*\|\s*\1\s*\|/!d' WHTFile.txt