I want to check that the numbers in the 1st column are equal to the 2nd column

Posted 2019-08-22 02:31

Question:

I want to check that the number in the 1st column is equal to the 2nd column. The 1st column should start with "ABC" and end with "DEF", but sometimes it ends with "DEFZ#" instead. The digits between "ABC" and "DEF" (or "DEFZ#") should match the 2nd column. Can anyone help me with this, please?

My input:

>ABC12345DEF | 12345  |23132331331|
>ABC12345DEFZ1 | 12345  |23132331331|
>ABC12345DEFZ2 | 12345  |23132331331|
>ABC95678DEF | 45678  |23132331331| 
>ABC87887DEF | 86187  |23132331331|
>ABC89043DEF | 89043  |23132331331|
>ABC89043DEFZ1 | 89043  |23132331331|
>ABC89043DEFZ2 | 89043  |23132331331|
>ABC89043DEFZ3 | 89043  |23132331331|

The output should be:

>ABC12345DEF |12345 |23132331331|

>ABC12345DEFZ1 |12345 |23132331331|

>ABC12345DEFZ2 |12345 |23132331331|

>ABC89043DEFZ1 |89043 |23132331331|

>ABC89043DEFZ2 |89043 |23132331331|

>ABC89043DEFZ3 |89043 |23132331331|

I'm trying to use the following command, but it's not working:

awk -F '|' '"ABC" $2 "DEF" != $1 { print }' WHTFile.txt > QC2Valid.txt

Answer 1:

Could you please try the following and let me know if it helps?

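awk -F"|" '
# Select rows whose 1st field starts with ABC<digits>DEF (a Z# suffix may follow).
$1 ~ /^ABC[0-9]+DEF/{
   gsub(/^ +| +$/,"",$2);   # trim spaces on both sides of the 2nd field
   match($1,/[0-9]+/);      # locate the digits in the 1st field
   if(substr($1,RSTART,RLENGTH)==$2){
     print
   }
}
' OFS="|"  Input_file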
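
A minimal, self-contained sketch of the same idea, for trying it out on the sample rows from the question. It assumes the rows do not begin with a literal ">" and that the suffix, when present, is "Z" followed by digits; the file name sample.txt and the variable names are placeholders, not anything from the original post. Unlike the command above, it copies the fields into variables first, so matching rows are printed unchanged.

```shell
#!/bin/sh
# Work in a throwaway directory so no real file is overwritten.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.txt" <<'EOF'
ABC12345DEF | 12345  |23132331331|
ABC12345DEFZ1 | 12345  |23132331331|
ABC12345DEFZ2 | 12345  |23132331331|
ABC95678DEF | 45678  |23132331331|
ABC87887DEF | 86187  |23132331331|
ABC89043DEF | 89043  |23132331331|
ABC89043DEFZ1 | 89043  |23132331331|
ABC89043DEFZ2 | 89043  |23132331331|
ABC89043DEFZ3 | 89043  |23132331331|
EOF

result=$(awk -F'|' '
{
    id = $1; num = $2
    gsub(/^ +| +$/, "", id)     # trim padding around the 1st field
    gsub(/^ +| +$/, "", num)    # trim padding around the 2nd field
}
# Assumed format: ABC, digits, DEF, then an optional Z<digits> suffix.
id ~ /^ABC[0-9]+DEF(Z[0-9]+)?$/ {
    match(id, /[0-9]+/)         # first run of digits, i.e. the part after ABC
    if (substr(id, RSTART, RLENGTH) == num)
        print                   # $0 was never modified, so the row is printed as-is
}
' "$tmpdir/sample.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On the sample data this should keep the three 12345 rows and the four 89043 rows, and drop the 95678 and 87887 rows.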


Answer 2:

awk solution:

awk -F' *\\| *' '{ match($1,/[0-9]+/) }substr($1,RSTART,RLENGTH)==$2' OFS='|' WHTFile.txt

The output:

ABC12345DEF |12345 |23132331331|
ABC12345DEFZ1 |12345 |23132331331|
ABC12345DEFZ2 |12345 |23132331331|
ABC89043DEF |89043 |23132331331|
ABC89043DEFZ1 |89043 |23132331331|
ABC89043DEFZ2 |89043 |23132331331|
ABC89043DEFZ3 |89043 |23132331331|
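
The separator is written as ` *\| *` rather than a plain `|` because the padding around each pipe would otherwise stay inside the fields. A small sketch to make that visible (the sample line is taken from the question; the variable names are only for illustration):

```shell
#!/bin/sh
line='ABC12345DEFZ1 | 12345  |23132331331|'

# Plain "|" separator: the spaces stay attached to the fields.
plain=$(printf '%s\n' "$line" | awk -F'|' '{ print "[" $1 "][" $2 "]" }')

# Regex separator " *\| *": the spaces are consumed as part of the delimiter.
padded=$(printf '%s\n' "$line" | awk -F' *\\| *' '{ print "[" $1 "][" $2 "]" }')

printf '%s\n' "$plain"     # [ABC12345DEFZ1 ][ 12345  ]
printf '%s\n' "$padded"    # [ABC12345DEFZ1][12345]
```

With the plain separator, `"ABC" $2 "DEF"` expands to `ABC 12345  DEF`, which can never equal the first field. Together with the `!=` test (which selects the non-matching rows) and the optional `Z#` suffix, that most likely explains why the command in the question prints every row.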

Bonus solution using a sed expression (`\s` is a GNU sed extension):

sed -E '/^ABC([0-9]+)DEF[^[:space:]|]*\s*\|\s*\1/!d' WHTFile.txt
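
One thing to watch with the backreference: `\1` on its own only requires the second column to start with the captured digits, so a row such as `ABC123DEF | 12345 |` would also pass. Below is a sketch of a stricter variant that also anchors the end of the second column. It assumes a sed that supports `-E` with backreferences (GNU and BSD sed do), and the `ABC123DEF` row is made up for the demonstration.

```shell
#!/bin/sh
# Sample rows: the first two should pass, the last two should be dropped.
rows='ABC12345DEF | 12345  |23132331331|
ABC12345DEFZ1 | 12345  |23132331331|
ABC123DEF | 12345  |23132331331|
ABC95678DEF | 45678  |23132331331|'

# \1 must now be followed by optional blanks and the next "|",
# so a partial (prefix) match of the second column is rejected.
kept=$(printf '%s\n' "$rows" |
    sed -E '/^ABC([0-9]+)DEF[^|[:space:]]*[[:space:]]*\|[[:space:]]*\1[[:space:]]*\|/!d')

printf '%s\n' "$kept"
```

The POSIX class `[[:space:]]` is used instead of `\s` so that the expression does not depend on GNU-specific escapes.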


Tags: linux awk