Is there a way to completely delete fields in awk?

Posted 2019-02-21 09:55

Question:

Consider the following command:

gawk -F"\t" "BEGIN{OFS=\"\t\"}{$2=$3=\"\"; print $0}" Input.tsv

When I set $2 = $3 = "", the intended effect is to get the same result as writing:

print $1,$4,$5...$NF

However, what actually happens is that I get two empty fields, with the extra field delimiters still printing.

Is it possible to actually delete $2 and $3?

Note: On Linux in bash, the command above would be written as follows, but cmd.exe on Windows does not handle single quotes well.

gawk -F'\t' 'BEGIN{OFS="\t"}{$2=$3=""; print $0}' Input.tsv
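To make the behaviour concrete, here is a minimal reproduction in a POSIX shell. The sample line and the helper name show_empty_fields are made up for illustration; tr is only there to make the tabs visible.

```shell
# Hypothetical sample line with five tab-separated fields.
show_empty_fields() {
  printf 'a\tb\tc\td\te\n' |
    awk -F'\t' 'BEGIN { OFS = "\t" } { $2 = $3 = ""; print NF ": " $0 }' |
    tr '\t' '|'   # show each tab as a visible |
}

show_empty_fields   # prints: 5: a|||d|e
```

NF is still 5 after the assignment: the two fields are empty, not gone, which is why their delimiters are still printed.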

Answer 1:

This is an oldie but goodie.

As Jonathan points out, you can't delete fields in the middle, but you can replace their contents with the contents of other fields. And you can make a reusable function to handle the deletion for you.

$ cat test.awk
function rmcol(col,     i) {
  for (i=col; i<NF; i++) {
    $i=$(i+1)
  }
  NF--
}

{
  rmcol(3)
}

1

$ printf 'one two three four\ntest red green blue\n' | awk -f test.awk
one two four
test red blue
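Applied to the case in the question (dropping $2 and $3 from tab-separated input), the same function could be used as in the sketch below. The wrapper name drop_2_and_3 and the sample data are assumptions for illustration, and the guard against short lines is an addition that is not part of the function above. The higher column is removed first so that the index of the lower one does not shift.

```shell
# Hypothetical wrapper: remove columns 3 and 2 from tab-separated stdin.
drop_2_and_3() {
  awk '
    function rmcol(col,    i) {
      if (col > NF) return            # added guard: nothing to remove
      for (i = col; i < NF; i++) $i = $(i + 1)
      NF--                            # drop the now-duplicated last field
    }
    BEGIN { FS = OFS = "\t" }
    { rmcol(3); rmcol(2); print }
  '
}

printf 'a\tb\tc\td\te\n' | drop_2_and_3   # prints a, d, e separated by tabs
```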


Answer 2:

You can't delete fields in the middle, but you can delete fields at the end, by decrementing NF.

So you can shift all the later fields down to overwrite $2 and $3 then decrement NF by two, which erases the last two fields:

$ echo 1 2 3 4 5 6 7 | awk '{for(i=2; i<NF-1; ++i) $i=$(i+2); NF-=2; print $0}'
1 4 5 6 7
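The same shift-and-truncate idea can be parameterised. This is only a sketch: the function name drop_range and the variables s (first field to drop) and n (how many to drop) are made up here, and lines too short to contain the whole range are deliberately left unchanged.

```shell
# Hypothetical helper. Usage: drop_range START COUNT
drop_range() {
  awk -v s="$1" -v n="$2" '
    NF >= s + n - 1 {
      # shift the later fields down over the dropped range
      for (i = s; i <= NF - n; i++) $i = $(i + n)
      NF -= n          # then cut off the n stale fields at the end
    }
    { print }
  '
}

echo 1 2 3 4 5 6 7 | drop_range 2 2   # prints: 1 4 5 6 7
```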


Answer 3:

If you're just looking to remove columns, you can use cut:

cut -f 1,4- file.txt
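cut splits on tabs by default, which matches the .tsv input in the question; for another single-character delimiter, -d can be added. A small illustration with made-up sample data:

```shell
# Tab-separated input (the default delimiter for cut).
printf 'a\tb\tc\td\te\n' | cut -f 1,4-          # prints a, d, e separated by tabs

# Comma-separated input: name the delimiter with -d.
printf 'a,b,c,d,e\n' | cut -d ',' -f 1,4-       # prints: a,d,e
```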

To emulate cut:

awk -F "\t" '{ for (i=1; i<=NF; i++) if (i != 2 && i != 3) printf "%s%s", $i, (i == NF ? "\n" : "\t") }' file.txt

Similar:

awk -F "\t" '{ delim = ""; for (i=1; i<=NF; i++) if (i != 2 && i != 3) { printf "%s%s", delim, $i; delim = "\t" } printf "\n" }' file.txt

HTH



Answer 4:

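One way is to blank the fields as you do and then collapse the leftover whitespace with gsub. Note that \s is a gawk extension, and that this also turns any whitespace inside the remaining fields into a tab: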

awk 'BEGIN { FS = "\t" } { $2 = $3 = ""; gsub( /\s+/, "\t" ); print }' input-file


Answer 5:

In addition to the answer by Suicidal Steve, I'd like to suggest one more solution, using sed instead of awk.

It looks more complicated than the cut approach Steve suggested, but it was the better solution because sed -i allows editing in place.

sed -i 's/\(.*,\).*,.*,\(.*\)/\1\2/' FILENAME
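One caveat about the pattern above: it assumes comma-separated input, and because .* is greedy it removes the two fields just before the last one, which are fields 2 and 3 only when a line has exactly four fields. A left-anchored variant that counts fields from the start could look like the sketch below. The function name is made up, and quoted CSV fields containing commas are not handled.

```shell
# Hypothetical helper: drop comma-separated fields 2 and 3, counted from the left.
drop_2_and_3_sed() {
  # \1 keeps field 1; the next two ",field" groups are matched and discarded.
  sed 's/^\([^,]*\),[^,]*,[^,]*/\1/'
}

printf 'a,b,c,d,e\n' | drop_2_and_3_sed   # prints: a,d,e
```

Lines with fewer than three fields do not match and pass through unchanged. For in-place editing, the same expression can be given to sed -i as in the command above.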


Answer 6:

The only way I can think to do it in Awk without using a loop is to use gsub on $0 to combine adjacent FS:

$ echo {1..10} | awk '{$2=$3=""; gsub(FS"+",FS); print}'
1 4 5 6 7 8 9 10
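Collapsing every run of FS also swallows fields that were genuinely empty in the input, which matters for tab-separated data. A possible variant, sketched here under the assumption that each line to be changed has at least four tab-separated fields, removes only the first run of three tabs, which is the one left behind right after $1. The function name and the sample data are made up.

```shell
# Hypothetical helper: drop fields 2 and 3 but keep other empty fields.
drop_2_and_3_keep_empties() {
  awk 'BEGIN { FS = OFS = "\t" }
       NF >= 4 { $2 = $3 = ""; sub(/\t\t\t/, "\t") }   # first triple tab follows $1
       { print }'
}

# Field 4 is empty in the input and stays empty in the output.
printf 'a\tb\tc\t\te\n' | drop_2_and_3_keep_empties | tr '\t' '|'   # prints: a||e
```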


Answer 7:

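Well, if the goal is to remove the extra delimiters, you can use tr on Linux. Note that tr -s squeezes every run of the delimiter, so empty fields elsewhere in the line disappear too. Example: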

$ echo "1,2,,,5" | tr -s ','

1,2,5



Answer 8:

echo one two three four five six|awk '{
print $0
is3=$3
$3=""
print $0
print is3
}'

one two three four five six
one two  four five six
three



Tags: awk gawk