I would like some help or direction on a problem I have with awk.
I have a tab-delimited file with more than 5 fields. I want to output every field except the first 5.
Could you please tell me how to write an awk script to accomplish this task?
Best,
jianfeng.mao
Please note the following comment:
There are many fields in my files, and the number of fields per line is not fixed: different lines have different numbers of fields.
I agree with matchew's suggestion to use cut: it's the right tool for this job. But if this is just going to become part of a larger awk script, here's how to do it:
awk -F "\t" '{ for (i=6; i<=NF; ++i) $(i-5) = $i; NF = NF-5; print; }'
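Two things to watch for with the field-shifting approach: assigning to a field makes awk rebuild the line using OFS (a single space by default), and a line with fewer than five fields would push NF below zero. The sketch below is one way to handle both. The function name drop5 and the sample input are made up for illustration, and it assumes an awk that rebuilds the record when NF is decreased (gawk and mawk do).

```shell
# drop5: hypothetical helper; reads tab-delimited lines on stdin
drop5() {
    awk -F '\t' -v OFS='\t' '{
        if (NF <= 5) { print ""; next }        # nothing left after dropping five fields
        for (i = 6; i <= NF; ++i) $(i-5) = $i  # shift fields 6..NF down to 1..NF-5
        NF -= 5                                # truncate; $0 is rebuilt with OFS
        print
    }'
}

# One 7-field line and one 3-field line
printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\na\tb\tc\n' | drop5
# prints "f6<TAB>f7", then an empty line
```

Whether short lines should produce an empty line or be skipped entirely is a judgment call; replace the print "" with nothing to skip them.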
My tab-delimited file temp.txt looks like the following:
field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6 field7
field1 field2 field3 field4 field5 field6 field7 field 8
As per your update, I strongly recommend using cut:
cut -f6- temp.txt
will print from field 6 to the end of each line.
Note that -d specifies the delimiter, but tab is the default delimiter.
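Since the lines have varying numbers of fields, it may help to see what cut does with short lines. This is a small sketch with made-up input (the sample function is only there to generate test data):

```shell
# sample: hypothetical data with a 7-field line, a 3-field line,
# and a line that contains no tab at all
sample() {
    printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\na\tb\tc\nplain\n'
}

sample | cut -f6-      # prints "f6<TAB>f7", an empty line, then "plain"
sample | cut -s -f6-   # -s suppresses lines with no delimiter: "plain" is dropped

# A different delimiter, here a comma
printf 'a,b,c,d,e,f,g\n' | cut -d, -f6-   # prints "f,g"
```

A line that has tabs but fewer than six fields comes out empty, while a line with no delimiter is passed through unchanged unless -s is given.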
You can do this in awk, but I find cut to be simpler. With awk it would look like this:
awk '{print substr($0, index($0, $6))}' temp.txt
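One caveat with this idiom: index($0, $6) returns the first place the text of field 6 occurs in the line, which is not necessarily field 6 itself. The sketch below uses made-up input to show the failure, plus one possible alternative that removes a leading "field plus tab" five times. The name strip5 is only illustrative.

```shell
# Field 6 ("x") has the same text as field 1, so index() returns 1
printf 'x\tb\tc\td\te\tx\tg\n' | awk -F '\t' '{ print substr($0, index($0, $6)) }'
# prints the whole line unchanged

# strip5: hypothetical helper that deletes one leading field and its tab, five times
strip5() {
    awk '{ for (i = 0; i < 5; i++) sub(/^[^\t]*\t/, ""); print }'
}

printf 'x\tb\tc\td\te\tx\tg\n' | strip5   # prints "x<TAB>g"
```

Note that strip5 leaves the last field in place on lines with fewer than six fields, so those lines need separate handling if they should come out empty.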
If my tab-delimited file temp.txt looks like the following
field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6 field7
field1 field2 field3 field4 field5 field6 field7 field 8
awk -F"\t" '{print $6}' temp.txt
will print only the 6th field. If the delimiter is a tab, it will likely work without setting -F, but I like to set my field separator when I can.
Similarly, so would cut:
cut -f6 temp.txt
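The case where leaving out the field separator does matter is when a field contains spaces, because awk's default separator splits on any run of blanks or tabs (which also means empty fields disappear). A quick illustration with a made-up line:

```shell
# First field contains a space
printf 'New York\tNY\n' | awk '{ print $2 }'           # default FS: prints "York"
printf 'New York\tNY\n' | awk -F '\t' '{ print $2 }'   # tab FS: prints "NY"
printf 'New York\tNY\n' | cut -f2                      # cut splits on tabs only: prints "NY"
```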
I have a hunch your question is a bit more complicated than this, so if you respond to my comment I can try to expand on my answer.
perl way?
perl -lane 'splice @F,0,5;print "@F"'
so,
echo 'field1 field2 field3 field4 field5 field6' | perl -lane 'splice @F,0,5;print "@F"'
will produce
field6
awk -vFS='\t' -vOFS='\t' '{
$1=$2=$3=$4=$5=""
print substr($0,6) # delete leading tabs
}'
I use -vFS='\t' rather than -F'\t' because some implementations of awk (e.g. BusyBox's) don't honor C escapes in the latter construction.
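To see how the blanking approach behaves when lines have different lengths, here is the same command wrapped in a function and fed made-up input. The name blank5 is only for illustration.

```shell
# blank5: hypothetical wrapper around the command above
blank5() {
    awk -v FS='\t' -v OFS='\t' '{
        $1 = $2 = $3 = $4 = $5 = ""   # empties the first five fields and rebuilds $0
        print substr($0, 6)            # skip the five leading tabs
    }'
}

# One 7-field line and one 3-field line
printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\na\tb\tc\n' | blank5
# prints "f6<TAB>f7", then an empty line
```

On the short line, assigning to $5 extends the record to five empty fields (four tabs), so substr($0, 6) starts past the end and yields an empty string.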