bash method to remove last 4 columns from csv file

Posted 2019-04-06 09:32

Question:

Is there a way to use bash to remove the last four columns for some input CSV file? The last four columns can have fields that vary in length from line to line so it is not sufficient to just delete a certain number of characters from the end of each row.

Answer 1:

cut can do this if every line has the same number of fields; use awk if they don't.

cut -d, -f1-6 # assuming 10 fields

This prints the first 6 fields. To control the output separator, use --output-delimiter=string (GNU cut).

awk -F, -v OFS=, '{ for (i=1; i<=NF-4; i++) printf "%s%s", $i, (i<NF-4 ? OFS : ""); printf "\n" }'

This loops over the fields up to the number of fields minus 4 and prints them, separated by OFS.
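
The awk loop can be wrapped in a small shell function so that the number of trailing columns becomes a parameter. This is only a sketch: the function name drop_last_cols and the awk variable n are made up for illustration, and it assumes that no field contains a quoted comma.

```shell
# drop_last_cols N: hypothetical helper. Reads CSV on stdin and prints
# every field except the last N. Assumes no quoted fields with commas.
drop_last_cols() {
    awk -F, -v OFS=, -v n="$1" '{
        line = ""
        for (i = 1; i <= NF - n; i++)
            line = line (i > 1 ? OFS : "") $i
        print line
    }'
}

# Rows with different field counts each lose their last 4 fields.
printf '%s\n' 'a,b,c,d,e,f' '1,2,3,4,5,6,7,8' | drop_last_cols 4
# prints:
# a,b
# 1,2,3,4
```

Building the line in a variable and printing it once avoids passing field data to printf as a format string.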



Answer 2:

rev data.csv | cut -d, -f5- | rev

rev reverses the characters of each line, so it doesn't matter whether all the rows have the same number of columns: keeping field 5 onward of the reversed line always removes the last 4 columns. This only works if the last 4 columns don't contain any commas themselves.
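
A quick way to check the rev approach is to run it on rows of different widths. In the sketch below the function name drop_last_four_rev is hypothetical. Note that dropping the last four fields means keeping field 5 onward of the reversed line, hence -f5-.

```shell
# drop_last_four_rev: hypothetical helper wrapping the rev/cut/rev pipeline.
# Assumes the last 4 columns contain no embedded commas.
drop_last_four_rev() {
    rev | cut -d, -f5- | rev
}

# A 6-column row and a 9-column row both lose exactly 4 columns.
printf '%s\n' 'a,b,c,d,e,f' '1,2,3,4,5,6,7,8,9' | drop_last_four_rev
# prints:
# a,b
# 1,2,3,4,5
```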



Answer 3:

You can use cut for this if you know the number of columns. For example, if your file has 9 columns, and comma is your delimiter:

cut -d',' -f -5

However, this assumes the data in your CSV file does not contain any commas: cut also treats commas inside quotes as delimiters.
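
The quoting caveat is easy to reproduce. In this sketch the sample row and the function name first_five_fields are invented for illustration. The row has 9 logical columns, but the quoted first column contains a comma, so cut counts 10 fields.

```shell
# first_five_fields: hypothetical wrapper around the cut command above.
first_five_fields() {
    cut -d, -f-5
}

# cut splits "Smith, John" in two, so only 4 logical columns survive
# instead of the intended 5.
printf '%s\n' '"Smith, John",2,3,4,5,6,7,8,9' | first_five_fields
# prints:
# "Smith, John",2,3,4
```

For files like this, a CSV-aware parser is a safer choice than cut.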



Answer 4:

awk -F, '{NF-=4; OFS=","; print}' file.csv

or alternatively

awk -F, -vOFS=, '{NF-=4;print}' file.csv

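Either command drops the last 4 columns from each line. Decrementing NF rebuilds the record in gawk, but some awk implementations do not rebuild $0 when NF is decreased.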



Answer 5:

awk one-liner:

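awk -F, '{for(i=0;++i<=NF-5;)printf "%s, ", $i;print $(NF-4)}'  file.csv

The advantage of using awk over cut is that you don't have to count how many columns you have or how many you want to keep, since all you want is to remove the last 4 columns. Note that this one-liner joins the fields with ", " to match the test data below; use "," instead if your file has no space after the commas.

See the test: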

kent$  seq 40|xargs -n10|sed 's/ /, /g'           
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
11, 12, 13, 14, 15, 16, 17, 18, 19, 20
21, 22, 23, 24, 25, 26, 27, 28, 29, 30
31, 32, 33, 34, 35, 36, 37, 38, 39, 40

kent$  seq 40|xargs -n10|sed 's/ /, /g' |awk -F, '{for(i=0;++i<=NF-5;)printf "%s, ", $i;print $(NF-4)}'
1,  2,  3,  4,  5,  6
11,  12,  13,  14,  15,  16
21,  22,  23,  24,  25,  26
31,  32,  33,  34,  35,  36


Answer 6:

This might work for you (GNU sed):

sed -r 's/(,[^,]*){4}$//' file
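
The pattern (,[^,]*){4}$ matches four repetitions of "a comma followed by any run of non-commas", anchored at the end of the line. The sketch below uses -E, which both GNU and BSD sed accept for extended regular expressions. The function name drop_last_four_sed is hypothetical.

```shell
# drop_last_four_sed: hypothetical helper. Deletes the last four
# comma-separated fields from each line read on stdin.
drop_last_four_sed() {
    sed -E 's/(,[^,]*){4}$//'
}

# A line with fewer than 5 fields has fewer than 4 commas, so the
# pattern cannot match and the line is left unchanged.
printf '%s\n' 'a,b,c,d,e,f' 'x,y,z' | drop_last_four_sed
# prints:
# a,b
# x,y,z
```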


Answer 7:

This awk solution takes a hackier route: blank out the last 4 fields, then strip the 4 trailing commas left behind.

awk -F, -v OFS=, '{for(i=NF; i>NF-4; --i) $i=""; sub(/,,,,$/, ""); print}' temp.txt


Tags: bash csv sed awk cut