Is there a way to use bash to remove the last four columns for some input CSV file? The last four columns can have fields that vary in length from line to line so it is not sufficient to just delete a certain number of characters from the end of each row.
Question:
Answer 1:
cut can do this if all lines have the same number of fields; use awk if they don't.
cut -d, -f1-6 # assuming 10 fields
This prints the first 6 fields. If you want to control the output separator, use --output-delimiter=string (a GNU cut option).
awk -F, -v OFS=, '{ for (i=1; i<=NF-4; i++) printf "%s%s", $i, (i<NF-4 ? OFS : ""); printf "\n" }'
This loops over the fields up to NF-4 (the number of fields minus 4) and prints them, separated by OFS.
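The loop idea can be wrapped in a small shell function so that the number of trailing columns becomes a parameter. This is only a minimal sketch: the name drop_last_cols and the awk variable n are made up for illustration, and it assumes no field contains an embedded comma.

```shell
# Hypothetical helper: drop the last N comma-separated fields from each line of stdin.
# Assumes plain CSV with no quoted fields that contain commas.
drop_last_cols() {
    awk -F, -v OFS=, -v n="${1:-4}" '{
        line = ""
        # Join fields 1..NF-n with OFS; rows with n or fewer fields become empty lines.
        for (i = 1; i <= NF - n; i++)
            line = line (i > 1 ? OFS : "") $i
        print line
    }'
}

# Rows with different field counts each lose exactly four columns.
printf '1,2,3,4,5,6\na,b,c,d,e,f,g\n' | drop_last_cols 4
```

Building the line in a variable instead of calling printf on each field keeps the separator logic in one place and avoids using field data as a format string.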
Answer 2:
rev data.csv | cut -d, -f5- | rev
rev reverses each line, so the last 4 columns become the first 4 and cut can drop them by keeping field 5 onward. It doesn't matter whether all the rows have the same number of columns; this always removes the last 4. It only works if the last 4 columns don't contain any commas themselves.
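A quick way to check the reverse-and-cut idea on rows of different widths is sketched below. After rev, the last four columns are the first four, so cut keeps field 5 onward. The function name drop_last4_rev is hypothetical, and the sketch assumes rev (from util-linux on most Linux systems) is installed.

```shell
# Hypothetical helper: remove the last four columns by reversing each line,
# keeping field 5 onward, and reversing the result back.
drop_last4_rev() {
    rev | cut -d, -f5- | rev
}

# One row has 6 columns and the other has 8; both lose their last four.
printf '1,2,3,4,5,6\na,b,c,d,e,f,g,h\n' | drop_last4_rev
```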
Answer 3:
You can use cut for this if you know the number of columns. For example, if your file has 9 columns and comma is your delimiter:
cut -d',' -f -5
However, this assumes the data in your CSV file does not contain any commas. cut will also treat commas inside quotes as delimiters.
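The quoting pitfall is easy to reproduce. In this small sketch the sample row is invented for illustration; its second field is the quoted string "b,c", so it has six logical columns, but cut counts seven.

```shell
# The second field is the quoted string "b,c", so this row has 6 logical columns.
row='a,"b,c",d,e,f,g'

# cut counts every comma, so asking for the first two fields splits the quoted value.
printf '%s\n' "$row" | cut -d, -f-2
# prints: a,"b
```

For data like this, a CSV-aware parser is the safer choice; none of the plain cut, sed, or awk -F, approaches in this thread understand quoting.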
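Answer 4:
awk 'BEGIN{FS=OFS=","} {NF-=4; print}' file.csv
or alternatively
awk -F, -v OFS=, '{NF-=4; print}' file.csv
will drop the last 4 columns from each line. Set OFS before NF is changed: at least some awk implementations rebuild the line with whatever OFS was in effect at that point, so assigning OFS afterwards can leave the output separated by spaces.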
Answer 5:
awk one-liner:
awk -F, '{for(i=1;i<=NF-5;i++)printf "%s,",$i;print $(NF-4)}' file.csv
The advantage of using awk over cut is that you don't have to count how many columns you have or how many you want to keep, since all you want is to remove the last 4 columns.
See the test:
kent$ seq 40|xargs -n10|sed 's/ /, /g'
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
11, 12, 13, 14, 15, 16, 17, 18, 19, 20
21, 22, 23, 24, 25, 26, 27, 28, 29, 30
31, 32, 33, 34, 35, 36, 37, 38, 39, 40
kent$ seq 40|xargs -n10|sed 's/ /, /g' |awk -F, '{for(i=1;i<=NF-5;i++)printf "%s,",$i;print $(NF-4)}'
1, 2, 3, 4, 5, 6
11, 12, 13, 14, 15, 16
21, 22, 23, 24, 25, 26
31, 32, 33, 34, 35, 36
Answer 6:
This might work for you (GNU sed):
sed -r 's/(,[^,]*){4}$//' file
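The regular expression reads as follows: each (,[^,]*) group matches one comma plus the field after it, and {4}$ anchors four such groups to the end of the line. Here is a small sketch of that behaviour. The function name drop_last4_sed is hypothetical, and -E is used instead of -r on the assumption that both GNU and BSD sed accept it.

```shell
# Hypothetical helper: delete four trailing ",field" groups from each line of stdin.
drop_last4_sed() {
    sed -E 's/(,[^,]*){4}$//'
}

# The first row loses its last four columns. The second row has fewer than
# four commas, so the pattern does not match and the row passes through unchanged.
printf '1,2,3,4,5,6\na,b,c\n' | drop_last4_sed
```

Note the difference from the awk approaches: short rows are left intact rather than emptied.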
Answer 7:
This awk solution works in a hacky way: it blanks the last 4 fields, then strips the 4 trailing commas left behind.
awk -F, -v OFS=, '{for(i=NF; i>NF-4; --i) $i=""; sub(/,,,,$/, ""); print}' temp.txt