How to cut columns from CSV files

Posted 2020-05-31 05:23

I have a set of CSV files (around 250), each with 300 to 500 records. I need to cut 2 or 3 columns from each file and store them in another file. I'm using Ubuntu. Is there a command or utility that can do this?

Tags: shell ubuntu csv
4 answers
放我归山
Reply #2 · 2020-05-31 05:39

If you know that the column delimiter does not occur inside the fields, you can use cut.

$ cat in.csv
foo,bar,baz
qux,quux,quuux
$ cut -d, -f2,3 < in.csv 
bar,baz
quux,quuux

You can use the shell builtin 'for' to loop over all input files.
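
A minimal sketch of such a loop, assuming all input files sit in one directory and that no field contains a comma. The function name cut_columns and the directory names in the usage line are made up for illustration; they are not from the question.

```shell
# cut_columns IN_DIR OUT_DIR FIELDS
# Writes the selected comma-separated fields of every IN_DIR/*.csv
# to a file of the same name in OUT_DIR.
# Assumption: no field contains a comma, so plain cut is safe.
cut_columns() {
    in_dir=$1
    out_dir=$2
    fields=$3
    mkdir -p "$out_dir" || return 1
    for f in "$in_dir"/*.csv; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        cut -d, -f"$fields" < "$f" > "$out_dir/$(basename "$f")"
    done
}
```

Calling it as `cut_columns ./in ./out 2,3` would keep columns 2 and 3 of each file. Writing to a separate output directory leaves the originals untouched.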

Root(大扎)
Reply #3 · 2020-05-31 05:49

If the fields might contain the delimiter, you ought to find a library that can parse CSV files. Typically, general purpose scripting languages will include a CSV module in their standard library.

Ruby:   require 'csv'
Python: import csv
Perl:   use Text::ParseWords;
迷人小祖宗
Reply #4 · 2020-05-31 05:54

If your fields contain commas or newlines, you can use a helper program I wrote to allow cut (and other UNIX text processing tools) to properly work with the data.

https://github.com/dbro/csvquote

This program finds special characters inside quoted fields, and temporarily replaces them with nonprinting characters which won't confuse the cut program. Then they get restored after cut is done.

lutz's solution (the cut command above) would become:

csvquote in.csv | cut -d, -f2,3 | csvquote -u 
我命由我不由天
Reply #5 · 2020-05-31 06:01

If you used ssconvert to get the CSV you might try:

ssconvert -O 'separator="|"' "file.xls" "file.txt"

Note the .txt extension instead of .csv: this makes ssconvert use the Gnumeric_stf:stf_assistant exporter instead of Gnumeric_stf:stf_csv, which lets you pass options (the -O parameter). Otherwise you'll get a "The file saver does not take options" error. The pipe character is much less likely to appear in the data than a comma, but you might want to check first.
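
One rough way to do that check on the exported file is to count fields per line. This is only a sketch: the function name field_counts is invented here, and a single uniform count does not prove the delimiter is absent from the data. It only fails to find evidence that it is present.

```shell
# field_counts FILE [DELIM]
# Prints each distinct number of fields per line (DELIM defaults to |).
# More than one distinct count suggests the delimiter also occurs inside
# a field (or that rows are ragged), so cut would give wrong columns.
field_counts() {
    awk -F"${2:-|}" '{ print NF }' "$1" | sort -n | uniq
}
```

If `field_counts file.txt` prints a single number, every line has the same number of fields. If it prints several, inspect the odd lines before relying on cut.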

Then you can rename it and do things like:

cat file.csv | cut -d "|" -f3 | sort | uniq -c | sort -rn | head