How to count the number of unique values of a field in a tab-delimited file

Posted 2019-03-08 19:09

I have a text file with a large amount of tab-delimited data. I want to look at the data so that I can see the unique values in a column. For example,

Red     Ball 1 Sold
Blue    Bat  5 OnSale
............... 

So the first column has colors, for example; I want to know how many different unique values there are in that column, and I want to be able to do that for each column.

I need to do this from the Linux command line, probably using a bash script, sed, awk, or something similar.

Addendum: Thanks, everyone, for the help. Can I ask one more thing? What if I wanted a count of these unique values as well?

I guess I didn't put the second part clearly enough. What I want is a count of "each" of these unique values, not just how many unique values there are. For instance, in the first column I want to know how many Red, Blue, Green, etc. coloured objects there are.

7 Answers
\"骚年 ilove
2楼-- · 2019-03-08 19:58

Assuming the data file is actually tab-separated, not space-aligned:

<test.tsv awk '{print $4}' | sort | uniq

Where $4 selects the fourth column; for the example row the fields are:

  • $1 - Red
  • $2 - Ball
  • $3 - 1
  • $4 - Sold
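
The addendum asks for a count of each distinct value rather than just the list; uniq's -c flag prepends each output line with the number of times that value occurred. A minimal sketch for the first column, reusing the hypothetical test.tsv from above:

<test.tsv awk '{print $1}' | sort | uniq -c

The output looks like "3 Red", "2 Blue", and so on; piping the sorted column through uniq | wc -l instead gives the number of distinct values. To repeat this for every column, a short loop over the field numbers works (this sketch assumes four tab-separated columns):

for i in 1 2 3 4; do
    echo "column $i:"
    <test.tsv awk -v col="$i" '{print $col}' | sort | uniq -c
done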