Linux command or script counting duplicated lines

Posted 2019-01-29 22:13

If I have a text file with the following content

red apple
green apple
green apple
orange
orange
orange

Is there a Linux command or script that I can use to get the following result?

1 red apple
2 green apple
3 orange

7 answers
聊天终结者
#2 · 2019-01-29 22:59

Pipe it through sort (so that identical lines end up next to each other), then through uniq -c to get the counts, i.e.:

sort filename | uniq -c

and to get that list sorted by frequency (highest count first) you can use

sort filename | uniq -c | sort -nr
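
As a rough sketch of how this pipeline behaves on the question's sample data (the temporary file and the names sample and count_lines are placeholders chosen for this illustration, not part of the answer):

```shell
# Hypothetical sample file holding the six lines from the question
sample=$(mktemp)
printf '%s\n' 'red apple' 'green apple' 'green apple' orange orange orange > "$sample"

# count_lines: hypothetical helper wrapping the pipeline from the answer.
# sort groups identical lines, uniq -c prefixes each group with its count,
# sort -nr orders the result by that count, highest first.
count_lines() {
    sort "$1" | uniq -c | sort -nr
}

count_lines "$sample"
```

This should list orange with a count of 3 first, then green apple with 2, then red apple with 1. The counts are left-padded with spaces, and the padding width differs between uniq implementations.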
手持菜刀,她持情操
#3 · 2019-01-29 23:03

uniq -c file

and in case the file is not sorted already:

sort file | uniq -c
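
The reason sorting matters is that uniq only compares adjacent lines. A small sketch of the difference, assuming a made-up three-line input in which the two orange lines are separated (the variable names are placeholders):

```shell
# Hypothetical unsorted input: the two "orange" lines are not adjacent
unsorted=$(mktemp)
printf '%s\n' orange 'red apple' orange > "$unsorted"

# Without sorting, each "orange" is its own run, so uniq -c reports three lines
adjacent_only=$(uniq -c "$unsorted")

# After sorting, identical lines sit together and are counted as one group
grouped=$(sort "$unsorted" | uniq -c)

printf 'uniq -c alone:\n%s\n\nsort | uniq -c:\n%s\n' "$adjacent_only" "$grouped"
```

With this input, the first form reports orange twice with a count of 1 each, while the second reports it once with a count of 2.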

乱世女痞
#4 · 2019-01-29 23:04

Try this

cat myfile.txt | sort | uniq -c
祖国的老花朵
#5 · 2019-01-29 23:09

Almost the same as borribles' answer, but if you add the -d option to uniq, it shows only the lines that occur more than once.

sort filename | uniq -cd | sort -nr
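
A minimal sketch of the effect of -d, assuming the question's six sample lines in a temporary file (the variable names are placeholders):

```shell
# Hypothetical sample file holding the six lines from the question
fruits=$(mktemp)
printf '%s\n' 'red apple' 'green apple' 'green apple' orange orange orange > "$fruits"

# -c prefixes each group with its count; -d keeps only groups with more than one line,
# so "red apple" (count 1) is dropped
duplicates=$(sort "$fruits" | uniq -cd | sort -nr)

printf '%s\n' "$duplicates"
```

Only orange (3) and green apple (2) should remain in the output.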
Anthone
#6 · 2019-01-29 23:12

To just get a count:

$> egrep -o '\w+' fruits.txt | sort | uniq -c

      3 apple
      2 green
      1 oragen
      2 orange
      1 red

To get a sorted count:

$> egrep -o '\w+' fruits.txt | sort | uniq -c | sort -nk1
      1 oragen
      1 red
      2 green
      2 orange
      3 apple

EDIT

Aha, the question was about whole lines, NOT individual words, my bad. Here's the command to use for full lines:

$> cat fruits.txt | sort | uniq -c | sort -nk1
      1 oragen
      1 red apple
      2 green apple
      2 orange
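
Because uniq -c left-pads the count, the output does not match the layout asked for in the question character for character. A minimal sketch, assuming the question's six sample lines in a temporary file (the variable names are placeholders), that sorts ascending by count and strips the padding with sed:

```shell
# Hypothetical sample file holding the six lines from the question
fruits=$(mktemp)
printf '%s\n' 'red apple' 'green apple' 'green apple' orange orange orange > "$fruits"

# sort -n orders by the leading count, lowest first;
# sed removes the leading spaces that uniq -c adds before the count
exact=$(sort "$fruits" | uniq -c | sort -n | sed 's/^ *//')

printf '%s\n' "$exact"
# prints:
# 1 red apple
# 2 green apple
# 3 orange
```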
一纸荒年 Trace。
#7 · 2019-01-29 23:14
cat <filename> | sort | uniq -c