Count the number of occurrences of a token in a file

Posted 2019-04-06 00:37

I have a server access log with a timestamp for each HTTP request, and I'd like a count of the number of requests in each second. Using sed and cut -c, I've so far managed to reduce the file to just the timestamps, such as:

22-Sep-2008 20:00:21 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:24 +0000
22-Sep-2008 20:00:24 +0000

What I'd love to get is the number of times each unique timestamp appears in the file. For example, with the above example, I'd like to get output that looks like:

22-Sep-2008 20:00:21 +0000: 1
22-Sep-2008 20:00:22 +0000: 3
22-Sep-2008 20:00:24 +0000: 2

I've used sort -u to filter the list of timestamps down to a list of unique tokens, hoping that I could use grep like

grep -c -f <file containing patterns> <file>

but this just produces a single line with the grand total of matching lines across all patterns.

I know this can be done in a single line, stringing a few utilities together ... but I can't think of which. Anyone know?

Tags: bash shell grep
6 Answers
够拽才男人
#2 · 2019-04-06 00:39

I think you're looking for

uniq --count

-c, --count prefix lines by the number of occurrences
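A minimal sketch of that suggestion (`times.txt` is an assumed name for the file of extracted timestamps). Note that `uniq` only collapses *adjacent* duplicates, so sort first if the log isn't already in order:

```shell
# sample of the extracted timestamps (times.txt is a hypothetical name)
printf '%s\n' \
    '22-Sep-2008 20:00:21 +0000' \
    '22-Sep-2008 20:00:22 +0000' \
    '22-Sep-2008 20:00:22 +0000' \
    '22-Sep-2008 20:00:22 +0000' \
    '22-Sep-2008 20:00:24 +0000' \
    '22-Sep-2008 20:00:24 +0000' > times.txt

# uniq -c prefixes each distinct line with its count;
# sort first because uniq only merges adjacent duplicates
sort times.txt | uniq -c
```

This prints the count *before* the timestamp (e.g. `3 22-Sep-2008 20:00:22 +0000`) rather than after it, but the numbers are the same.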

冷血范
#3 · 2019-04-06 00:39

Using awk:

awk '{ count[$0]++ } END { for (w in count) print w ": " count[w] }' file.txt
三岁会撩人
#4 · 2019-04-06 00:46

Just in case you want the output in the format you originally specified (with the number of occurrences at the end):

uniq -c logfile | sed 's/^ *\([0-9][0-9]*\) \(.*\)/\2: \1/'
可以哭但决不认输i
#5 · 2019-04-06 00:48

Tom's solution:

awk '{ count[$0]++ } END { for (w in count) print w ": " count[w] }' file.txt

works more generally.

My file was not sorted:

name1 
name2 
name3 
name2 
name2 
name3 
name1

Therefore the occurrences weren't adjacent to each other, and uniq alone does not work, as it gives:

1 name1 
1 name2 
1 name3 
2 name2 
1 name3 
1 name1

With the awk script however:

name1: 2
name2: 3
name3: 2
成全新的幸福
#6 · 2019-04-06 01:01

Using AWK with associative arrays might be another solution to something like this.
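A minimal sketch of that idea, essentially the same one-liner as the awk answers above (`times.txt` is an assumed file of one extracted timestamp per line). Because the whole line is the associative-array key, duplicates need not be adjacent and no pre-sorting is required:

```shell
# hypothetical, unsorted input file of one timestamp per line
printf '%s\n' \
    '22-Sep-2008 20:00:22 +0000' \
    '22-Sep-2008 20:00:21 +0000' \
    '22-Sep-2008 20:00:22 +0000' > times.txt

# n[$0]++ tallies each distinct line in an associative array;
# the END block walks the keys and prints "timestamp: count"
awk '{ n[$0]++ } END { for (t in n) print t ": " n[t] }' times.txt
```

The `for (t in n)` iteration order is unspecified in awk, so pipe the result through `sort` if you need ordered output.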

爱情/是我丢掉的垃圾
#7 · 2019-04-06 01:02

Maybe use xargs? I can't put it all together off the top of my head, but run xargs over your sort -u output so that for each unique second you can grep the original file and do a wc -l to get the number.
