Python Counter from txt file

I would like to init a collections.Counter object from a text file of word frequency counts. That is, I have a file "counts.txt":

rank  wordform         abs     r        mod
   1  the           225300    29   223066.9
   2  and           157486    29   156214.4
   3  to            134478    29   134044.8
...
 999  fallen           345    29      326.6
1000  supper           368    27      325.8

I would like a Counter object wordCounts such that I can call

>>> print(wordCounts.most_common(3))
[('the', 225300), ('and', 157486), ('to', 134478)]

What is the most efficient, Pythonic way to do this?

Tags: python text-files counter

2 Answers

We Are One

#2 · 2019-09-10 09:29

Here are two versions. The first reads your counts.txt as a plain, whitespace-separated text file. The second treats it as a delimited (CSV-style) file, which is roughly what it looks like.

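from collections import Counter

with open('counts.txt') as f:
    # Split each line on whitespace: index 1 is the word, index 2 its absolute count.
    lines = [line.split() for line in f]

# lines[1:] skips the header row.
wordCounts = Counter({line[1]: int(line[2]) for line in lines[1:]})
print(wordCounts.most_common(3))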
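
As a variation on the version above, here is a minimal sketch that streams the lines instead of building an intermediate list, and that skips any row without an integer count (the header, a blank line, or a literal "..." line). The helper name load_counts and the inline sample text are assumptions made for illustration; they are not part of the original answer.

```python
from collections import Counter


def load_counts(lines):
    """Build a Counter from an iterable of whitespace-separated count lines.

    Assumes the layout shown in the question: the word at index 1 and the
    absolute count at index 2. Rows without an integer count are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # blank, truncated, or "..." row
        try:
            counts[fields[1]] = int(fields[2])
        except ValueError:
            continue  # header row or other non-numeric count
    return counts


# Hypothetical sample mirroring the excerpt in the question.
sample = """\
rank  wordform         abs     r        mod
   1  the           225300    29   223066.9
   2  and           157486    29   156214.4
   3  to            134478    29   134044.8
...
 999  fallen           345    29      326.6
1000  supper           368    27      325.8
"""

word_counts = load_counts(sample.splitlines())
print(word_counts.most_common(3))
```

With a real file, an open file object can be passed directly, for example load_counts(f) inside a with open('counts.txt') as f: block, since iterating a file yields its lines.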

If your data file somehow turned out to be delimited by a consistent character or string, you could use a csv.DictReader object to parse it.

Shown below is how it could be done IF your file were TAB delimited.

The data file (as edited by me to be TAB delimited)

rank    wordform    abs r   mod
1   the 225300  29  223066.9
2   and 157486  29  156214.4
3   to  134478  29  134044.8
999 fallen  345 29  326.6
1000    supper  368 27  325.8

The code

from csv import DictReader
from collections import Counter

with open('counts.txt') as f:
    # The header row supplies the keys for each row dict.
    reader = DictReader(f, delimiter='\t')
    wordCounts = Counter({row['wordform']: int(row['abs']) for row in reader})

print(wordCounts.most_common(3))
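
To try the DictReader approach without creating a file, the following sketch feeds the same TAB-delimited rows through io.StringIO. The in-memory data and the name tab_counts are assumptions for illustration only; with a real file you would pass the open file object as in the code above.

```python
import io
from collections import Counter
from csv import DictReader

# Stand-in for the TAB-delimited counts.txt, kept in memory so the
# example runs on its own.
tab_data = io.StringIO(
    "rank\twordform\tabs\tr\tmod\n"
    "1\tthe\t225300\t29\t223066.9\n"
    "2\tand\t157486\t29\t156214.4\n"
    "3\tto\t134478\t29\t134044.8\n"
    "999\tfallen\t345\t29\t326.6\n"
    "1000\tsupper\t368\t27\t325.8\n"
)

reader = DictReader(tab_data, delimiter="\t")

# Each row is a dict keyed by the header names, so columns are looked up
# by name rather than by position.
tab_counts = Counter({row["wordform"]: int(row["abs"]) for row in reader})
print(tab_counts.most_common(3))
```

Looking columns up by name means the code keeps working if the column order changes, as long as the header names stay the same.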


我想做一个坏孩纸

#3 · 2019-09-10 09:29

import collections

words = {}
with open('counts.txt') as fp:
    next(fp)  # skip the header row; int('abs') would raise ValueError
    for line in fp:
        items = line.split()
        # items[1] is the word, items[2] its absolute count
        words[items[1]] = int(items[2])

wordCounts = collections.Counter(words)
