How to print count of occurrence of some string in

Posted 2019-01-29 07:51

I have a CSV file that contains one column (column1). I want to check whether each cell's value repeats and how many times (occcurance_count), and write that count back to the same CSV file using Python.
In the example below, "241682-27638-USD-OCOF" does not repeat, so its count is 1; "241942-37190-USD-DIV" appears twice, so its count is 2; and so on.

I want the output below, in CSV format:

column1                  ,occcurance_count

1682-27638-USD-OGGCOF ,1

241682-27638-USD-OGGINT ,1

241682-27638-USD-CIGGNT ,1

241682-27638-USD-OCGGINT ,1

241942-37190-USD-GGDIV ,2

241942-37190-USD-CHYOF ,1

241942-37190-USD-EQPL ,1

241942-37190-USD-INT ,1

242066-15343-USD-CYJOF ,3

242066-15343-USD-CYJOF ,3

242066-15343-USD-CYJOF ,3

242066-15343-USD-ETHQPL ,1

242066-15343-USD-INFRT ,1

241942-37190-USD-GGDIV ,2

242066-33492-USD-CJHOF ,1

4 answers
手持菜刀,她持情操
#2 · 2019-01-29 08:20

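You could use Counter from collections:

>>> from collections import Counter
>>> # values is an already-open file object with one string per line
>>> counter = Counter(line.strip() for line in values)

>>> counter['242066-15343-USD-CYJOF']
3

>>> counter['241682-27638-USD-OGGINT']
1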
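To get from the Counter to the per-row layout the question asks for, one option is to tally once and then walk the values again in their original order. This is only a sketch: `with_counts` is a made-up helper name, and it assumes the values are already in memory as a list of strings.

```python
from collections import Counter


def with_counts(values):
    """Pair each value with the number of times it appears in `values`."""
    counts = Counter(values)  # first pass: tally every value
    # Second pass: keep the input order, duplicates included.
    return [(v, counts[v]) for v in values]


rows = with_counts([
    "242066-15343-USD-CYJOF",
    "241942-37190-USD-INT",
    "242066-15343-USD-CYJOF",
])
for value, count in rows:
    print("{},{}".format(value, count))
```

Each repeated value shows up once per occurrence, carrying the same total every time, which matches the expected output in the question.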
做个烂人
#3 · 2019-01-29 08:26

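Because the count is repeated on every row for the same key, a plain dict is all you need:

d = {}
with open(infile) as f:  # infile: path to the CSV
    next(f)  # skip the header row
    for line in f:
        spl = line.rstrip().split(",")
        if len(spl) < 2:
            continue  # skip blank or malformed lines
        d[spl[0].strip()] = spl[1].strip()  # map value -> count

for k, v in d.items():
    print("key = {} count = {}".format(k, v))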

If the file you posted is actually the expected output, and your input has a single string on each line that you want to count and then write out together with its count:

from collections import Counter

d = Counter()
with open("i.csv") as f, open("out.csv","w") as out:
    for line in f:
        d.update([line.rstrip()])  # tally each line
    f.seek(0)  # go back to the start of the file
    out.write("column1, occcurance_count\n")
    for line in f:
        out.write("{}, {}\n".format(line.rstrip(), d[line.rstrip()]))  # write line plus count of that line
虎瘦雄心在
#4 · 2019-01-29 08:37

I think the code below is what you are looking for. The logic is simple, if a little lengthy: first open the CSV file for reading and collect all elements in a list; then use the list's count method to find the number of occurrences of each item; finally open a new CSV file and write each item with its count.

There is surely a more efficient way to do this, but here is code that was quick to write.

    import csv
    import sys

    try:
        fr = open("mycsv.csv", newline="")
        fw = open("mscsv_counter.csv", "w", newline="")
    except OSError:
        print("Couldn't open the file")
        sys.exit(1)

    # Collect the first cell of every non-empty row.
    reader = csv.reader(fr)
    counterlist = []
    for row in reader:
        if len(row) > 0:
            counterlist.append(row[0])

    # Write each item together with how often it appears in the list.
    writer = csv.writer(fw)
    writer.writerow(["column 1", "counter"])
    for item in counterlist:
        writer.writerow([item, counterlist.count(item)])

    fr.close()
    fw.close()
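Calling list.count for every row rescans the whole list each time, so on a large file it may be worth tallying once with collections.Counter instead. Below is a rough sketch of that idea which also writes back to the same file, as the question asks. The function name `add_counts_in_place` is hypothetical, and the code assumes the file fits in memory and that any existing header row is named "column1".

```python
import csv
from collections import Counter


def add_counts_in_place(path, header=("column1", "occcurance_count")):
    """Rewrite the CSV at `path` so every row carries its occurrence count."""
    with open(path, newline="") as f:
        # Keep the first cell of every non-empty row, stripped of padding.
        values = [row[0].strip() for row in csv.reader(f) if row]
    if values and values[0] == header[0]:
        values = values[1:]  # drop an existing header row (assumed name)
    counts = Counter(values)  # single pass over the data
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows((v, counts[v]) for v in values)
```

Usage would be something like `add_counts_in_place("mycsv.csv")`. Since the file is overwritten, it is probably wise to try it on a copy first.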
祖国的老花朵
#5 · 2019-01-29 08:40

Here is some simple code. Hope it helps:

>>> import numpy as np
>>> data=np.loadtxt('a.csv', dtype=str)
>>> data
array(['241682-27638-USD-OCOF', '241682-27638-USD-OINT',
       '241682-27638-USD-CINT', '241682-27638-USD-OCINT',
       '241942-37190-USD-DIV', '241942-37190-USD-COF',
       '241942-37190-USD-EQPL', '241942-37190-USD-INT',
       '242066-15343-USD-COF', '242066-15343-USD-COF',
       '242066-15343-USD-COF', '242066-15343-USD-EQPL',
       '242066-15343-USD-INT', '241942-37190-USD-DIV',
       '242066-33492-USD-COF'], 
      dtype='|S22')
>>> count = [len(np.where(data==i)[0]) for i in data]
>>> count
[1, 1, 1, 1, 2, 1, 1, 1, 3, 3, 3, 1, 1, 2, 1]
>>> fp=open('a.csv','w')
>>> for i in range(data.shape[0]):
...     fp.write(str(data[i]) + ' , ' + str(count[i]) + '\n')
...
>>> fp.close()