I need to sum the "value" column for each distinct value of col1 in File1 and export the result to an output file. I'm new to Python and need to do this for thousands of records.
File1
col1 col2 value
559 1 91987224 2400000000
559 0 91987224 100000000
558 0 91987224 100000000
557 2 87978332 500000000
557 1 59966218 2400000000
557 0 64064811 100000000
Desired Output:
col1 Sum
559 2500000000
558 100000000
557 3000000000
Thanks in advance.
P.S.: I can't use the pandas library due to permission issues. I tried the following code and am sharing it with the traceback:
import csv
fin = open("File1.txt","r")
list_txid = {}
num_tx = {}
amount_tx = {}
for line in fin:
    line = line.rstrip()
    f = line.split("\t")
    txid = f[0]
    amount = int(f[3])
fin.close()
for txid in list_txid:
    num_tx[txid] += 1
    amount_tx[txid] += amount
    print("{0}\t{1:d}\t{2:d}".format(txid, amount_tx[txid]))
Traceback:
Traceback (most recent call last):
  File "C:\Users....\sum.py", line 14, in <module>
    amount = int(f[3])
IndexError: list index out of range
You can use pandas for this: use read_csv to create a DataFrame, then groupby by index with level=0 and aggregate with sum. Finally, export with to_csv.
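Since the question rules out pandas, the same read, group, sum, and export steps can be sketched with the standard library alone. The IndexError most likely comes from split("\t") being applied to a line that is separated by spaces, or to a blank line, so the list has fewer than four items. The sketch below is only one possible approach: it assumes the fields are separated by any whitespace, that the amount is the last field on each row, and that the header row can be recognised because its last field is not a number. The function name sum_by_first_column and the output file name are made up for illustration.

```python
from collections import defaultdict


def sum_by_first_column(in_path, out_path):
    """Sum the last field of each row, grouped by the first field."""
    totals = defaultdict(int)  # col1 value -> running sum of the amounts
    with open(in_path, "r") as fin:
        for line in fin:
            # split() with no argument splits on any run of spaces or tabs
            fields = line.split()
            # Skip blank lines, short lines, and the header row
            # (assumption: real amounts consist only of digits)
            if len(fields) < 2 or not fields[-1].isdigit():
                continue
            totals[fields[0]] += int(fields[-1])
    with open(out_path, "w") as fout:
        fout.write("col1\tSum\n")
        # dicts keep insertion order in Python 3.7+, so keys appear
        # in the order they were first seen in the input
        for key, total in totals.items():
            fout.write("{0}\t{1:d}\n".format(key, total))
    return dict(totals)
```

It could be called as sum_by_first_column("File1.txt", "output.txt"), where "output.txt" is a placeholder name. The file is read line by line, so only the per-key totals are held in memory.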