I am parsing huge CSV files (approx. 2 GB) with the help of this great stuff. Now I have to generate a dynamic file for each column, using the column name as the file name. So I wrote this code to write the dynamic files:
def write_CSV_dynamically(self, header, reader):
    """
    :header - the CSV's first row, as a comma-separated string
    :reader - all remaining CSV rows, as a list of comma-separated strings
    """
    try:
        headerlist = header.split(',')  # column names from the header string
        zipof = lambda x, y: zip(x.split(','), y.split(','))
        filename = "{}.csv".format(self.dtstamp)
        filename = "{}_" + filename  # becomes "<column>_<dtstamp>.csv"
        # one open file handle per column, keyed by the generated file name
        filesdct = {filename.format(k.strip()): open(filename.format(k.strip()), 'a')
                    for k in headerlist}
        for row in reader:
            # pair each cell with its column name and append it to that column's file
            for key, data in zipof(header, row):
                filesdct[filename.format(key.strip())].write(str(data) + "\n")
        for _, v in filesdct.iteritems():
            v.close()
    except Exception, e:
        print e
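For context, this is roughly how the function gets called (a simplified sketch; parser stands in for an instance of my class, and the input file name is a placeholder):

# illustrative only: parser is an instance of the class that defines
# write_CSV_dynamically; "huge_input.csv" is a placeholder file name
with open("huge_input.csv") as f:
    lines = f.read().splitlines()   # rows stay as comma-separated strings
header, rows = lines[0], lines[1:]
parser.write_CSV_dynamically(header, rows)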
Right now it takes around 50 seconds to write these huge files, and it uses 100% of the CPU. Since other heavy jobs are running on the same server, I want my program to use only 10-20% of the CPU while writing these files, even if that means it takes 10-15 minutes. How can I change my code so that it limits itself to 10-20% CPU usage?
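The only workaround I have come up with so far is to lower the process priority and sleep between batches of rows. This is just a rough sketch of the inner loop from the code above, assuming a POSIX server; BATCH and the sleep interval are numbers I picked arbitrarily:

import os
import time

os.nice(19)      # POSIX only: raise the niceness so the scheduler deprioritizes us
BATCH = 10000    # arbitrary batch size, not tuned

for i, row in enumerate(reader):
    for key, data in zipof(header, row):
        filesdct[filename.format(key.strip())].write(str(data) + "\n")
    if i % BATCH == 0:
        time.sleep(0.05)   # yield the CPU briefly between batches

Is this a sane way to throttle, or is there a better approach?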