I am attempting to merge a number of CSV files. My initial function aims to:
- Look inside a directory and count the number of files within (assume all are .csv)
- Open the first CSV and append each row into a list
- Clip the top three rows (there's some useless column title info I don't want)
- Store these results in a list I've called 'archive'
- Open the next CSV file and repeat (clip and append its rows to 'archive')
- When we're out of CSV files, write the complete 'archive' to a file in a separate folder.
So, for instance, if I were to start with three CSV files that look something like this:
CSV 1
[]
[['Title'],['Date'],['etc']]
[]
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
CSV 2
[]
[['Title'],['Date'],['etc']]
[]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
CSV 3
[]
[['Title'],['Date'],['etc']]
[]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
At the end of which I'd hope to get something like...
CSV OUTPUT
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/01/2013'],['Spinach has lots of iron']]
[['Melon'],['02/06/2013'],['Not a big fan of melon']]
So... I set about writing this:
import os
import csv

path = './Path/further/into/file/structure'
directory_list = os.listdir(path)
directory_list.sort()
archive = []

for file_name in directory_list:
    temp_storage = []
    path_to = path + '/' + file_name
    file_data = open(path_to, 'r')
    file_CSV = csv.reader(file_data)
    for row in file_CSV:
        temp_storage.append(row)
    for row in temp_storage[3:-1]:
        archive.append(row)

archive_file = open("./Path/elsewhere/in/file/structure/archive.csv", 'wb')
wr = csv.writer(archive_file)
for row in range(len(archive)):
    lastrow = row
    wr.writerow(archive[row])
    print row
This seems to work... except when I check my output file, it seems to have stopped writing at a strange point near the end, e.g.:
[['Spam'],['01/01/2013'],['Spam is the spammiest spam']]
[['Ham'],['01/04/2013'],['ham is ok']]
[['Lamb'],['04/01/2013'],['Welsh like lamb']]
[['Sam'],['01/12/2013'],["Sam doesn't taste as good and the last three"]]
[['Dolphin'],['01/01/2013'],['People might get angry if you eat it']]
[['Bear'],['01/04/2013'],['Best of Luck']]
[['Spinach'],['04/0
It's really weird, I can't work out what's gone wrong. It seemed to be writing fine but then decided to stop halfway through a list entry. Tracing it back, I'm sure that this has something to do with my last write "for loop", but I'm not too familiar with the csv methods. I have had a read through the documentation and am still stumped.
Can anyone point out where I've gone wrong, how I might fix it, and perhaps whether there is a better way of going about all this?
Many Thanks -Huw
Close the filehandle before the script ends. Closing the filehandle will also flush any strings waiting to be written. If you don't flush and the script ends, some output may never get written.
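As a minimal sketch of that advice, the writer loop can be followed by an explicit close(). This assumes Python 3, where CSV files are opened in text mode with newline='' rather than the 'wb' mode used in the question. The sample rows and the temporary output path are hypothetical stand-ins, not the asker's data.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for the question's 'archive' list.
archive = [
    ['Spam', '01/01/2013', 'Spam is the spammiest spam'],
    ['Ham', '01/04/2013', 'ham is ok'],
]

# Hypothetical output location; a temporary directory keeps the sketch self-contained.
out_path = os.path.join(tempfile.mkdtemp(), 'archive.csv')

archive_file = open(out_path, 'w', newline='')
try:
    wr = csv.writer(archive_file)
    for row in archive:
        wr.writerow(row)
finally:
    # close() flushes anything still held in the write buffer to disk.
    archive_file.close()

# Read the file back to confirm that every row was written.
check = open(out_path, newline='')
written_rows = list(csv.reader(check))
check.close()
```

The try/finally is there so the file is closed even if writerow raises partway through.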
Using the `with open(...) as f` syntax is useful because it will close the file for you when Python leaves the `with`-suite. With `with`, you'll never omit closing a file again.
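Here is one way the whole merge might look with `with`, offered as a sketch rather than a drop-in replacement. It assumes Python 3 (text mode with newline='' instead of 'wb'), and the function name merge_csv_files and its parameters are hypothetical. It keeps every row after the first three, whereas the question's slice [3:-1] also drops the last row of each file, so adjust the slice if your files end with a row you want discarded.

```python
import csv
import os


def merge_csv_files(source_dir, archive_path, skip_rows=3):
    """Collect the data rows of every file in source_dir and write them to archive_path."""
    archive = []
    for file_name in sorted(os.listdir(source_dir)):
        path_to = os.path.join(source_dir, file_name)
        # The input file is closed as soon as the with-suite ends.
        with open(path_to, newline='') as file_data:
            rows = list(csv.reader(file_data))
        # Drop the leading title rows and keep everything after them.
        archive.extend(rows[skip_rows:])

    # Leaving this with-suite closes the output file, which flushes all rows to disk.
    with open(archive_path, 'w', newline='') as archive_file:
        csv.writer(archive_file).writerows(archive)
    return archive
```

With the paths from the question, the call would be merge_csv_files('./Path/further/into/file/structure', './Path/elsewhere/in/file/structure/archive.csv').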