So I have a few CSV files in the following format:
person,age,nationality,language
Jack,18,Canadian,English
Rahul,25,Indian,Hindi
Mark,50,American,English
Kyou, 21, Japanese, English
I need to import that, and return that data as a dictionary, with the keys as the column headings in the first row, and all the data in each column as values for that specific key. For example:
dict = {
'person': ['Jack', 'Rahul', 'Mark', 'Kyou'],
'age': [18, 25, 50, 21],
'nationality': ['Canadian', 'Indian', 'American', 'Japanese'],
'language': ['English', 'Hindi', 'English', 'English']
}
Any idea how I would begin this code and make it so that the code would work for any number of columns given in a .csv file?
Here is a fairly straightforward solution that uses the Python csv module (docs here: http://docs.python.org/2/library/csv.html). Just replace 'csv_data.csv' with the name of your CSV file.
import csv
with open('csv_data.csv') as csv_data:
    reader = csv.reader(csv_data)
    # eliminate blank rows if they exist
    rows = [row for row in reader if row]
    headings = rows[0]  # get headings
    person_info = {}
    for row in rows[1:]:
        # append the data item to the end of the dictionary entry
        # set the default value of [] if this key has not been seen
        for col_header, data_column in zip(headings, row):
            person_info.setdefault(col_header, []).append(data_column)
print(person_info)
I'd go for something like:
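import csv
with open('input') as fin:
    csvin = csv.reader(fin)
    # first row holds the column headings; [] if the file is empty
    header = next(csvin, [])
    # zip(*csvin) transposes the remaining rows into columns
    print(dict(zip(header, zip(*csvin))))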
# {'person': ('Jack', 'Rahul', 'Mark', 'Kyou'), 'age': ('18', '25', '50', ' 21'), 'language': ('English', 'Hindi', 'English', ' English'), 'nationality': ('Canadian', 'Indian', 'American', ' Japanese')}
Adapt accordingly.
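One possible adaptation, so that the result looks like the dictionary asked for in the question (lists rather than tuples, stray spaces removed, ages as integers), is sketched below. It assumes Python 3. The helper names `convert` and `columns_from_csv` are made up for this illustration, and an in-memory `io.StringIO` stands in for the real file.

```python
import csv
import io


def convert(value):
    """Strip surrounding whitespace; return an int if the text looks like one."""
    value = value.strip()
    try:
        return int(value)
    except ValueError:
        return value


def columns_from_csv(fileobj):
    """Return {heading: [column values]} for an open CSV file object."""
    reader = csv.reader(fileobj)
    header = [name.strip() for name in next(reader, [])]
    # Drop blank rows, then transpose rows into columns with zip(*rows).
    rows = [row for row in reader if row]
    return {
        name: [convert(v) for v in column]
        for name, column in zip(header, zip(*rows))
    }


# Sample data from the question; a stand-in for open('input').
sample = io.StringIO(
    "person,age,nationality,language\n"
    "Jack,18,Canadian,English\n"
    "Rahul,25,Indian,Hindi\n"
    "Mark,50,American,English\n"
    "Kyou, 21, Japanese, English\n"
)
result = columns_from_csv(sample)
print(result)
```

Because `zip` stops at the shortest row, a row with missing fields would shorten every column; a file with a header but no data rows gives an empty dictionary.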
Using the csv module, I would do it this way:
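import csv
# Python 3: open in text mode with newline=''.
# On Python 2, use open('somefile.csv', 'rb') and linedict.iteritems() instead.
with open('somefile.csv', newline='') as input_file:
    reader = csv.DictReader(input_file)
    results = {}
    for linedict in reader:
        # each linedict maps column heading -> value for one row
        for (key, value) in linedict.items():
            results.setdefault(key, []).append(value)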
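The sample file has a space after some commas (the `Kyou` row), and the loop above keeps that space as part of each value. A minimal sketch of one way to drop it, assuming Python 3, uses the csv module's `skipinitialspace` option. Here `io.StringIO` stands in for the real file, and `collections.defaultdict` is swapped in for `setdefault` purely as a matter of taste.

```python
import csv
import io
from collections import defaultdict

# Two rows of the question's data; a stand-in for the real file.
sample = io.StringIO(
    "person,age,nationality,language\n"
    "Jack,18,Canadian,English\n"
    "Kyou, 21, Japanese, English\n"
)

# skipinitialspace=True ignores whitespace immediately after each delimiter.
reader = csv.DictReader(sample, skipinitialspace=True)

results = defaultdict(list)
for linedict in reader:
    for key, value in linedict.items():
        results[key].append(value)

results = dict(results)  # plain dict for printing or comparison
print(results)
```

The values are still strings; converting `age` to integers would be a separate step.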
You could use zipping combined with slicing in a dict comprehension, once you've gotten the data into a list of lists with the csv module.
{col[0] : col[1:] for col in zip(*rows)}
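For completeness, here is a sketch of how `rows` might be built before applying the comprehension. It assumes Python 3 and uses the question's sample data held in an `io.StringIO` as a stand-in for the real file; the name `by_column` is just for this example. Since `zip` yields tuples, `list(...)` is added to get lists as in the question.

```python
import csv
import io

# Sample data from the question; a stand-in for the real file.
sample = io.StringIO(
    "person,age,nationality,language\n"
    "Jack,18,Canadian,English\n"
    "Rahul,25,Indian,Hindi\n"
    "Mark,50,American,English\n"
    "Kyou, 21, Japanese, English\n"
)

# List of lists, one per non-blank row; the first row is the header.
rows = [row for row in csv.reader(sample) if row]

# zip(*rows) transposes rows into columns; col[0] is the heading,
# col[1:] is the data for that column.
by_column = {col[0]: list(col[1:]) for col in zip(*rows)}
print(by_column)
```

The values are left exactly as read, so the last row's entries keep their leading space and every value is a string.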