I have a CSV file in which each row has the columns Location, MovieDate, Formatted_Address, Lat, and Lng. I have been told to use OrderedDict if I want to group by Location and append all the MovieDate values that share the same Location value.
Example data:
Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
"Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
For every row that has the same location (as in the example above), I'd like to produce output like this, so that there are no duplicate locations:
"Edgebrook Park, Chicago ","Jun-7 A League of Their Own Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
What's wrong with my code using OrderedDict to do this?
from collections import OrderedDict
import csv

od = OrderedDict()
with open("MovieDictFormatted.csv") as f, open("MoviesCombined.csv", "w") as out:
    r = csv.reader(f)
    wr = csv.writer(out)
    header = next(r)
    for row in r:
        loc, rest = row[0], row[1]
        od.setdefault(loc, []).append(rest)
    wr.writerow(header)
    for loc, vals in od.items():
        wr.writerow([loc] + vals)
What I end up with is something like this:
['Edgebrook Park, Chicago ', 'Jun-7 A League of Their Own']
['Gage Park, Chicago ', "Jun-9 It's a Mad, Mad, Mad, Mad World"]
['Jefferson Memorial Park, Chicago ', 'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']
['Commercial Club Playground, Chicago ', 'Jun-12 Despicable Me 2']
The issue is that the other columns don't show up in this case. How would I best include them? I would also prefer to make the MovieDate values one long string, like this:
'Jun-12 Monsters University Jul-11 Frozen Aug-8 The Blues Brothers '
instead of:
'Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers '
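One way to get a single string out of the list is str.join. The snippet below is only a sketch using the values shown above. Stripping each value before joining is an assumption about what is wanted: it avoids doubled spaces when some values carry a trailing space and others do not, and it also drops the trailing space at the end.

```python
# Values copied from the example output above.
vals = ['Jun-12 Monsters University ', 'Jul-11 Frozen ', 'Aug-8 The Blues Brothers ']

# Strip each value, then join with a single space between entries.
joined = " ".join(v.strip() for v in vals)
print(joined)
```

If the trailing space has to be preserved exactly, joining the unstripped values with an empty separator would do that for this particular list.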
Thanks guys, appreciate it. I'm a Python noob.
Changing row[0], row[1] to row[0], row[1:] unfortunately doesn't give me what I want. I only want to combine the values in the second column (MovieDate), not replicate all the other columns like this:
['Jefferson Memorial Park, Chicago ', ['Jun-12 Monsters University ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Jul-11 Frozen ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353'], ['Aug-8 The Blues Brothers ', 'Jefferson Memorial Park, 4822 North Long Avenue, Chicago, IL 60630, USA', '41.76083920000001', '-87.6294353']]
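A possible approach, sketched here rather than offered as the definitive fix, is to store two things per location: the list of MovieDate values and the remaining columns from the first row seen for that location. The helper name combine_by_location is made up for this illustration, and the sketch assumes Python 3, that every row has at least two columns, and that Formatted_Address, Lat, and Lng are identical for all rows sharing a Location. The sample data is the two Edgebrook Park rows from the question, read from an in-memory string instead of the real file.

```python
import csv
import io
from collections import OrderedDict


def combine_by_location(rows):
    """Group rows by the first column and merge the second column.

    Hypothetical helper: each row is [Location, MovieDate, *other columns].
    The other columns are taken from the first row seen for each location.
    """
    grouped = OrderedDict()
    for loc, movie, *rest in rows:
        if loc not in grouped:
            # First time this location appears: remember its other columns.
            grouped[loc] = ([], rest)
        grouped[loc][0].append(movie.strip())
    # One output row per location, MovieDate values joined into one string.
    return [[loc, " ".join(movies)] + rest
            for loc, (movies, rest) in grouped.items()]


# In-memory stand-in for MovieDictFormatted.csv, using rows from the question.
sample = io.StringIO("""\
Location,MovieDate,Formatted_Address,Lat,Lng
"Edgebrook Park, Chicago ",Jun-7 A League of Their Own,"Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
"Edgebrook Park, Chicago ","Jun-9 It's a Mad, Mad, Mad, Mad World","Edgebrook Park, 6525 North Hiawatha Avenue, Chicago, IL 60646, USA",41.9998876,-87.7627672
""")

reader = csv.reader(sample)
header = next(reader)
combined = combine_by_location(reader)

# Write the header and the combined rows; a real file opened with
# open("MoviesCombined.csv", "w", newline="") could replace the StringIO.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(combined)
print(out.getvalue())
```

Keeping the other columns in a separate slot, instead of appending row[1:] for every row, is what avoids repeating the address and coordinates for each movie.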