Question:
I am trying to combine Google's Geocoding API with the GitHub API to parse users' locations and build a list from them.
The array (list) I want to create is like this:
location, lat, lon, count
San Francisco, x, y, 4
Mumbai, x1, y1, 5
Here location, lat and lon are parsed from Google's geocoding response, and count is the number of occurrences of that location. Every time a new location is added: if it already exists in the list, its count is incremented; otherwise it is appended to the array (list) with its location, lat and lon, and a count of 1.
Another example:
location, lat, lon, count
Miami, x2, y2, 1 #first occurrence
San Francisco, x, y, 4 #occurred 4 times already
Mumbai, x1, y1, 5 #occurred 5 times already
Cairo, x3, y3, 1 #first occurrence
I can already get the user's location from GitHub and the geocoded data from Google. I just need to create this array in Python, which I'm struggling with.
Can anyone help me? Thanks.
Answer 1:
With collections.Counter, you could do:
from collections import Counter
# initial values
c = Counter({("Mumbai", 1, 2): 5, ("San Francisco", 3, 4): 4})
# adding entries
c.update([("Mumbai", 1, 2)])
print(c)  # Counter({('Mumbai', 1, 2): 6, ('San Francisco', 3, 4): 4})
c.update([("Mumbai", 1, 2), ("San Diego", 5, 6)])
print(c)  # Counter({('Mumbai', 1, 2): 7, ('San Francisco', 3, 4): 4, ('San Diego', 5, 6): 1})
Answer 2:
This would be better stored as a dictionary, indexed by city name. You could store it as two dictionaries, one dictionary of tuples for latitude/longitude (since lat/long never changes):
lat_long_dict = {}
lat_long_dict["San Francisco"] = (x, y)
lat_long_dict["Mumbai"] = (x1, y1)
And a collections.defaultdict for the count, so that it always starts at 0:
import collections
city_counts = collections.defaultdict(int)
city_counts["San Francisco"] += 1
city_counts["Mumbai"] += 1
city_counts["San Francisco"] += 1
# city_counts would be (as shown by Python 3)
# defaultdict(<class 'int'>, {'San Francisco': 2, 'Mumbai': 1})
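The two dictionaries can be kept in step with a small helper and then joined into the rows the question asks for. This is a minimal sketch: add_location and rows are hypothetical names that do not appear in the answer, and the coordinates are rough illustrative values rather than geocoder output.

```python
import collections

lat_long_dict = {}
city_counts = collections.defaultdict(int)


def add_location(city, lat, lon):
    """Record one occurrence of a city (hypothetical helper)."""
    # setdefault keeps the coordinates from the first time the city was seen.
    lat_long_dict.setdefault(city, (lat, lon))
    city_counts[city] += 1


# Illustrative coordinates only.
add_location("San Francisco", 37.77, -122.42)
add_location("Mumbai", 19.08, 72.88)
add_location("San Francisco", 37.77, -122.42)

# Join the two dictionaries into (location, lat, lon, count) rows.
rows = [(city, lat, lon, city_counts[city])
        for city, (lat, lon) in lat_long_dict.items()]
print(rows)
```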
Answer 3:
Python has a pre-baked class specifically for counting occurrences of things: it's called collections.Counter. If you can generate an iterator that yields successive (city, lat, lon) tuples from your input data (perhaps with a generator expression), simply passing that into Counter will directly give you what you're looking for. For example:
>>> from collections import Counter
>>> locations = [('Miami', 1, 1), ('San Francisco', 2, 2), ('Mumbai', 3, 3), ('Miami', 1, 1), ('Miami', 1, 1)]
>>> Counter(locations)
Counter({('Miami', 1, 1): 3, ('San Francisco', 2, 2): 1, ('Mumbai', 3, 3): 1})
If you need to be able to add more locations as the program runs instead of batching them, pass the relevant tuples to that Counter's update method.
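Both ideas, the generator expression and the incremental update, might look like the sketch below. FAKE_GEOCODE and user_locations are made-up stand-ins for the geocoding results and the GitHub data, not real API calls.

```python
from collections import Counter

# Hypothetical stand-in for the geocoding step: location name -> (lat, lon).
FAKE_GEOCODE = {"Miami": (1, 1), "San Francisco": (2, 2), "Mumbai": (3, 3)}

# Hypothetical location strings as they might come from user profiles.
user_locations = ["Miami", "San Francisco", "Miami"]

# The generator expression yields one (city, lat, lon) tuple per user.
counter = Counter((name, *FAKE_GEOCODE[name]) for name in user_locations)

# Later additions: update() expects an iterable of items,
# so even a single tuple has to be wrapped in a list.
counter.update([("Mumbai", 3, 3)])
counter.update([("Miami", 1, 1)])

print(counter)
```

Calling counter.update(("Mumbai", 3, 3)) without the surrounding list would count "Mumbai", 3 and 3 as three separate items, which is an easy mistake to make.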
Answer 4:
This is sort of an amalgamation of all the other recommended ideas:
from collections import defaultdict
inputdata = [('Miami', 'x2', 'y2'),
             ('San Francisco', 'x', 'y'),
             ('San Francisco', 'x4', 'y4'),
             ('Mumbai', 'x1', 'y1'),
             ('Cairo', 'x3', 'y3')]
# counts maps city -> number of occurrences; coords maps city -> list of (lat, lon) pairs
counts, coords = defaultdict(int), defaultdict(list)
for location, lat, lon in inputdata:
    coords[location].append((lat, lon))
    counts[location] += 1
print(counts)
print(coords)
This uses defaultdict, which, as you can see, provides an easy way to both:
- count the number of occurrences by city
- keep lat/lon pairs intact
Output (as shown by Python 3):
defaultdict(<class 'int'>, {'Miami': 1, 'San Francisco': 2, 'Mumbai': 1, 'Cairo': 1})
defaultdict(<class 'list'>, {'Miami': [('x2', 'y2')], 'San Francisco': [('x', 'y'), ('x4', 'y4')], 'Mumbai': [('x1', 'y1')], 'Cairo': [('x3', 'y3')]})
This answer assumes (without verification) that exact lat/lon pairs are unlikely to repeat, and that you are really only interested in counts by city.
Answer 5:
How about using a Python dict? You can read about them here:
http://docs.python.org/2/tutorial/datastructures.html#dictionaries
Here is a sample implementation:
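# Create an empty dictionary.
dat = {}
location = "San Francisco"  # example value; in practice this comes from your data
# Increment the count if the location is already present, otherwise start at 1.
if location in dat:
    dat[location] = dat[location] + 1
else:
    dat[location] = 1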
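The same check can be extended so that a single dict holds the coordinates along with the count, which gets close to the table in the question. This is a sketch under assumptions: add_location, table and the numeric coordinates are hypothetical and used for illustration only.

```python
def add_location(table, location, lat, lon):
    """Increment the count for a known location, or add it with a count of 1."""
    if location in table:
        table[location][2] += 1  # index 2 holds the count
    else:
        table[location] = [lat, lon, 1]


table = {}
# Placeholder input; real values would come from the geocoding step.
for name, lat, lon in [("Miami", 1, 1), ("Cairo", 3, 3), ("Miami", 1, 1)]:
    add_location(table, name, lat, lon)

# Turn the dict into [location, lat, lon, count] rows.
rows = [[name] + values for name, values in table.items()]
print(rows)
```

Keying on the location name alone means the coordinates stored are those of the first occurrence; later coordinates for the same name are ignored.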