Foursquare API: Getting an exhaustive list of venu

2019-03-14 03:34发布

I'm using Foursquare API to get a list of venues of a certain category. One important requirement is that the list is exhaustive, i.e. includes all relevant points. The v2/venues/search API endpoint enforces a limit of 50 venues on the output. So the first idea that comes to mind is splitting the area into several sections (using "sw" and "ne" params) and then combining the results.

Clearly, the density of points will vary dramatically depending on location, so we'll need to use some kind of adaptive algorithm to flexibly adjust the size of the search window so that it contains all points. Also, there's an increased risk of running into the rate limit, so we might need the algorithm to stop when it's used up its quota of requests.

Finally, it seems that the only way to tell if a search window should be shrunk even further is to count the number of points in the result: if we have less than 50, then we've got a complete list for this section and can move on to the next one; otherwise, we should split it further. It seems to be wasteful as we'll be throwing away the intermediate results (i.e. all results in our search tree except for the leaves).

So here are some questions that I have:

  • Is it the best way to put together an exhaustive list? Maybe I'm missing some API functionality?
  • Is there any specific algorithm you'd use in this case?
  • How would you go about reducing the number of results that have to be thrown away?

Thanks in advance!

2条回答
爷的心禁止访问
2楼-- · 2019-03-14 04:14

An important disclaimer would be that foursquare does not like it when you perform a lot of searches in the same area.

Having said that, you should look into experimenting with categoryId filter in the venue search api. Most of the data on foursquare is food (restaurants) and nightlife related.

So if you exclude these (by including others, no way to exclude) you can search on a larger area and still get below 50 results.

Never really tried using such an algorithm because the categoryId filtering worked good enough, but in theory, the algorithm is simple, each lat/lng 0.001 is ~111 meters.

Search using a small radius (~200 for large metropolitan areas) and triangulate (scan) areas.

What got us to originally perform a lot of searches (and later stop doing so) is that sometimes foursquare filter out results without asking you (for me, it looks like bugs, for them its part of the algorithm). So for example I would search on a 50 meter radius, find the place I want (I know what I am searching for), expand to 500 meters, not find it (and get less than 50 results - so it was not dropped out because I hit the cap, it was dropped out because ???), move my search location ~300 meters north, find it -> sporadic behavior.

My point is (and the reason for why we stopped making a lot of searches and changed our approach), what you are trying to achieve, 'complete coverage' is very hard to do given the current API and the current usage policy, and -> it is not important really. After a few months of playing with it, we figured out that we should query foursqaure for what our users are looking for and require at this moment, we cache the results - over time we will have a complete coverage, maybe at start we will miss a few spots, but for the long run its not really important.

查看更多
狗以群分
3楼-- · 2019-03-14 04:15

Hopefully this is not what you're doing, but as a friendly reminder: scraping foursquare's website and/or API is very much prohibited by its terms of service.

查看更多
登录 后发表回答