Large MySQL DB (21MM records) with location data -

2019-08-07 06:41发布

问题:

We have a large location DB - with lat long specified for each row. The DB is hosted in MySQL.

We need to run two type of queries:

  • places nearby (sort by distance)
  • places nearby by category (where category is a column)

With the number of records growing, this query seems to slow down drastically.

SELECT *, ( 3959 * acos( cos( radians(40.759105) ) * cos( radians( Latitude ) ) * cos( radians( longitude) - radians(-73.984654) ) + sin( radians(40.759105) ) * sin( radians( Latitude ) ) ) ) as distance FROM mcw_in WHERE Latitude <> '' ORDER BY distance LIMIT 0,20

How can I create an index in MySQL to address the slowness? Is there any other solution - like using any geospatial data types?

回答1:

MySQL Manual :: Introduction to MySQL Spatial Support

MySQL Manual :: Creating Spatial Indexes



回答2:

But really this won't work in MySQL since they haven't really implemented the functions.

If you are open to it, I would reccomend using PostGIS or Spatialiate (running on Postgresql and SQLLite respectively) or even mongodb or geocouch. These have a much larger suite of implemented spatial functions. If you look at the MySQL documentation it mostly says "not implemented" for the spatial functions.



回答3:

It is better to use range queries by defining a bounding box surrounding the center. The following query searches the nearest 20 locations within distance $dist from the center ($lat0, $lng0), with the result sorted by the distance. You need two indexes, one on 'lat' and one on 'lng'. Some explanations can be found here.

SELECT *,
    ( 6371 * acos(
    cos(radians($lat0)) * cos(radians(lat)) * cos(radians(lng) - radians($lng0)) +
    sin(radians($lat0)) * sin(radians(lat))
    ) ) AS distance
FROM `locations`
WHERE lat < degrees( asin( sin(radians($lat0)) * cos($dist / 6371) +
        cos(radians($lat0)) * sin($dist / 6371) * cos(radians(0)) ))
  AND lat > degrees( asin( sin(radians($lat0)) * cos($dist / 6371) +
        cos(radians($lat0)) * sin($dist / 6371) * cos(radians(180)) ))
  AND lng < $lng0 - degrees( atan2(sin(radians(90)) * sin(radians($dist / 6371)) * cos(radians($lat0)),
        cos(radians($dist / 6371)) - sin(radians($lat0)) * sin(radians($lat0))) )
  AND lng > $lng0 + degrees( atan2(sin(radians(90)) * sin(radians($dist / 6371)) * cos(radians($lat0)),
        cos(radians($dist / 6371)) - sin(radians($lat0)) * sin(radians($lat0))) )
ORDER BY distance LIMIT 20;