How to optimize SQL query with calculating distanc

2019-02-15 00:08发布

问题:

I have a table with structure like that:

table name: shop

id_shop      int(10)
name         varchar(200)
latitude     double
longitude    double

And I'd like to calculate distance between given coordinates and coordinates saved in database.

My current query:

SELECT *
  FROM `shop` AS `s`
 WHERE
      (
        ( 6371
        * ACOS(
            SIN( RADIANS( latitude ) )
          * SIN( RADIANS( 53.5353010379 ) )
          + COS( RADIANS( latitude ) )
          * COS( RADIANS( 53.5353010379 ) )
          * COS( RADIANS( 14.7984442616 ) - RADIANS( longitude ) )
          )
        )
        <= 25
      )

plus some JOIN LEFT's for some data.

Is there any way to optimize that query ? With joins it takes about 13msec.

I need to add here also some LIMIT and COUNT(*) for total amount of shops for pagination.

回答1:

Here's a few ideas, some of which may not apply depending on your exact situation.

  1. You could do the conversion of latitude and longitude to radians and store that in the row as well. This would save the cost of those calculations (actually the cost would be accrued once when storing the data).
  2. If your table is very large, you could use a simple linear distance calculation rather than the Haversince formula to limit the results to which you apply the Haversince formula.
  3. If you have other data in the table that would serve as a good first filter (country/region/etc.), you could apply it first.
  4. You could reorder your joins so that they are applied after the distance filter so that you don't incur the cost of the join on data that doesn't qualify.


回答2:

Well, for starters, you could store pre-calculated stuff in the database when you store the latitude and longitude. If you pre-store the latitude and longitude as radians, for example, you'll only need to calculate RADIANS(latitude) and RADIANS(longitude) once, when you store each location, not every time you need to do your distance calculation (presumably far more than once.)

Maybe shave off a bit more by storing the SIN(RADIANS(latitude)) and COS(RADIANS(latitude)) when you first populate the row, too...

I'm guessing you're doing many, many "nearest thing to X" calculations over time -- that's what people are generally doing when faced with this calculation -- and pre-calculating what you can is normally the first thing to try.