I'm using following sql code to find out 'ALL' poi closest to the set coordinates, but I would want to find out specific poi instead of all of them. When I try to use the where clause I get an error and it doesn't work and this is where I'm currently stuck, since I only use one table for all the coordinates off all poi's.
SET @orig_lat=55.4058;
SET @orig_lon=13.7907;
SET @dist=10;
SELECT
*,
3956 * 2 * ASIN(SQRT(POWER(SIN((@orig_lat -abs(latitude)) * pi()/180 / 2), 2)
+ COS(@orig_lat * pi()/180 ) * COS(abs(latitude) * pi()/180)
* POWER(SIN((@orig_lon - longitude) * pi()/180 / 2), 2) )) as distance
FROM geo_kulplex.sweden_bobo
HAVING distance < @dist
ORDER BY distance limit 10;
The problem is that you can not reference an aliased column (
distance
in this case) in aselect
orwhere
clause. For example, you can't do this:This will fail in both: the
select
statement when trying to processNewCol + 1
and also in thewhere
statement when trying to processNewCol = 2
.There are two ways to solve this:
1) Replace the reference by the calculated value itself. Example:
2) Use an outer
select
statement:Now, given your HUGE and not very human-friendly calculated column :) I think you should go for the last option to improve readibility:
Edit: As @Kaii mentioned below this will result in a full table scan. Depending on the amount of data you will be processing you might want to avoid that and go for the first option, which should perform faster.
The reason why you cant use your alias in the
WHERE
clause is the order in which MySQL executes things:FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
When executing your
WHERE
clause, the value for your column alias is not yet calculated. This is a good thing, because it would waste a lot of performance. Imagine many (1,000,000) rows -- to use your calculation in theWHERE
clause, each of those 1,000,000 would first have to be fetched and calculated so theWHERE
condition can compare the calculation results to your expectation.You can do this explicitly by either
HAVING
(thats the reason whyHAVING
has another name asWHERE
- its a different thing)WHERE
clause (will effectively give the same performance result asHAVING
)All those will perform almost equally bad: each row is fetched first, the distance calculated and finally filtered by distance before sending the result to the client.
You can gain much (!) better performance by mixing a simple
WHERE
clause for distance approximation (filtering rows to fetch first) with the more precise euclidian formula in aHAVING
clause.@distance = 10
condition using aWHERE
clause based on simple X and Y distance (bounding box) -- this is a cheap operation.HAVING
clause -- this is an expensive operation.Look at this query to understand what i mean:
For those who are interested in the constant:
Find more information in the wiki about the Haversine formula