I borrowed the following method from somewhere on the internet (Can't remember where). But its doing a straight forward process, finding the distance between two gps points. It works just fine, except that it may be a little slow, as I'm running it across millions of points.
I was wondering if anyone knows an approach that would be computationally less expensive.
The accuracy needs to be in the general area of 'correct' but doesn't need to be 100% accurate.
private double distFrom(double lat1, double lng1, double lat2, double lng2) {
double earthRadius = 3958.75;
double dLat = Math.toRadians(lat2-lat1);
double dLng = Math.toRadians(lng2-lng1);
double a = Math.sin(dLat/2) * Math.sin(dLat/2) +
Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) *
Math.sin(dLng/2) * Math.sin(dLng/2);
double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
return earthRadius * c;
}
}
P.s I did indeed find a number of other relevant questions, but they don't really focus on my speed concern.
If you don't mind ignoring the slight oblateness of the Earth (and your posted Haversine code does just that anyway) consider pre-converting all of your spherical (lat/long) coordinates into 3D unit-length cartesian coordinates first, per:
http://en.wikipedia.org/wiki/Spherical_coordinate_system
Then your spherical distance between cartesian coordinates p1
and p2
is simply:
r * acos(p1 . p2)
Since p1
and p2
will have unit length this reduces to four multiplications, two additions and one inverse trig operation per pair.
Also note that the calculation of dot products is an ideal candidate for optimisation, e.g. via GPU, MMX extensions, vector libraries, etc.
Furthermore, if your intent is to order the pairs by distance, potentially ignoring more distant pairs, you can defer the expensive r*acos()
part of the equation by sorting the list just on the dot product value since for all valid inputs (i.e. the range [-1, 1]
) it's guaranteed that:
acos(x) < acos(y) if x > y
You then just take the acos()
of the values you're actually interested in.
Re: the potential inaccuracies with using acos()
, those are really only significant if you're using single-precision float
variables. Using a double
with 16 significant digits should get you distances accurate to within one metre or less.
That's the haversine algorithm, will provide you with a decent level of accuracy.
If it really is "millions" of points, perhaps implement a cache of calculations that you've made... if you come across a pair of coordinates, both of which are sufficiently close to a pair whose distance you've already calculated, then use the cached value?
Or try to cache some of the intermediate steps, e.g. degree to radians conversions.
If you sacrifice accuracy there are some improvements you can make. As far as I remember, sin(x)
is approximately equal to x
for small x
. Also it looks like you are computing the same things several times, like: Math.sin(dLat/2)
(which can actually be approximated to dLat/2
as stated above).
However if you are doing millions of these operations, I would somewhere else.
Is your algorithm optimal? Maybe you are doing too many simple computations?
If points come from database, can you perform the computations as stored procedures on the database server side?
If you are looking for closest points, can you index them somehow?
Can geospatial indexes help you?
You might try the law of cosines for spherical trigonometry:
a = sin(lat1) * sin(lat2)
b = cos(lat1) * cos(lat2) * cos(lon2 - lon1)
c = arccos(a + b)
d = R * c
But it will be inaccurate for short distances (and probably just marginally faster).
There is a complete discussion here.
However, the haversine formula is the most correct way, so aside from what others have suggested there may not be much you can do.
@Alnitak's answer may work, but spherical to Cartesian conversions are not necessarily fast.