I have a list of coordinates and can calculate a distance matrix among all points using the haversine distance metric.
Coordinates come a as numpy.array
of shape (n, 2)
of (latitude, longitude)
pairs:
[[ 16.34576887 -107.90942116]
[ 12.49474931 -107.76030036]
[ 27.79461514 -107.98607881]
...
[ 12.90258404 -107.96786569]
[ -6.29109889 -107.88681145]
[ -2.68531605 -107.72796034]]
I can also extract the distance along the path implied by the sequence of coordinates like so:
coordinates = np.deg2rad(coordinates)
lat, lng = coordinates[:, 0], coordinates[:, 1]
diff_lat = lat[:, None] - lat
diff_lng = lng[:, None] - lng
d = np.sin(diff_lat / 2) ** 2 + np.cos(lat[:, None]) * np.cos(lat) * np.sin(diff_lng / 2) ** 2
dist_matrix = 2 * 6371 * np.arcsin(np.sqrt(d))
np.diagonal(dist_matrix, offset=1)
[ 428.51472359 1701.42935402 1849.52714339 12707.47743385
13723.9087041 4521.8250695 2134.258953 401.33113696
4571.69119707 73.82631307 6078.48898641 9870.17140175
...
2109.57319898 12959.56540448 16680.64546196 3050.96912506
3419.95053226 4209.71641445 9467.85523888 2805.65191129
4120.18701177]
I would like to only calculate the distance vector as opposed to the entire matrix and then selecting the relevant diagonal.
Here's one way you can vectorize that calculation without creating a big matrix.
coslat
is the array of cosines of the latitudes, andcoslat[:-1]*coslat[1:]
is the vectorized version of the expression cos(ϕ1)cos(ϕ2) in the Haversine formula.Output: