I have a dataframe with two columns holding Longitude and Latitude coordinates:
import pandas as pd
values = {'Latitude': {0: 47.021503365600005,
1: 47.021503365600005,
2: 47.021503365600005,
3: 47.021503365600005,
4: 47.021503365600005,
5: 47.021503365600005},
'Longitude': {0: 15.481974060399999,
1: 15.481974060399999,
2: 15.481974060399999,
3: 15.481974060399999,
4: 15.481974060399999,
5: 15.481974060399999}}
df = pd.DataFrame(values)
df.head()
Now I want to apply a rolling window function on the dataframe that takes the Longitude AND Latitude (two columns) of one row and another row (window size 2) in order to calculate the haversine distance.
def haversine_distance(x):
    print(x)
df.rolling(2, axis=1).apply(haversine_distance)
My problem is that I never get all four values Lng1, Lat1 (first row) and Lng2, Lat2 (second row). If I use axis=1, then I will get Lng1 and Lat1 of the first row. If I use axis=0, then I will get Lng1 and Lng2 of the first and second row, but Longitude only.
How can I apply a rolling window using two rows and two columns? Somewhat like this:
def haversine_distance(x):
    row1 = x[0]
    row2 = x[1]
    lng1, lat1 = row1['Longitude'], row1['Latitude']
    lng2, lat2 = row2['Longitude'], row2['Latitude']
    # do your stuff here
    return 1
Currently I do this calculation by joining the dataframe with itself via shift(-1), which puts all four coordinates in one row. But it should be possible with rolling as well. Another option is combining Lng and Lat into one column and applying rolling with axis=0 to that. But there must be an easier way, right?
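For reference, the shift(-1) workaround described above might look roughly like the sketch below. The sample coordinates, the helper name `haversine_km`, the `_next` suffix and the Earth radius of 6371 km are all assumptions made for illustration, not part of the original data.

```python
import numpy as np
import pandas as pd


def haversine_km(lat1, lng1, lat2, lng2, radius_km=6371.0):
    """Great-circle distance in km; works element-wise on Series.

    radius_km is an assumed mean Earth radius.
    """
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(a))


# Hypothetical sample points (not the data from the question).
points = pd.DataFrame({
    "Latitude": [47.0215, 47.0215, 47.0300],
    "Longitude": [15.4820, 15.4820, 15.4900],
})

# Put each row next to its successor so all four coordinates share one row.
pairs = points.join(points.shift(-1), rsuffix="_next")
pairs["distance_km"] = haversine_km(
    pairs["Latitude"], pairs["Longitude"],
    pairs["Latitude_next"], pairs["Longitude_next"],
)
print(pairs)
```

The last row has no successor, so its distance comes out as NaN.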
Since pandas v0.23 it is possible to pass a Series instead of an ndarray to Rolling.apply(). Just set raw=False. So, building on your given example, you could move the latitude to the index and pass the whole longitude series, including the index, to your function:
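A minimal sketch of that idea, assuming pandas 0.23 or later. The sample coordinates, the constant `EARTH_RADIUS_KM` and the output column name `distance_km` are assumptions chosen for illustration; the haversine formula itself is the standard one.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_distance(window):
    """Distance in km between the two points of a 2-row window.

    `window` is a Series of longitudes whose index holds the latitudes.
    """
    lng1, lng2 = np.radians(window.to_numpy())
    lat1, lat2 = np.radians(window.index.to_numpy())
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Hypothetical sample points (not the data from the question).
df = pd.DataFrame({
    "Latitude": [47.0215, 47.0215, 47.0300],
    "Longitude": [15.4820, 15.4820, 15.4900],
})

# Move Latitude into the index, then roll over the Longitude series.
# raw=False makes pandas pass each window as a Series, index included.
distances = (
    df.set_index("Latitude")["Longitude"]
      .rolling(2)
      .apply(haversine_distance, raw=False)
)

# The result is indexed by Latitude, so assign by position.
df["distance_km"] = distances.to_numpy()
print(df)
```

The first row has no predecessor, so its value is NaN; every other row holds the distance to the row before it. Note that the function is called once per window in Python, so on large frames this is likely slower than a vectorized shift-based calculation.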