Interpolating one time series onto another in pandas

Posted 2020-06-17 16:25

Question:

I have one set of values measured at regular times. Say:

import pandas as pd
import numpy as np
rng = pd.date_range('2013-01-01', periods=12, freq='h')
data = pd.Series(np.random.randn(len(rng)), index=rng)

And another set of more arbitrary times. For example (in reality these times are not a regular sequence):

ts_rng = pd.date_range('2013-01-01 01:11:21', periods=7, freq='87min')
ts = pd.Series(index=ts_rng, dtype=float)

I want to know the value of data interpolated at the times in ts.
I can do this in numpy:

x = np.asarray(ts_rng, dtype=np.float64)
xp = np.asarray(data.index, dtype=np.float64)
fp = np.asarray(data)
ts[:] = np.interp(x, xp, fp)

I feel pandas has this functionality somewhere in resample, reindex, etc., but I can't quite work out how to use it.

Answer 1:

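You can concatenate the two time series and sort by index. Since the values in the second series are NaN, you can interpolate and then select just the values at the points from the second series. Pass method='time' so the interpolation is weighted by the actual timestamps; the default method='linear' ignores the index and treats the rows as equally spaced:

 pd.concat([data, ts]).sort_index().interpolate(method='time').reindex(ts.index)

or

 pd.concat([data, ts]).sort_index().interpolate(method='time')[ts.index]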
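
To check the concatenate-and-interpolate approach against the NumPy version from the question, here is a minimal sketch. It assumes deterministic values (0, 1, 2, ... per hour) instead of random data so the numbers are easy to verify by hand; the names combined, by_time, by_position and expected are just illustrative.

```python
import numpy as np
import pandas as pd

# Hourly samples with known values 0, 1, 2, ... (assumed data for illustration).
rng = pd.date_range('2013-01-01', periods=12, freq='h')
data = pd.Series(np.arange(12, dtype=float), index=rng)

# Irregular target times; dtype=float keeps the NaN placeholder numeric.
ts_rng = pd.date_range('2013-01-01 01:11:21', periods=7, freq='87min')
ts = pd.Series(index=ts_rng, dtype=float)

combined = pd.concat([data, ts]).sort_index()

# method='time' weights each gap by the actual time difference.
by_time = combined.interpolate(method='time').reindex(ts.index)

# The default method='linear' ignores the index and treats rows as equally spaced.
by_position = combined.interpolate().reindex(ts.index)

# Reference result from NumPy, using the integer timestamps as the x axis.
expected = np.interp(ts_rng.asi8, rng.asi8, data.to_numpy())

print(np.allclose(by_time.to_numpy(), expected))
print(np.allclose(by_position.to_numpy(), expected))
```

With this data the first target time, 01:11:21, sits between the 01:00 and 02:00 samples. The time-weighted result is about 1.189, while the default method returns the midpoint 1.5 because it only sees one NaN between two known rows.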


Answer 2:

Assume you would like to evaluate a time series ts on a different datetime_index. This index and the index of ts may overlap. I recommend using the following groupby trick, which gets rid of duplicate timestamps. I then forward-fill, but feel free to apply fancier methods:

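def interpolate(ts, datetime_index):
    # NaN placeholder on the target index; dtype=float keeps it numeric.
    x = pd.concat([ts, pd.Series(index=datetime_index, dtype=float)])
    # first() keeps the first non-null value per timestamp, collapsing duplicates.
    return x.groupby(x.index).first().sort_index().ffill()[datetime_index]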
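
A small usage sketch of the groupby trick, with one target timestamp that also exists in the source series. The function name ffill_onto and the sample values are hypothetical, chosen only for illustration. As a cross-check it also calls reindex with method='ffill', which should give the same forward-filled result when the source index is sorted and unique.

```python
import pandas as pd

def ffill_onto(series, new_index):
    """Evaluate `series` at `new_index` by carrying the last known value forward."""
    placeholder = pd.Series(index=new_index, dtype=float)
    combined = pd.concat([series, placeholder])
    # first() returns the first non-null value per timestamp, so a stamp
    # present in both inputs collapses to the measured value.
    deduped = combined.groupby(combined.index).first()
    return deduped.ffill().reindex(new_index)

series = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.to_datetime(['2013-01-01 00:00', '2013-01-01 01:00', '2013-01-01 02:00']),
)
# 01:00 appears in both indexes; the other two fall between samples.
new_index = pd.to_datetime(['2013-01-01 00:30', '2013-01-01 01:00', '2013-01-01 01:45'])

result = ffill_onto(series, new_index)
direct = series.reindex(new_index, method='ffill')
print(result.tolist())
print(direct.tolist())
```

Both calls give 10.0, 20.0, 20.0 here: 00:30 takes the 00:00 value, and both 01:00 and 01:45 take the 01:00 value.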


Answer 3:

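Here's a clean one-liner that passes the indexes' integer timestamp values straight to np.interp:

ts = pd.Series(np.interp(ts_rng.asi8, data.index.asi8, data.to_numpy()), index=ts_rng)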
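
One thing to keep in mind with np.interp: target times outside the range of the source index are not extrapolated but clamped to the first or last value. The sketch below wraps the call in a helper (interp_onto is a hypothetical name) and uses made-up values to show this. It assumes the source index is sorted and that both indexes store timestamps at the same resolution, since asi8 returns raw integers.

```python
import numpy as np
import pandas as pd

def interp_onto(data, new_index):
    """Linearly interpolate `data` (sorted DatetimeIndex) at `new_index`."""
    values = np.interp(new_index.asi8, data.index.asi8, data.to_numpy())
    return pd.Series(values, index=new_index)

# Three hourly samples: 00:00 -> 0, 01:00 -> 10, 02:00 -> 20 (assumed data).
data = pd.Series([0.0, 10.0, 20.0],
                 index=pd.date_range('2013-01-01', periods=3, freq='h'))

# Targets: 23:15 the day before, 00:30, 01:45 and 03:00.
# The first and last fall outside the sampled range.
targets = pd.date_range('2012-12-31 23:15', periods=4, freq='75min')

result = interp_onto(data, targets)
print(result)
```

The two inner targets come out at about 5.0 and 17.5, while the outer two are held at the edge values 0.0 and 20.0. If NaN is preferred outside the sampled range, np.interp accepts left and right arguments, e.g. left=np.nan, right=np.nan.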