I am using pandas to structure and process data. This is my DataFrame:
I want to resample time-series data so that, for every ID (named here "3"), I get all bitrate scores from beginning to end (beginning_time / end_time). For example, for the first row, I want every second from 2016-07-08 02:17:42 to 2016-07-08 02:17:55, with the same bitrate score and, of course, the same ID. Something like this:
For example, given:
import pandas as pd

df = pd.DataFrame(
    {'Id': ['CODI126640013.ts', 'CODI126622312.ts'],
     'beginning_time': ['2016-07-08 02:17:42', '2016-07-08 02:05:35'],
     'end_time': ['2016-07-08 02:17:55', '2016-07-08 02:26:11'],
     'bitrate': ['3750000', '3750000']})
which gives:
And I want to have for the first row:
Same thing for the second row. So the objective is to resample the time delta between the beginning and end times; the bitrate score must stay the same, of course.
I'm trying this code:
df['new_beginning_time'] = pd.to_datetime(df['beginning_time'])
df.set_index('new_beginning_time').groupby('Id', group_keys=False).apply(lambda df: df.resample('S').ffill()).reset_index()
But in this context, it didn't work! Any ideas? Thank you very much!
You can use melt with resample (pandas version 0.18.1):
This should do the trick:
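A minimal sketch of the melt-with-resample idea, using the sample DataFrame from the question. The names long_df, parts and result are made up for illustration, and the sketch assumes each row's end_time is strictly later than its beginning_time. It uses an explicit loop over the groups rather than groupby().apply(), and the frequency alias "s" (older pandas releases spell it "S").

```python
import pandas as pd

df = pd.DataFrame({
    "Id": ["CODI126640013.ts", "CODI126622312.ts"],
    "beginning_time": ["2016-07-08 02:17:42", "2016-07-08 02:05:35"],
    "end_time": ["2016-07-08 02:17:55", "2016-07-08 02:26:11"],
    "bitrate": ["3750000", "3750000"],
})

# Stack the two time columns into a single "time" column, so that every Id
# has one row for its beginning and one row for its end.
long_df = pd.melt(
    df,
    id_vars=["Id", "bitrate"],
    value_vars=["beginning_time", "end_time"],
    value_name="time",
)
long_df["time"] = pd.to_datetime(long_df["time"])

# For each Id, index by time, upsample to one row per second between the
# two timestamps, and forward-fill Id and bitrate into the new rows.
parts = []
for _, group in long_df.groupby("Id", sort=False):
    per_second = group.set_index("time")[["Id", "bitrate"]].resample("s").ffill()
    parts.append(per_second)

result = pd.concat(parts).reset_index()
print(result.head())
```

The reason this works where the attempt in the question did not is that each group now has two timestamps in its index (beginning and end), so resample has an interval to fill instead of a single point.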
Edit: I am using Python 2.7; Python 3 has a different zip().
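As a hedged illustration of a zip()-based approach, here is a sketch that builds one per-second range per row with pd.date_range and concatenates the pieces. This is not necessarily the code the answer originally used; the names frames and expanded are hypothetical. Because zip() is only iterated inside a list comprehension, it should behave the same whether zip() returns a list (Python 2.7) or an iterator (Python 3).

```python
import pandas as pd

df = pd.DataFrame({
    "Id": ["CODI126640013.ts", "CODI126622312.ts"],
    "beginning_time": ["2016-07-08 02:17:42", "2016-07-08 02:05:35"],
    "end_time": ["2016-07-08 02:17:55", "2016-07-08 02:26:11"],
    "bitrate": ["3750000", "3750000"],
})

# Build one small frame per original row: a per-second DatetimeIndex from
# beginning_time to end_time (both inclusive), with Id and bitrate repeated.
frames = [
    pd.DataFrame({
        "Id": id_,
        "time": pd.date_range(start, end, freq="s"),  # "S" on older pandas
        "bitrate": bitrate,
    })
    for id_, start, end, bitrate in zip(
        df["Id"], df["beginning_time"], df["end_time"], df["bitrate"]
    )
]

expanded = pd.concat(frames, ignore_index=True)
print(expanded.head())
```

This version never calls resample, so it does not depend on how the index is set up; the trade-off is that it creates one intermediate DataFrame per input row.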