Pandas resampling is really convenient if your index is a datetime index, but I haven't found an easy way to resample by an arbitrary factor. E.g., just treat each index value as an arbitrary position, and resample the dataframe so that its resulting length is 4x shorter (while being more intelligent about it than just taking every 4th datapoint).
This would be useful for anyone working with data on a much shorter timescale than datetimes typically cover. For example, in my case I want to resample an audio vector from 44 kHz to 11 kHz. Right now I have to use scipy's "decimate" function and then convert the result back to a dataframe (using dataframe.apply wasn't working because decimation changes the length of the dataframe).
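For reference, the workaround looks roughly like this. It's only a sketch with made-up data: the column names and the factor of 4 are placeholders, and it assumes `scipy.signal.decimate` with its default filter settings.

```python
import numpy as np
import pandas as pd
from scipy.signal import decimate

# Placeholder data standing in for a two-channel audio recording.
n, factor = 4000, 4
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((n, 2)), columns=["left", "right"])

# decimate low-pass filters each column, then keeps every `factor`-th sample,
# so the result is a plain NumPy array that is `factor` times shorter.
filtered = decimate(df.to_numpy(), factor, axis=0)

# Rebuild the dataframe by hand, reusing every `factor`-th original index label.
decimated = pd.DataFrame(filtered, columns=df.columns, index=df.index[::factor])
print(decimated.shape)
```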
Anyone have any suggestions for how to accomplish this?
You can use `DatetimeIndex` to resample high-frequency data (up to nanosecond precision; caveat: I believe this is only available in the upcoming 0.13 release). I've successfully used pandas to resample electrophysiological data in the 24 kHz range. Here's an example:
You can pass in a callable to `how`, which would allow you to "do something more intelligent". `pandas` defaults to taking the average over the period given (in this case, that's the average over each chunk of 22727 samples).
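
A minimal sketch of this approach, assuming a recent pandas in which `resample` returns a resampler object, so the aggregation is picked with `.mean()` or `.apply(...)` rather than the older `how` argument. The 22727 ns sample period (roughly 44 kHz), the column name `signal`, the random data, and the peak-amplitude callable are illustrative choices, not taken from the answer's own example.

```python
import numpy as np
import pandas as pd

# Hypothetical signal: one sample every 22727 ns (about 44 kHz).
n = 4000
period = pd.Timedelta(22727, unit="ns")
index = pd.date_range("2000-01-01", periods=n, freq=period)
rng = np.random.default_rng(0)
df = pd.DataFrame({"signal": rng.standard_normal(n)}, index=index)

# Downsample by a factor of 4: each bin spans four original sample periods.
factor = 4
rule = factor * period

# Averaging over each bin.
averaged = df.resample(rule).mean()

# A custom callable instead of the mean, here the peak absolute amplitude.
peak = df["signal"].resample(rule).apply(lambda chunk: chunk.abs().max())

print(len(df), len(averaged), len(peak))
```

Because the index starts at midnight, the bins line up with the samples and each one holds exactly `factor` points. Averaging is a fairly crude low-pass filter, so the callable is the place to plug in something more careful.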