Resampling dataframe by hours and date

Posted 2019-06-03 17:19

I have a dataframe like this:

                 Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp                                                                     
2017-04-01 01:00:00                 127.0               261.0          0.81   
2017-04-01 02:00:00                 133.0               268.0          0.79   
2017-04-01 03:00:00                 119.0               273.0          0.92   
2017-04-01 04:00:00                 118.0               263.0          0.78   
2017-04-01 05:00:00                 135.0               271.0          0.86   
2017-04-01 06:00:00                 130.0               257.0          0.82   
2017-04-01 23:00:00                 120.0               261.0          0.78   
2017-04-02 00:00:00                 121.0               272.0          0.83   
2017-04-02 01:00:00                 126.0               263.0          0.90   
2017-04-02 02:00:00                 132.0               266.0          0.83   
2017-04-02 03:00:00                 132.0               275.0          0.90   
2017-04-02 04:00:00                 122.0               259.0          0.77   
2017-04-02 05:00:00                 119.0               271.0          0.78   
2017-04-02 06:00:00                 122.0               259.0          0.81   
2017-04-02 23:00:00                 115.0               264.0          0.87   
2017-04-03 00:00:00                 129.0               273.0          0.86 

I want to resample the data into daily groups that run from 01:00 to 00:00 of the next date:

I tried this:

off_sum = offpeak_hist.resample('h', base=8).sum().dropna()

But this does not give the desired output. Please help me with this.

2 answers
混吃等死
#2 · 2019-06-03 17:28

I think you need to shift the index back by one hour first and then resample by day:

print (offpeak_hist.shift(-1, freq='H'))
                     Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp                                                                  
2017-04-01 00:00:00                 127.0               261.0          0.81
2017-04-01 01:00:00                 133.0               268.0          0.79
2017-04-01 02:00:00                 119.0               273.0          0.92
2017-04-01 03:00:00                 118.0               263.0          0.78
2017-04-01 04:00:00                 135.0               271.0          0.86
2017-04-01 05:00:00                 130.0               257.0          0.82
2017-04-01 22:00:00                 120.0               261.0          0.78
2017-04-01 23:00:00                 121.0               272.0          0.83
2017-04-02 00:00:00                 126.0               263.0          0.90
2017-04-02 01:00:00                 132.0               266.0          0.83
2017-04-02 02:00:00                 132.0               275.0          0.90
2017-04-02 03:00:00                 122.0               259.0          0.77
2017-04-02 04:00:00                 119.0               271.0          0.78
2017-04-02 05:00:00                 122.0               259.0          0.81
2017-04-02 22:00:00                 115.0               264.0          0.87
2017-04-02 23:00:00                 129.0               273.0          0.86


df = offpeak_hist.shift(-1, freq='H').resample('D').sum().dropna()
print (df)
            Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp                                                         
2017-04-01                1003.0              2126.0          6.59
2017-04-02                 997.0              2130.0          6.72
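
For anyone who wants to try the shift-then-resample idea without the original data, here is a minimal self-contained sketch. The `sample` frame is a made-up miniature of the question's data (one column, six rows), and `pd.offsets.Hour(1)` is used in place of the `'H'` string only as an assumption to sidestep alias differences between pandas versions. `min_count=1` is also my addition: without it, a calendar day with no rows sums to 0 rather than NaN, so `dropna()` would not remove it.

```python
import pandas as pd

# Hypothetical miniature of the question's frame: two off-peak "days",
# each ending with the 00:00 reading of the following calendar date.
idx = pd.to_datetime([
    "2017-04-01 01:00", "2017-04-01 23:00", "2017-04-02 00:00",
    "2017-04-02 01:00", "2017-04-02 23:00", "2017-04-03 00:00",
])
sample = pd.DataFrame(
    {"Consumption (KVAh)": [261.0, 261.0, 272.0, 263.0, 264.0, 273.0]},
    index=idx,
)
sample.index.name = "Timestamp"

# Move every timestamp back one hour so that the 00:00 reading
# falls on the previous calendar date.
shifted = sample.shift(-1, freq=pd.offsets.Hour(1))

# Sum per calendar day; min_count=1 turns all-empty days into NaN
# so that dropna() can actually discard them.
daily = shifted.resample("D").sum(min_count=1).dropna()
print(daily)
```

The labels in `daily` are the start dates of each 01:00 to 00:00 window, which is what the shifted output above shows for the full data.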
太酷不给撩
#3 · 2019-06-03 17:32

If I understand you correctly, you want to do this:

off_sum = df.groupby(df.index.time).sum()

to achieve this:

          Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
00:00:00                 250.0               545.0          1.69
01:00:00                 253.0               524.0          1.71
02:00:00                 265.0               534.0          1.62
03:00:00                 251.0               548.0          1.82
04:00:00                 240.0               522.0          1.55
05:00:00                 254.0               542.0          1.64
06:00:00                 252.0               516.0          1.63
23:00:00                 235.0               525.0          1.65

If not, please update your question with the desired output.
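
A small runnable sketch of the group-by-time-of-day approach, in case it helps to compare the two readings of the question. The `sample` frame is again a made-up miniature with a single column, not the asker's real data. Note that this sums readings taken at the same clock time across all dates, so the result is indexed by `datetime.time` values rather than by date.

```python
from datetime import time

import pandas as pd

# Hypothetical miniature of the question's frame (assumed values).
idx = pd.to_datetime([
    "2017-04-01 01:00", "2017-04-01 23:00", "2017-04-02 00:00",
    "2017-04-02 01:00", "2017-04-02 23:00", "2017-04-03 00:00",
])
sample = pd.DataFrame(
    {"Consumption (KVAh)": [261.0, 261.0, 272.0, 263.0, 264.0, 273.0]},
    index=idx,
)

# Group rows by their clock time, ignoring the date, and sum each group.
by_time = sample.groupby(sample.index.time).sum()
print(by_time)

# Individual rows are looked up with datetime.time keys.
midnight_total = by_time.loc[time(0, 0), "Consumption (KVAh)"]
print(midnight_total)
```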
