How to parse dates for comparison time range compa

2019-08-06 09:26发布

I have an Excel file with two sheets.

('sheet 1') containing the data:

DATE       TMAX TMIN
20110706    317 211
20110707    322 211
20110708    317 211
20110709    322 211
20110710    328 222
20110711    333 244
20110712    356 250
20110713    356 222

and the other ('sheet 2') includes:

Start Date  End Date    Rep Month    Cost    kWh     kW 
7/6/2011    8/3/2011    July     5,065.17    76,640      205 
8/3/2011    9/7/2011    August   5,572.38    86,640      195 

My goal is to write another column on ('sheet 1') for kwh values what fall within the range of a specific dates from ('sheet 2')

For example:

DATE        TMAX    TMIN    kWh
20110706    317   211   76640
20110707    322   211   76640
20110708    317   211   76640
20110709    322   211   76640
20110710    328   222   76640
20110711    333   244   76640
20110712    356   250   76640
20110713    356   222   76640
20110801    344   228   76640
20110802    356   200   76640
20110803    367   200   86640
20110804    361   228   86640

I do not know why my code is resulting in df["kWh"] being empty ('NaN') which results in a blank kWh column written in ('sheet1')

Here is my code below:

import pandas as pd
from pandas import ExcelWriter

df = pd.read_excel("thecddhddtest.xlsx",'Sheet1')
df2 = pd.read_excel("thecddhddtest.xlsx",'Sheet2')
df.head()


df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")

df3 = df2.set_index("Start Date")

df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")

df3["kWh"].reindex(df["DATE"], method="ffill")
df["kWh"] = df3["kWh"].reindex(df["DATE"], method="ffill")
print(df["kWh"])


writer = ExcelWriter('thecddhddtestkWh.xlsx')
df.to_excel(writer,'Sheet1',index=False)
df2.to_excel(writer,'Sheet2',index=False)
writer.save()

which results in:

DATE       TMAX TMIN kWh
20110706    317 211
20110707    322 211
20110708    317 211
20110709    322 211
20110710    328 222
20110711    333 244
20110712    356 250
20110713    356 222

1条回答
混吃等死
2楼-- · 2019-08-06 10:04

Try my solution - only set column DATE in df to index and then reindex by this.

df["DATE"] = pd.to_datetime(df["DATE"], format="%Y%m%d")
#set column DATE to index
df = df.set_index("DATE")
df2["Start Date"] = pd.to_datetime(df2["Start Date"], format="%m/%d/%Y")
df3 = df2.set_index("Start Date")
#reindex by index of df
df["kWh"] = df3["kWh"].reindex(df.index, method="ffill")
print(df["kWh"])
#DATE
#2011-07-06    76,640
#2011-07-07    76,640
#2011-07-08    76,640
#2011-07-09    76,640
#2011-07-10    76,640
#2011-07-11    76,640
#2011-07-12    76,640
#2011-07-13    76,640
#2011-08-01    76,640
#2011-08-02    76,640
#2011-08-03    86,640
#2011-08-04    86,640
#Name: kWh, dtype: object
查看更多
登录 后发表回答