Pandas - New Row for Each Day in Date Range

I have a Pandas df with one column (Reservation_Dt_Start) representing the start of a date range and another (Reservation_Dt_End) representing the end of a date range.

Rather than each row having a date range, I'd like to expand each row to have as many records as there are dates in the date range, with each new row representing one of those dates.

See the two pics below for an example input and the desired output.

The code snippet below works!! However, for every 250 rows in the input table, it takes 1 second to run. Given my input table is 120,000,000 rows in size, this code will take about one week to run.

pd.concat([pd.DataFrame({'Book_Dt': row.Book_Dt,
                         'Day_Of_Reservation': pd.date_range(row.Reservation_Dt_Start, row.Reservation_Dt_End),
                         'Pickup': row.Pickup,
                         'Dropoff' : row.Dropoff,
                         'Price': row.Price}, 

                          columns=['Book_Dt','Day_Of_Reservation', 'Pickup', 'Dropoff' , 'Price']) 
                          for i, row in df.iterrows()], ignore_index=True)

There has to be a faster way to do this. Any ideas? Thanks!

标签： python pandas

1条回答

The star\"

2楼-- · 2019-09-20 06:46

pd.concat in a loop with a large dataset gets pretty slow as it will make a copy of the frame each time and return a new dataframe. You are attempting to do this 120m times. I would try to work with this data as a simple list of tuples instead then convert to dataframe at the end.

e.g.

Given a list list = []

For each row in the dataframe:

get list of date range (can use pd.date_range here still) store in variable dates which is a list of dates
for each date in date range, add a tuple to the list list.append((row.Book_Dt, dates[i], row.Pickup, row.Dropoff, row.Price))

Finally you can convert the list of tuples to a dataframe:

df = pd.DataFrame(list, columns = ['Book_Dt', 'Day_Of_Reservation', 'Pickup', 'Dropoff', 'Price'])

0人赞添加讨论(0) 举报

Pandas - New Row for Each Day in Date Range

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间