I have a DataFrame with a start_date column of date type. For each distinct start_date, I need to count the unique values in column1 whose start_date is on or before that date. The following is the input DataFrame:
column1 column2 start_date
id1 val1 2018-03-12
id1 val2 2018-03-12
id2 val3 2018-03-12
id3 val4 2018-03-12
id4 val5 2018-03-11
id4 val6 2018-03-11
id5 val7 2018-03-11
id5 val8 2018-03-11
id6 val9 2018-03-10
I need to convert it into the following:
start_date count
2018-03-12 6
2018-03-11 3
2018-03-10 1
This is what I am doing now, which is not efficient:
- finding all distinct start_dates and storing them in a list
- looping over the list and generating the output for each start_date
- combining all the outputs into one DataFrame.
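To make the logic concrete, here is a minimal, library-agnostic sketch of those three steps in plain Python. It is only an illustration: my real code runs against the DataFrame, and the names `rows`, `distinct_dates`, and `outputs` are hypothetical stand-ins rather than anything from my actual job.

```python
from datetime import date

# Sample rows copied from the input above: (column1, column2, start_date).
rows = [
    ("id1", "val1", date(2018, 3, 12)),
    ("id1", "val2", date(2018, 3, 12)),
    ("id2", "val3", date(2018, 3, 12)),
    ("id3", "val4", date(2018, 3, 12)),
    ("id4", "val5", date(2018, 3, 11)),
    ("id4", "val6", date(2018, 3, 11)),
    ("id5", "val7", date(2018, 3, 11)),
    ("id5", "val8", date(2018, 3, 11)),
    ("id6", "val9", date(2018, 3, 10)),
]

# Step 1: collect the distinct start_dates, newest first.
distinct_dates = sorted({d for _, _, d in rows}, reverse=True)

# Step 2: for each date, count the unique column1 values
# whose start_date is on or before that date.
outputs = []
for cutoff in distinct_dates:
    unique_ids = {c1 for c1, _, d in rows if d <= cutoff}
    outputs.append((cutoff, len(unique_ids)))

# Step 3: combine the per-date outputs into one result.
for cutoff, count in outputs:
    print(cutoff, count)
```

Each pass of the loop rescans every row, so the work grows with the number of distinct dates times the number of rows, which is the inefficiency I would like to avoid.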
Is there a better way of doing this without looping?