I have one dataframe as below. At first,they have three columns('date','time','flag'). I want to add one column which based on the flag and date which means when I get flag=1 ,then the rest of this day the target is 1, otherwise the target is zero.
date time flag target
0 2017/4/10 10:00:00 0 0
1 2017/4/10 11:00:00 1 1
2 2017/4/10 12:00:00 0 1
3 2017/4/10 13:00:00 0 1
4 2017/4/10 14:00:00 0 1
5 2017/4/11 10:00:00 1 1
6 2017/4/11 11:00:00 0 1
7 2017/4/11 12:00:00 1 1
8 2017/4/11 13:00:00 1 1
9 2017/4/11 14:00:00 0 1
10 2017/4/12 10:00:00 0 0
11 2017/4/12 11:00:00 0 0
12 2017/4/12 12:00:00 0 0
13 2017/4/12 13:00:00 0 0
14 2017/4/12 14:00:00 0 0
15 2017/4/13 10:00:00 0 0
16 2017/4/13 11:00:00 1 1
17 2017/4/13 12:00:00 0 1
18 2017/4/13 13:00:00 1 1
19 2017/4/13 14:00:00 0 1
Use
DataFrameGroupBy.cumsum
for cumulative sumflag
values, compare with0
and last cast mask tointeger
:Okay, I know that we've already found a solution here but just to satisfy the nerd in me, here's an answer (not elegant given how long it is) to avoid that nagging first-row flaw
Approach remains the same as @jezrael - the groupby function is key here. Instead of using the cumsum, which leads to the first-row flaw,
any()
appears to fit really well into this solution. The only drawback is that it produces a series, which we then need to coerce back into a dataframe and transpose before joining them together by the date key.