I have a data as below. I would like to flag transactions -
when a same employee has one of the ('Car Rental', 'Car Rental - Gas'
in the column expense type
) and 'Car Mileage'
on the same day - so in this case employee a
and c
's transactions would be highlighted. Employee b
's transactions won't be highlighted as they don't meet the condition - he doesn't have a 'Car Mileage'
i want the column zflag
. Different numbers in that column indicate group of instances when the above condition was met
d = {'emp': ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ],
'date': ['1', '1', '1', '1', '2', '2', '2', '3', '3', '3', '3' ],
'usd':[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ],
'expense type':['Car Mileage', 'Car Rental', 'Car Rental - Gas', 'food', 'Car Rental', 'Car Rental - Gas', 'food', 'Car Mileage', 'Car Rental', 'food', 'wine' ],
'zflag':['1', '1', '1', ' ',' ',' ',' ','2','2',' ',' ' ]
}
df = pd.DataFrame(data=d)
df
Out[253]:
date emp expense type usd zflag
0 1 a Car Mileage 1 1
1 1 a Car Rental 2 1
2 1 a Car Rental - Gas 3 1
3 1 a food 4
4 2 b Car Rental 5
5 2 b Car Rental - Gas 6
6 2 b food 7
7 3 c Car Mileage 8 2
8 3 c Car Rental 9 2
9 3 c food 10
10 3 c wine 11
I would appreciate if i could get pointers regarding functions to use. I am thinking of using groupby...but not sure
I understand that date
+emp
will be my primary key