pandas python flag transactions across rows

I have a data as below. I would like to flag transactions -

when a same employee has one of the ('Car Rental', 'Car Rental - Gas' in the column expense type) and 'Car Mileage' on the same day - so in this case employee a and c's transactions would be highlighted. Employee b's transactions won't be highlighted as they don't meet the condition - he doesn't have a 'Car Mileage'

i want the column zflag. Different numbers in that column indicate group of instances when the above condition was met

d = {'emp': ['a',   'a',    'a',    'a',    'b',    'b',    'b',    'c',    'c',    'c',    'c' ], 
 'date': ['1',  '1',    '1',    '1',    '2',    '2',    '2',    '3',    '3',    '3',    '3' ], 
 'usd':[1,  2,  3,  4,  5,  6,  7,  8,  9,  10,     11 ], 
 'expense type':['Car Mileage',     'Car Rental',   'Car Rental - Gas',     'food',     'Car Rental',   'Car Rental - Gas',     'food',     'Car Mileage',  'Car Rental',   'food',     'wine' ],
 'zflag':['1',  '1', '1',   ' ',' ',' ',' ','2','2',' ',' ' ]
 }

df = pd.DataFrame(data=d)



    df
Out[253]: 
   date emp      expense type  usd zflag
0     1   a       Car Mileage    1     1
1     1   a        Car Rental    2     1
2     1   a  Car Rental - Gas    3     1
3     1   a              food    4      
4     2   b        Car Rental    5      
5     2   b  Car Rental - Gas    6      
6     2   b              food    7      
7     3   c       Car Mileage    8     2
8     3   c        Car Rental    9     2
9     3   c              food   10      
10    3   c              wine   11

I would appreciate if i could get pointers regarding functions to use. I am thinking of using groupby...but not sure

I understand that date+emp will be my primary key

标签： python pandas row

1条回答

老娘就宠你

2楼-- · 2019-08-06 10:04

Here is an approach. It's not the cleanest but what you're describing is quite specific. Some of this might be able to be simplified with a function.

temp_df = df.groupby(["emp", "date"], axis=0)["expense type"].apply(lambda x: 1 if "Car Mileage" in x.values and any([k in x.values for k in ["Car Rental", "Car Rental - Gas"]]) else 0).rename("zzflag")
temp_df = temp_df.loc[temp_df!=0,:].cumsum()
final_df = pd.merge(df, temp_df.reset_index(), how="left").fillna(0)

Steps:

Groupby emp/date and search for criteria, 1 if met, 0 if not
Remove rows with 0's and cumsum to produce unique values
Join back to the original frame

Edit:

To answer your question below. The join works because after you run .reset_index() that takes "emp" and "date" from the index and moves them to columns.

0人赞添加讨论(0) 举报

pandas python flag transactions across rows

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间