When writing functions for use with groupby.apply or groupby.transform in pandas, if the function takes multiple arguments, the extra arguments are passed after the function name, separated by commas, rather than in parentheses. For example:
def Transfunc(df, arg1, arg2, arg3):
    return something
GroupedData.transform(Transfunc, arg1, arg2, arg3)
Here the df argument is passed automatically as the first argument.
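To make the calling convention concrete, here is a minimal runnable sketch. The function name `scale_and_shift`, its parameters, and the toy data are all made up for illustration and are not from my real code. Note that when transform is called on a single selected column, the first argument pandas supplies is that column's values for one group (a Series).

```python
import pandas as pd

def scale_and_shift(values, factor, offset):
    # `values` is supplied by pandas: one group's slice of the column (a Series).
    # `factor` and `offset` are the extra arguments passed after the function name.
    return (values - values.mean()) * factor + offset

df = pd.DataFrame({
    "key": ["x", "x", "y", "y"],
    "val": [1.0, 3.0, 10.0, 20.0],
})

# Extra arguments follow the function, separated by commas.
result = df.groupby("key")["val"].transform(scale_and_shift, 2, 100)
print(result.tolist())  # [98.0, 102.0, 90.0, 110.0]
```

Each value is centred on its group mean (2 for "x", 15 for "y"), doubled, and shifted by 100.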
However, the same syntax does not seem to be possible when using a function to group the data. Take the following example:
import numpy as np
import pandas as pd
people = pd.DataFrame(np.random.randn(5, 5), columns=['a', 'b', 'c', 'd', 'e'], index=['Joe', 'Steve', 'Wes', 'Jim', 'Travis'])
people.iloc[2:3, [1, 2]] = np.nan  # set columns 'b' and 'c' of row 'Wes' to NaN
def MeanPosition(Ind, df, Column):
    if df[Column][Ind] >= np.mean(df[Column]):
        return 'Greater Group'
    else:
        return 'Lesser Group'
# This function compares one data point in the given column to the mean of that column and returns a group name based on whether it is greater than or equal to the mean, or less than it
people.groupby(lambda x: MeanPosition(x, people, 'a')).mean()
The above works just fine, but I can't understand why I have to wrap the function in a lambda. Based upon the syntax used with transform and apply it seems to me that the following should work just fine:
people.groupby(MeanPosition, people, 'a').mean()
Can anyone tell me why, or how I can call the function without wrapping it in a lambda?
Thanks
EDIT: I do not think it is possible to group the data by passing a function as the key without wrapping that function in a lambda. One possible workaround is, rather than passing a function as the key, to pass an array that has been created by a function. This would work as follows:
def MeanPositionList(df, Column):
    return ['Greater Group' if df[Column][row] >= np.mean(df[Column]) else 'Lesser Group' for row in df.index]
Grouped = people.groupby(np.array(MeanPositionList(people, 'a')))
Grouped.mean()
But then of course it might be better to cut out the middleman function altogether and simply use an array built with a list comprehension.
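For what it's worth, a rough sketch of that array-only route, with no helper function at all. It uses np.where instead of a list comprehension (my substitution, not something from the code above), and the seeded random generator is only there to make the example reproducible.

```python
import numpy as np
import pandas as pd

# Same shape of data as above; the seed is an assumption for reproducibility.
people = pd.DataFrame(
    np.random.default_rng(0).standard_normal((5, 5)),
    columns=["a", "b", "c", "d", "e"],
    index=["Joe", "Steve", "Wes", "Jim", "Travis"],
)

# One label per row, computed in a single vectorized comparison.
labels = np.where(
    people["a"] >= people["a"].mean(), "Greater Group", "Lesser Group"
)

# An array with one entry per row can be passed directly as the grouping key.
grouped_means = people.groupby(labels).mean()
print(grouped_means)
```

This also computes the column mean only once, rather than once per row as the function-based versions do.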