I have a pandas DataFrame df of shape (763, 65). I use the following code to create 4 new columns:
def myFunc(row):
    # code to get some result from another dataframe
    return result1, result2, result3, result4

df[['col1', 'col2', 'col3', 'col4']] = df.apply(myFunc, axis=1)
The shape of the dataframe which is returned in myFunc is (1, 4). The code runs into the following error:
ValueError: Shape of passed values is (763, 4), indices imply (763, 65)
I know that df has 65 columns and that the data returned from myFunc only has 4 columns. However, I only want to create the 4 new columns (that is, col1, col2, etc.), so in my opinion the code is correct when it only returns 4 columns in myFunc. What am I doing wrong?
Demo:
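A minimal sketch of one common way to get four new columns out of a row-wise apply: return a labelled pd.Series from the function, so that apply builds a DataFrame with one column per label, and then join that result onto df. The sample data and the arithmetic inside myFunc are placeholders invented for illustration; they stand in for the lookup against another DataFrame that the question does not show.

```python
import pandas as pd

# Hypothetical stand-in data; the real df has shape (763, 65).
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

NEW_COLS = ["col1", "col2", "col3", "col4"]


def myFunc(row):
    # Placeholder logic (assumption): the real function would fetch
    # its four results from another DataFrame.
    values = (
        row["a"] + row["b"],
        row["a"] - row["b"],
        row["a"] * row["b"],
        row["b"] / row["a"],
    )
    # A Series with named entries makes apply() produce one column per name.
    return pd.Series(values, index=NEW_COLS)


# new_cols is a DataFrame with len(df) rows and the four named columns.
new_cols = df.apply(myFunc, axis=1)

# Attach the new columns to the original frame, aligned on the row index.
df = df.join(new_cols)
print(df)
```

Joining on the index avoids depending on how a particular pandas version handles assignment to several not-yet-existing columns at once.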
Disclaimer: try to avoid using .apply(..., axis=1), as it's a for loop under the hood, i.e. it's not vectorized and will run much slower than vectorized Pandas/NumPy ufuncs.
PS: if you provide details of what you are trying to calculate in the myFunc function, we could try to find a vectorized solution...
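As an illustration of what a vectorized alternative might look like, here is a sketch that assumes myFunc only looks up four values in another DataFrame by a shared key. That assumption, along with the names other and key and all of the sample data, is hypothetical, since the question does not show the actual lookup. Under that assumption, a single merge replaces the row-by-row loop:

```python
import pandas as pd

# Hypothetical main frame; "key" is an assumed lookup column.
df = pd.DataFrame({"key": ["x", "y", "z"], "value": [1, 2, 3]})

# Hypothetical second frame holding the four results per key.
other = pd.DataFrame(
    {
        "key": ["x", "y", "z"],
        "col1": [0.1, 0.2, 0.3],
        "col2": [1.0, 2.0, 3.0],
        "col3": ["p", "q", "r"],
        "col4": [True, False, True],
    }
)

# A left merge keeps every row of df in its original order and adds
# col1..col4; validate raises if a key matches more than one row in
# other, which would otherwise silently duplicate rows of df.
merged = df.merge(other, on="key", how="left", validate="many_to_one")
print(merged)
```

Whether this applies depends entirely on what myFunc really computes; if the per-row logic is more than a keyed lookup, a different vectorized formulation would be needed.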