Create multiple columns from multiple return values

Posted 2019-07-12 18:30

Question:

As the title says, I want to create multiple columns in a pandas DataFrame from the multiple return values of a lambda function.

The commented-out last line of my demo code shows what I am trying to do.

Is there any way to achieve this?

import numpy as np
import pandas as pd

y = np.random.rand(2, 5)
df = pd.DataFrame(y, columns = ["y1", "y2", "y3", "y4", "y5"])
print(df)
def f_polyfit(y1, y2, y3, y4, y5, degree):
    y = [y1, y2, y3, y4, y5]
    x = [1, 2, 3, 4, 5]
    coeffs = np.polyfit(x, y, degree)
    coeffs = coeffs.tolist()
    # construct the polynomial from the fitted coefficients
    p = np.poly1d(coeffs)
    # fitted values and the mean of the observed values
    y_fit = p(x)
    y_avg = np.sum(y)/len(y)
    # regression sum of squares and total sum of squares
    ssreg = np.sum((y_fit-y_avg)**2)
    sstot = np.sum((y - y_avg)**2)
    R2 = ssreg / sstot
    return coeffs[0], R2
# df["slope"], df["R2"] = zip(df.apply(lambda x:f_polyfit(x["y1"], x["y2"], x["y3"], x["y4"], x["y5"], degree = 1),  axis = 1))

Answer 1:

One way is to wrap the returned tuple in a pd.Series, so that apply returns a DataFrame whose columns can be assigned to the new DataFrame columns.

g = lambda x: pd.Series(f_polyfit(x.y1, x.y2, x.y3, x.y4, x.y5, degree=1))
df[['slope', 'R2']] = df.apply(g, axis=1)
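
As a possible alternative to wrapping the result in pd.Series, apply also accepts result_type="expand", which spreads a returned tuple across columns. The sketch below is a minimal, self-contained version under a few assumptions: slope_and_r2 is a hypothetical helper (not the asker's f_polyfit) that fits a degree-1 polynomial only, and the input values are made up for illustration.

```python
import numpy as np
import pandas as pd


def slope_and_r2(row):
    """Hypothetical helper: fit a straight line to one row, return (slope, R2)."""
    y = row.to_numpy(dtype=float)
    x = np.arange(1, len(y) + 1)               # x = 1, 2, ..., n as in the question
    slope, intercept = np.polyfit(x, y, 1)     # degree-1 least-squares fit
    y_fit = slope * x + intercept
    ss_reg = np.sum((y_fit - y.mean()) ** 2)   # regression sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)       # total sum of squares
    return slope, ss_reg / ss_tot


# Made-up data with the same column names as the question.
y_cols = ["y1", "y2", "y3", "y4", "y5"]
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 1.0, 4.0, 3.0, 6.0]],
    columns=y_cols,
)

# result_type="expand" turns each returned tuple into one row of a two-column frame.
df[["slope", "R2"]] = df[y_cols].apply(slope_and_r2, axis=1, result_type="expand")
print(df)
```

Selecting df[y_cols] before calling apply keeps the helper from seeing the slope and R2 columns if the statement is run a second time.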


Answer 2:

The solution is to understand the error message "too many values to unpack." When Python encounters an unpacking expression like:

a,b = x

it iterates over x and assigns the first value to a and the second to b. If x has exactly two elements, the statement executes without error; if x has three elements, Python raises "too many values to unpack."
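
The behaviour described above can be reproduced in a few lines of plain Python. This is only an illustrative sketch; the variable names are made up, and the exact wording of the error messages may differ slightly between Python versions.

```python
pair = (1, 2)
a, b = pair              # two names, two values: unpacking succeeds

triple = (1, 2, 3)
try:
    a, b = triple        # two names, three values
except ValueError as exc:
    too_many = str(exc)  # message starts with "too many values to unpack"

single = (1,)
try:
    a, b = single        # two names, one value
except ValueError as exc:
    too_few = str(exc)   # message starts with "not enough values to unpack"

print(too_many)
print(too_few)
```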

Your last line of code will only work if the iterator produced by the zip() function has EXACTLY two elements. That, apparently, is not the case. It's important to realize that the problem has nothing to do with lambda or dataframes, but with the basics of Python unpacking.
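
To make the point about zip() concrete, the sketch below uses a plain list of tuples as a stand-in for the per-row results that apply produces (the numbers are invented). Passing the list itself to zip() yields one item per row, while unpacking it with * transposes the rows into one tuple per returned value.

```python
# Stand-in for the per-row (slope, R2) tuples; the values are made up.
rows = [(0.10, 0.90), (0.20, 0.80), (0.30, 0.70)]

# zip(rows) wraps each row in a 1-tuple, so it yields one item per row (three here).
wrapped = list(zip(rows))

# zip(*rows) transposes the rows: one tuple per field, so exactly two items.
slopes, r2s = zip(*rows)

print(len(wrapped))
print(slopes)
print(r2s)
```

Applied to the question's last line, this suggests writing zip(*df.apply(...)) rather than zip(df.apply(...)). Note that with exactly two rows the original unpacking would not raise, but each name would receive a whole row wrapped in a tuple rather than a column of values.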