I ran into an error while processing a 4x1 numpy.array, i.e.
[[-1.96113883]
[-3.46144244]
[ 5.075857 ]
[ 1.77550086]]
with the lambda function f = lambda x: x if (x > 0) else (x * 0.01).
The error is ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I searched through the different topics here on stackoverflow.com, but I have not found a satisfactory explanation of the problem or a case that matched mine (only many unclear references to the and operator, vectorized code, etc.).
What I expect after processing is an array with the same dimensions as the input, with each value modified according to the function. For the example, that would be:
[[-0.0196113883]
[-0.0346144244]
[ 5.075857 ]
[ 1.77550086]]
Finally, could someone please provide a solution and explain why this error occurred? Thank you in advance.
x > 0 is evaluated for your numpy array as a whole, returning another array of booleans. However, the if statement needs a single True or False value, and it cannot get one from an array with several elements.
import numpy as np
arr = np.array([[-1.96113883],
[-3.46144244],
[ 5.075857 ],
[ 1.77550086]])
# Elementwise comparison: the result is a boolean array, not a single bool
print(arr > 0)
[[False]
[False]
[ True]
[ True]]
As stated in the error message, the truth value of an array of booleans is ambiguous.
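To make the ambiguity concrete, here is a minimal sketch (the names mask, raised, any_positive and all_positive are only illustrative) that reproduces the error outside of the lambda and shows what the a.any() and a.all() suggestions from the message actually compute:

```python
import numpy as np

arr = np.array([[-1.96113883],
                [-3.46144244],
                [ 5.075857  ],
                [ 1.77550086]])

mask = arr > 0  # elementwise comparison: a 4x1 boolean array

# `if mask:` calls bool(mask) internally, which NumPy refuses
# for arrays with more than one element.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# any() and all() reduce the whole mask to one unambiguous boolean.
any_positive = bool(mask.any())  # at least one element is > 0
all_positive = bool(mask.all())  # every element is > 0

print(raised, any_positive, all_positive)
```

Both reductions answer a question about the array as a whole, so neither one gives you a per-element branch.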
Instead, as noted by ajcr in the comments, you should use np.where for a vectorized if-else statement, e.g.
np.where(arr > 0, arr, arr*0.01)
array([[-0.01961139],
[-0.03461442],
[ 5.075857 ],
[ 1.77550086]])
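If you would rather keep the original lambda unchanged, one possible alternative is np.vectorize, which calls the function once per element. This is only a sketch (f_elementwise and result are illustrative names), and np.vectorize is a convenience wrapper around a Python-level loop, so np.where is usually the faster choice:

```python
import numpy as np

arr = np.array([[-1.96113883],
                [-3.46144244],
                [ 5.075857  ],
                [ 1.77550086]])

# The lambda from the question, unchanged: it expects a single scalar x.
f = lambda x: x if (x > 0) else (x * 0.01)

# np.vectorize wraps f so that it receives one element at a time,
# which makes `x > 0` a plain scalar comparison again.
f_elementwise = np.vectorize(f)
result = f_elementwise(arr)

print(result)
```

The result has the same 4x1 shape as the input and matches what np.where(arr > 0, arr, arr*0.01) returns.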
You are trying to apply your lambda function to the whole array, but what you want is to apply it to every element. There are more idiomatic NumPy solutions for this. Let your array be a and numpy be imported as np. You could use fancy indexing with a boolean mask, which modifies a in place:
>>> a_leq_0 = a <= 0
>>> a[a_leq_0] = a[a_leq_0]*0.01
>>> a
array([[-0.01961139],
[-0.03461442],
[ 5.075857 ],
[ 1.77550086]])
or, even better, np.where:
>>> np.where(a > 0, a, a*0.01)
array([[-0.01961139],
[-0.03461442],
[ 5.075857 ],
[ 1.77550086]])
The explanation is in the docs of where:
where(condition, [x, y])
[...]
If both x and y are specified, the output array contains elements of x where condition is True, and elements from y elsewhere.
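The quoted rule is easier to see on a small made-up example. In this sketch, condition, x, y and picked are illustrative values, not data from the question:

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Position by position: take from x where condition is True,
# otherwise take from y.
picked = np.where(condition, x, y)

print(picked)  # [ 1 20  3 40]
```

Note that x and y are both fully computed before np.where selects from them, so in np.where(a > 0, a, a*0.01) the product a*0.01 is evaluated for every element, including the positive ones that end up unused.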
Why not use a list comprehension directly (keeping positive values and scaling the others by 0.01):
np.array([list(i) if i>0 else list(i*0.01) for i in arr])
Out[28]:
array([[-0.01961139],
[-0.03461442],
[ 5.075857 ],
[ 1.77550086]])
Data
arr = np.array([[-1.96113883],
[-3.46144244],
[ 5.075857 ],
[ 1.77550086]])