Let's suppose that we have a Python function that takes in Numpy arrays and returns another array:
import numpy as np
def f(x, y, method='p'):
"""Parameters: x (np.ndarray) , y (np.ndarray), method (str)
Returns: np.ndarray"""
z = x.copy()
if method == 'p':
mask = x < 0
else:
mask = x > 0
z[mask] = 0
return z*y
although the actual implementation does not matter. We can assume that x
and y
will always be arrays of the same shape, and that the output is of the same shape as x
.
The question is what would be the simplest/most elegant way of wrapping such function so it would work with both ND arrays (N>1) and scalar arguments, in a manner somewhat similar to universal functions in Numpy.
For instance, the expected output for the above function should be,
In [1]: f_ufunc(np.arange(-1,2), np.ones(3), method='p')
Out[1]: array([ 0., 0., 1.]) # random array input -> output of the same shape
In [2]: f_ufunc(np.array([1]), np.array([1]), method='p')
Out[2]: array([1]) # array input of len 1 -> output of len 1
In [3]: f_ufunc(1, 1, method='p')
Out[3]: 1 # scalar input -> scalar output
The function
f
cannot be changed, and it will fail if given a scalar argument forx
ory
.When
x
andy
are scalars, we transform them to 1D arrays, do the calculation then transform them back to scalars at the end.f
is optimized to work with arrays, scalar input being mostly a convenience. So writing a function that work with scalars and then usingnp.vectorize
ornp.frompyfunc
would not be acceptable.
A beginning of an implementation could be,
def atleast_1d_inverse(res, x):
# this function fails in some cases (see point 1 below).
if res.shape[0] == 1:
return res[0]
else:
return res
def ufunc_wrapper(func, args=[]):
""" func: the wrapped function
args: arguments of func to which we apply np.atleast_1d """
# this needs to be generated dynamically depending on the definition of func
def wrapper(x, y, method='p'):
# we apply np.atleast_1d to the variables given in args
x = np.atleast_1d(x)
y = np.atleast_1d(x)
res = func(x, y, method='p')
return atleast_1d_inverse(res, x)
return wrapper
f_ufunc = ufunc_wrapper(f, args=['x', 'y'])
which mostly works, but will fail the tests 2 above, producing a scalar output instead of a vector one. If we want to fix that, we would need to add more tests on the type of the input (e.g. isinstance(x, np.ndarray)
, x.ndim>0
, etc), but I'm afraid to forget some corner cases there. Furthermore, the above implementation is not generic enough to wrap a function with a different number of arguments (see point 2 below).
This seems to be a rather common problem, when working with Cython / f2py function, and I was wondering if there was a generic solution for this somewhere?
Edit: a bit more precisions following @hpaulj's comments. Essentially, I'm looking for
a function that would be the inverse of
np.atleast_1d
, such asatleast_1d_inverse( np.atleast_1d(x), x) == x
, where the second argument is only used to determine the type or the number of dimensions of the original objectx
. Returning numpy scalars (i.e. arrays withndim = 0
) instead of a python scalar is ok.A way to inspect the function f and generate a wrapper that is consistent with its definition. For instance, such wrapper could be used as,
f_ufunc = ufunc_wrapper(f, args=['x', 'y'])
and then if we have a different function
def f2(x, option=2): return x**2
, we could also usef2_ufunc = ufunc_wrapper(f2, args=['x'])
.
Note: the analogy with ufuncs might be a bit limited, as this corresponds to the opposite problem. Instead of having a scalar function that we transform to accept both vector and scalar input, I have a function designed to work with vectors (that can be seen as something that was previously vectorized), that I would like to accept scalars again, without changing the original function.
This doesn't fully answer the question of making a vectorized function truly behave like a
ufunc
, but I did recently run into a slight annoyance withnumpy.vectorize
that sounds similar to your issue. That wrapper insists on returning anarray
(withndim=0
andshape=()
) even if passed scalar inputs.But it appears that the following does the right thing. In this case I am vectorizing a simple function to return a floating point value to a certain number of significant digits.
This gives