What is the difference between vectorize and frompyfunc in numpy?
Both seem very similar. What is a typical use case for each of them?
Edit: As JoshAdel indicates, the class vectorize
seems to be built upon frompyfunc
. (see the source). It is still unclear to me whether frompyfunc
may have any use case that is not covered by vectorize
...
As JoshAdel points out, vectorize
wraps frompyfunc
. Vectorize adds extra features:
- Copies the docstring from the original function
- Allows you to exclude an argument from broadcasting rules.
- Returns an array of the correct dtype instead of dtype=object
Edit: After some brief benchmarking, I find that vectorize
is significantly slower (~50%) than frompyfunc
for large arrays. If performance is critical in your application, benchmark your use-case first.
`
>>> a = numpy.indices((3,3)).sum(0)
>>> print a, a.dtype
[[0 1 2]
[1 2 3]
[2 3 4]] int32
>>> def f(x,y):
"""Returns 2 times x plus y"""
return 2*x+y
>>> f_vectorize = numpy.vectorize(f)
>>> f_frompyfunc = numpy.frompyfunc(f, 2, 1)
>>> f_vectorize.__doc__
'Returns 2 times x plus y'
>>> f_frompyfunc.__doc__
'f (vectorized)(x1, x2[, out])\n\ndynamic ufunc based on a python function'
>>> f_vectorize(a,2)
array([[ 2, 4, 6],
[ 4, 6, 8],
[ 6, 8, 10]])
>>> f_frompyfunc(a,2)
array([[2, 4, 6],
[4, 6, 8],
[6, 8, 10]], dtype=object)
`
I'm not sure what the different use cases for each is, but if you look at the source code (/numpy/lib/function_base.py), you'll see that vectorize
wraps frompyfunc
. My reading of the code is mostly that vectorize
is doing proper handling of the input arguments. There might be particular instances where you would prefer one vs the other, but it would seem that frompyfunc
is just a lower level instance of vectorize
.
Although both methods provide you a way to build your own ufunc, numpy.frompyfunc method always returns a python object, while you could specify a return type when using numpy.vectorize method