Python: Add function to an array in a FOR loop

Posted 2020-07-29 03:26

Question:

Maybe this is a simple issue, but I could not find any information about it so far. For an optimization in numpy I need an array of functions. The number of functions I need depends on the current object which shall be optimized. I have already figured out how to create these functions dynamically, but now I would like to store them in an array like this:

myArray = zeros(x)   
for i in range(x):
  myArray[i] = createFunction(i)

If I run this I get a type mismatch: float() argument must be a string or a number, not 'function'

Creating the array directly works well:

  myArray = array([createFunction(0)...])

But because I don't know the number of functions I need, this is exactly what I want to prevent.

Answer 1:

Ah, I get it. You really do mean an array of functions.

The type mismatch error arises because the call to zeros creates an array of floats by default. So your original would work if instead you did myArray = numpy.empty(x, dtype=object) (note that empty makes more sense than zeros here; the numpy.object alias seen in older code was removed in NumPy 1.24, so use the builtin object or numpy.object_). The slightly more pythonic version is to use a list comprehension:

myArray = numpy.array([createFunction(i) for i in range(x)])
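To make the loop version concrete, here is a minimal sketch of filling a preallocated object array. The name create_function and its body are hypothetical stand-ins for the asker's createFunction, which is not shown in the question.

```python
import numpy as np

def create_function(i):
    # Hypothetical stand-in for createFunction: returns a closure that adds i.
    def add_i(value):
        return value + i
    return add_i

x = 3  # number of functions, assumed to be known only at runtime

# dtype=object makes every slot hold a reference to an arbitrary Python object.
my_array = np.empty(x, dtype=object)
for i in range(x):
    my_array[i] = create_function(i)

# Each element is an ordinary callable.
results = [f(10) for f in my_array]
```

Here results ends up as [10, 11, 12], one value per stored function.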

But you might not need to create a numpy array at all, depending on what you want to do with it:

myArray = [createFunction(i) for i in range(x)]

If you want to avoid the list, it might be better to use numpy.fromfunction along with numpy.vectorize:

myArray = numpy.fromfunction(numpy.vectorize(createFunction), 
                             shape=(x,), dtype=int)

where (x,) is a tuple giving the shape of the array. The call to vectorize is needed because fromfunction calls the function once with whole arrays of indices and expects an array back, and vectorize converts a per-element function to do exactly that. The dtype argument sets the type of those index arrays; it is float by default, so dtype=int is passed here to hand createFunction integer indices. The result is still an object array, because its elements are functions.
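The following sketch illustrates why the wrapper is needed: it records what fromfunction actually passes to the callable, then builds the function array with vectorize. The helper names (create_function, record) are hypothetical, and passing otypes=[object] is an extra assumption made here so the output type does not have to be inferred.

```python
import numpy as np

def create_function(i):
    # Hypothetical stand-in for createFunction: returns a closure that scales by i.
    def scale(value):
        return value * i
    return scale

x = 4

# fromfunction calls the callable once, with the full index array.
calls = []
def record(indices):
    calls.append(indices)
    return indices

np.fromfunction(record, (x,), dtype=int)

# vectorize applies create_function element by element;
# otypes=[object] states the output type explicitly.
make_all = np.vectorize(create_function, otypes=[object])
my_array = np.fromfunction(make_all, (x,), dtype=int)

value = my_array[3](5)  # calls the closure built for index 3
```

After running this, calls holds a single array with the indices 0 to 3, and value is 15.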



Answer 2:

Maybe you can use

myArray = array([createFunction(i) for i in range(x)])


Answer 3:

If you need an array of functions, is it possible to not use NumPy? NumPy arrays have C-style element types, and the default is float. If you can, just use a standard Python list. But if you absolutely must use NumPy, try defining the array like so:

import numpy as np
a = np.empty([x], dtype=np.dtype(np.object_))

Or however you need it to be with that dtype.



Answer 4:

Numpy arrays are homogeneous. That is, all elements of a numpy array are of the same type -- python is duck-typed, numpy isn't. This is part of what makes matrix operations on numpy arrays and matrices so fast. However, because of this, a data type must be known when the array is first created. Numpy is generally very good at inferring the data type. The problem comes when creating an empty or zeroed array. Since there are no elements to examine, numpy must guess the data type. Numpy defaults to numpy.float64 if it isn't given a data type at array creation time. This is a decent choice, as numpy is typically used in scientific or engineering areas where floating point numbers are required. This is also why numpy is complaining: it can't store your functions as 64-bit floating point numbers.
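A short sketch of the behaviour described above, using a hypothetical function f: it inspects the default dtype, reproduces the failed assignment, and shows the dtype numpy infers when it is given the functions up front.

```python
import numpy as np

def f():
    # Hypothetical function to store in an array.
    return 1.0

zeros = np.zeros(3)
default_dtype = zeros.dtype  # dtype chosen when none is given

# Assigning a function into a float array fails with a TypeError.
try:
    zeros[0] = f
    error = None
except TypeError as exc:
    error = exc

# With the elements available, numpy infers a dtype that can hold them.
inferred_dtype = np.array([f, f]).dtype
```

In this sketch default_dtype is float64, error holds the TypeError from the question, and inferred_dtype is the generic object dtype.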

The quick solution is to let numpy know the data type you want. eg.

myArray = numpy.zeros(x, dtype=object)

Note that the data type cannot be an arbitrary class; it must be something numpy can convert to a numpy.dtype (for advanced use you can create additional dtypes at runtime that numpy can then manipulate). Functions are stored with the object dtype (numpy.object_), which means any generic python object. I do not think you will get any performance benefit from using numpy to store arrays of functions. Perhaps you would be better off creating generator functions and chaining them, converting to a numpy array once you know the result will be a number.

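funcs = [createFunction(i) for i in range(x)]

def getItemFromEachFunction(i):
    # Call the i-th function and return its numeric result.
    return funcs[i]()

# fromfunction passes the whole index array in one call, so vectorize the
# helper and request integer indices.
arr = numpy.fromfunction(numpy.vectorize(getItemFromEachFunction), (x,), dtype=int)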
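As one possible reading of the generator suggestion, the functions can stay in a plain list and only their numeric results go into a numpy array, for example via numpy.fromiter. This is a sketch under the assumption that each function takes no arguments and returns a float; create_function is a hypothetical stand-in.

```python
import numpy as np

def create_function(i):
    # Hypothetical stand-in for createFunction: zero-argument function
    # returning a number.
    def square():
        return float(i * i)
    return square

x = 4
funcs = [create_function(i) for i in range(x)]

# The generator expression calls each function lazily; fromiter collects the
# numeric results directly into a float array of known length.
arr = np.fromiter((f() for f in funcs), dtype=float, count=len(funcs))
```

With these assumptions arr is a float64 array holding 0.0, 1.0, 4.0 and 9.0, so the numeric work can be vectorized from that point on.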