Numpy arrays vs lists for custom classes

Posted 2019-08-01 02:33

Question:

I understand why numpy arrays of "standard" types are almost always more efficient than a list containing the same type of data. Therefore it is better to get into the habit of using numpy arrays for the simple stuff, yes?

However, I'd like to know what the pros and cons are of using numpy arrays to 'store' custom class instances, compared to using lists for that.

Consider

import numpy as np

class Foo:
    def __init__(self, name):
        self.name = name

class Bar:
    def __init__(self, name):
        self.name = name
        self.myFoos = np.zeros(0, dtype = Foo)

    def add_foo(self, some_foo):
        self.myFoos = np.append(self.myFoos, some_foo)

I'd be able to do just fine using

self.myFoos = []

What should I keep in mind when making this decision?

Does the complexity of the Foo class make a big difference? (In my use case, it contains maybe 20 or 30 attributes of standard types, one or two fixed-size integer arrays and then about 10 simple methods.)

Does the number of Foos typically in myFoos make a difference? (In my use case, it will be zero to 10.)

Does the number of times myFoos gets handled make a difference? (In my real use case, it will be handled maybe 10 to 20 times between user actions.)

P.S. Although the code works fine, PyCharm doesn't like that last append statement; it warns me that

Expected type 'Union[ndarray, iterable]' got 'Foo' instead.

Thanks in advance!

Answer 1:

I've discussed making arrays of custom objects before - I'll try to look up a good discussion.

But, first a couple of notes

np.zeros(0, dtype = Foo)

is really np.zeros(0, dtype=object). There are the standard dtypes, and there's object. An object array holds pointers to objects elsewhere in memory. Like list elements, these can be anything: numbers, strings, lists, arrays, Foo(), None, etc., and they can be reassigned.
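A quick check of that claim, as a minimal sketch (the Foo class is the one from the question; the variable names arr and mixed are only for illustration):

```python
import numpy as np

class Foo:
    def __init__(self, name):
        self.name = name

# Passing a custom class as dtype falls back to the generic object dtype
arr = np.zeros(0, dtype=Foo)
print(arr.dtype)  # object

# An object array can hold references to anything, like a list
mixed = np.empty(4, dtype=object)
mixed[0] = 1
mixed[1] = "text"
mixed[2] = Foo("a")
mixed[3] = None

# Elements can be reassigned to a different kind of object
mixed[0] = [1, 2, 3]
```

So declaring dtype=Foo does not give numpy any knowledge of Foo; the array just stores references.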

Stay away from np.append. It isn't a good replacement for list append. If you must make a new array with other arrays, learn to use concatenate (that includes getting dimensions right).

self.myFoos = np.append(self.myFoos, some_foo)
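To illustrate why, here is a small sketch (names such as foos_arr and foos_list are made up for the example). np.append returns a new array each time rather than growing the existing one, while list.append modifies the list in place. If an array is really needed later, one option is to build the list first and convert once:

```python
import numpy as np

class Foo:
    def __init__(self, name):
        self.name = name

foos_arr = np.zeros(0, dtype=object)
before = foos_arr
foos_arr = np.append(foos_arr, Foo("a"))

# np.append built a new array; the original one is unchanged
print(before.shape, foos_arr.shape)  # (0,) (1,)
print(before is foos_arr)            # False

# A list grows in place
foos_list = []
foos_list.append(Foo("a"))
foos_list.append(Foo("b"))

# Convert once at the end, if an object array is needed at all
as_array = np.empty(len(foos_list), dtype=object)
as_array[:] = foos_list
```

For zero to 10 items the copying cost is negligible either way, so the list is mainly the simpler choice here.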

Object dtype arrays are basically the same as lists, except that they can be reshaped to 2d and they can't grow with an append. Many operations on an object dtype array are performed with list comprehensions.
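A short sketch of those two differences, assuming the Foo class from the question (the names foos and grid are illustrative):

```python
import numpy as np

class Foo:
    def __init__(self, name):
        self.name = name

# Fill a 1d object array with six instances
foos = np.empty(6, dtype=object)
foos[:] = [Foo(str(i)) for i in range(6)]

# Unlike a list, the array can be reshaped and indexed in 2d
grid = foos.reshape(2, 3)
print(grid.shape)       # (2, 3)
print(grid[1, 2].name)  # 5

# Unlike a list, the array has no in-place append method
print(hasattr(foos, "append"))  # False
```

Unless the 2d indexing is actually useful, this leaves little reason to prefer the array over a plain list.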


Replace elements in array with class instances

Pass output from class instance as input to another

I answered several questions from this poster (search around the same time frame) who was trying to use arrays of a custom class. Just accessing an attribute of such an object requires a list comprehension.
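For example (a minimal sketch; Foo is the class from the question and the variable names are made up), the array itself does not forward attribute access to its elements, so the names have to be collected one element at a time:

```python
import numpy as np

class Foo:
    def __init__(self, name):
        self.name = name

arr = np.empty(3, dtype=object)
arr[:] = [Foo("a"), Foo("b"), Foo("c")]

# The array does not expose the attributes of its elements
try:
    arr.name
    forwarded = True
except AttributeError:
    forwarded = False
print(forwarded)  # False

# Iterate over the elements, exactly as with a list
names = [foo.name for foo in arr]
print(names)  # ['a', 'b', 'c']
```

The comprehension would read the same if arr were a list, so the array adds no convenience for this kind of access.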