Alright, I apologize ahead of time if I'm asking something silly, but I really thought I understood how `apply_along_axis` worked. I just ran into something that might be an edge case I didn't consider, and it's baffling me. In short, this is the code that is confusing me:

    import numpy as np

    class Leaf(object):
        def __init__(self, location):
            self.location = location

        def __len__(self):
            return self.location.shape[0]

    def bulk_leaves(child_array, axis=0):
        test = np.array([Leaf(location) for location in child_array])  # This is what I want
        check = np.apply_along_axis(Leaf, 0, child_array)  # This returns an array of individual leafs with the same shape as child_array
        return test, check

    if __name__ == "__main__":
        test, check = bulk_leaves(np.random.rand(100, 50))
        test == check  # False

I always feel silly using a list comprehension with numpy and then casting back to an array, but I'm just not sure of another way to do this. Am I just missing something obvious?

`apply_along_axis` is pure Python that you can look at and decode yourself. In this case it essentially preallocates the container array and then fills in the values with an iteration. That is certainly better than appending to the array, but rarely better than appending values to a list (which is what the comprehension is doing).

You could take that preallocate-and-fill template and adjust it to produce the array that you really want.
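
A minimal sketch of that adjusted template, assuming the goal is one `Leaf` per row of `child_array`, as in the question's list comprehension. The function name `bulk_leaves_loop` is hypothetical and only used for illustration:

```python
import numpy as np


class Leaf(object):
    def __init__(self, location):
        self.location = location

    def __len__(self):
        return self.location.shape[0]


def bulk_leaves_loop(child_array):
    # Hypothetical helper: preallocate a 1-D object array, one slot per row.
    leaves = np.empty(child_array.shape[0], dtype=object)
    for i in range(child_array.shape[0]):
        # Integer indexing stores the Leaf instance itself in slot i.
        leaves[i] = Leaf(child_array[i])
    return leaves


leaves = bulk_leaves_loop(np.random.rand(100, 50))
```

Because each `Leaf` is assigned to a single slot of an object array, the result stays 1-D regardless of what `__len__` returns.
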
In quick tests this iteration times the same as the comprehension. `apply_along_axis`, besides being wrong, is slower.

The problem seems to be that `apply_along_axis` uses `isscalar` to determine whether the returned object is a scalar, but `isscalar` returns `False` for user-defined classes. The documentation for `apply_along_axis` says that the output array has the same shape as the input except along the `axis` dimension, where its length equals the size of the return value of `func1d`, and that the output has one fewer dimension if `func1d` returns a scalar.

Since your class's `__len__` returns the length of the array it wraps, numpy "expands" the resulting array into the original shape. If you don't define a `__len__`, you'll get an error, because numpy doesn't think user-defined types are scalars, so it will still try to call `len` on it.

As far as I can see, there is no way to make this work with a user-defined class. You can return 1 from `__len__`, but then you'll still get an Nx1 2D result, not a 1D array of length N. I don't see any way to make numpy see a user-defined instance as a scalar.

There is a numpy bug about the `apply_along_axis` behavior, but surprisingly I can't find any discussion of the underlying issue that `isscalar` returns `False` for non-numpy objects. It may be that numpy just decided to punt and not guess whether user-defined types are vector or scalar. Still, it might be worth asking about this on the numpy list, as it seems odd to me that things like `isscalar(object())` return `False`.

However, if as you say you don't care about performance anyway, it doesn't really matter. Just use your first way with the list comprehension, which already does what you want.
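
For reference, the `isscalar` behavior described above can be checked directly. This is only a small illustration; the stripped-down `Leaf` class and the variable names are assumptions made for the example:

```python
import numpy as np


class Leaf(object):
    def __init__(self, location):
        self.location = location


number_is_scalar = np.isscalar(3.0)              # plain Python float
object_is_scalar = np.isscalar(object())         # bare Python object
leaf_is_scalar = np.isscalar(Leaf(np.zeros(5)))  # user-defined instance

print(number_is_scalar, object_is_scalar, leaf_is_scalar)  # True False False
```
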