I am facing an issue when overloading operators in a class that contains a numpy array as an attribute. Depending on the order of the operands, the result type is either my class A (the desired behavior) or a numpy array. How can I make it always return an instance of A?
Example:
import numpy as np

class A(object):
    """ class overloading a numpy array for addition
    """
    def __init__(self, values):
        self.values = values

    def __add__(self, x):
        """ addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(self.values + x)

    def __radd__(self, x):
        """ reversed-order (LHS <-> RHS) addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(x + self.values)

    def __array__(self):
        """ so that numpy's array() returns values
        """
        return self.values

    def __repr__(self):
        return "A object: "+repr(self.values)
An instance of A:
>>> a = A(np.arange(5))
This works as expected:
>>> a + np.ones(5)
A object: array([ 1., 2., 3., 4., 5.])
This does not:
>>> np.ones(5) + a
array([ 1., 2., 3., 4., 5.])
Even though this is fine:
>>> list(np.ones(5)) + a
A object: array([ 1., 2., 3., 4., 5.])
What happens in the second example is that __radd__ is not called at all; instead, the numpy method __add__ of np.ones(5) is called.
I tried a few suggestions from this post, but __array_priority__ does not seem to make any difference (EDIT after seberg's comment: at least in numpy 1.7.1, but it could work on newer versions), and __set_numeric_ops__ leads to a segmentation fault... I guess I am doing something wrong.
Any suggestion that works on the simple example above (while keeping the __array__ method)?
EDIT: I do not want A to be a subclass of np.ndarray, since this would come with other complications that I want to avoid, for now at least. Note that pandas seems to have got around this problem:
import pandas as pd
df = pd.DataFrame(np.arange(5))
type(df.values + df) is pd.DataFrame # returns True
isinstance(df, np.ndarray) # returns False
I'd be curious to know how this was done.
SOLUTION: in addition to M4rtini's solution of subclassing, it is possible to add an __array_wrap__ method to the class A (to avoid subclassing). More here. According to seberg, __array_priority__ could also work on newer numpy versions (see comment).
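As a hedged illustration of the __array_priority__ route mentioned above: the sketch below assumes a NumPy release in which ndarray's binary operators return NotImplemented when the other operand is not an ndarray subclass, defines no __array_ufunc__, and has a higher __array_priority__ than the array. The value 1000 is an arbitrary choice for illustration, and the dtype/copy parameters of __array__ are assumptions made to match newer NumPy calling conventions.

```python
import numpy as np

class A(object):
    # Assumption: any value above ndarray's default priority (0.0) is enough
    # for ndarray.__add__ to defer to A.__radd__ on recent NumPy versions.
    __array_priority__ = 1000

    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # copy is accepted for compatibility but ignored in this sketch
        return np.asarray(self.values, dtype=dtype)

    def __repr__(self):
        return "A object: " + repr(self.values)

a = A(np.arange(5))
result = np.ones(5) + a   # ndarray.__add__ defers, so A.__radd__ runs
```

If the installed NumPy does not defer on priority (as reported above for 1.7.1), result would be a plain array instead, so it is worth checking type(result) on the target version.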
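Make A a subclass of np.ndarray and Python will invoke your A.__radd__ method first. From the object.__radd__ documentation: if the right operand's type is a subclass of the left operand's type and that subclass provides a different implementation of the reflected method, this method is called before the left operand's non-reflected method.
By subclassing, your A object is indeed able to intercept the addition. Do study the Subclassing ndarray documentation for caveats and implications.
@Martijn Pieters: this does not seem to work as is, because there are special rules for subclassing an ndarray (see here), including using __new__ instead of __init__ and using __array_finalize__. Here is the code that works for me: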
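A minimal sketch of such a subclass, not necessarily the exact code from the answer, could look like this. It assumes A needs no extra attributes, so __array_finalize__ has nothing to copy; with extra attributes, that method is where they would be propagated.

```python
import numpy as np

class A(np.ndarray):
    """ndarray subclass: results of arithmetic keep the subclass type."""

    def __new__(cls, values):
        # View the input data as an instance of A instead of using __init__
        return np.asarray(values).view(cls)

    def __array_finalize__(self, obj):
        # Called after construction, view casting and slicing.
        # Nothing to propagate here because this sketch adds no attributes.
        pass

    def __repr__(self):
        # Convert to a base ndarray first so this does not recurse
        return "A object: " + repr(np.asarray(self))

a = A(np.arange(5))
left = a + np.ones(5)
right = np.ones(5) + a   # also an instance of A
```

Because A is an ndarray subclass, the ufunc machinery wraps the output in the subclass regardless of operand order, so no custom __add__ or __radd__ is needed in this sketch.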
Thanks to @M4rtini and @seberg, it seems that adding __array_wrap__ does solve the problem. It appears to be called at the end of any ufunc operation (this includes array addition). This is also how pandas does it (in 0.12.0, pandas/core/frame.py l. 6020).
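A hedged sketch of the __array_wrap__ approach applied to the example class from the question is shown below. The optional context and return_scalar parameters, and the dtype/copy parameters of __array__, are assumptions added so the same signatures are accepted by both older and newer NumPy releases; the original solution may have used simpler signatures.

```python
import numpy as np

class A(object):
    """Wrapper around a numpy array (not an ndarray subclass)."""

    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # Lets numpy convert A to an array; copy is ignored in this sketch
        return np.asarray(self.values, dtype=dtype)

    def __array_wrap__(self, result, context=None, return_scalar=False):
        # Called by numpy on the output of a ufunc that had A as an input
        return A(result)

    def __repr__(self):
        return "A object: " + repr(self.values)

a = A(np.arange(5))
wrapped = np.ones(5) + a   # ndarray.__add__ -> np.add -> A.__array_wrap__
```

Here np.ones(5) + a still goes through numpy's own __add__, but the ufunc output is passed to A.__array_wrap__, so the result comes back as an instance of A.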