How to make numpy overloading of __add__ independe

Posted 2020-03-24 03:34

I am facing an issue when overloading operators in a class containing a numpy array as an attribute. Depending on the order of the operands, the result is either an instance of my class A (desired behavior) or a plain numpy array. How can I make it always return an instance of A?

Example:

import numpy as np

class A(object):
    """ class overloading a numpy array for addition
    """
    def __init__(self, values):
        self.values = values

    def __add__(self, x):
        """ addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(self.values + x)

    def __radd__(self, x):
        """ reversed-order (LHS <-> RHS) addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return A(x + self.values)

    def __array__(self):
        """ so that numpy's array() returns values
        """
        return self.values

    def __repr__(self):
        return "A object: "+repr(self.values)

An instance of A:

>>> a = A(np.arange(5))

This works as expected:

>>> a + np.ones(5)  
A object: array([ 1.,  2.,  3.,  4.,  5.])

This does not:

>>> np.ones(5) + a
array([ 1.,  2.,  3.,  4.,  5.])

Even though this is fine:

>>> list(np.ones(5)) + a
A object: array([ 1.,  2.,  3.,  4.,  5.])

What happens in the second example is that __radd__ is not called at all; instead, the __add__ method of the numpy array np.ones(5) is called.

I tried a few suggestions from this post but __array_priority__ does not seem to make any difference (EDIT after seberg comment: at least in numpy 1.7.1, but could work on newer versions), and __set_numeric_ops__ leads to Segmentation Fault... I guess I am doing something wrong.

Any suggestion that works on the simple example above (while keeping the __array__ method)?

EDIT: I do not want A to be a subclass of np.ndarray, since this would come with other complications that I want to avoid, for now at least. Note that pandas seems to have got around this problem:

import pandas as pd
df = pd.DataFrame(np.arange(5)) 
type(df.values + df) is pd.DataFrame  # returns True
isinstance(df, np.ndarray) # returns False

I'd be curious to know how this was done.

SOLUTION: in addition to M4rtini's solution of subclassing, it is possible to add an __array_wrap__ method to the class A (to avoid subclassing). More here. According to seberg, __array_priority__ could also work on newer numpy versions (see comment).
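
A minimal sketch of the __array_priority__ route mentioned above. It assumes a reasonably recent numpy in which ndarray.__add__ returns NotImplemented for a non-ndarray operand that advertises a higher __array_priority__, so that Python then falls back to A.__radd__. The value 1000 is an arbitrary choice (anything above ndarray's default of 0.0 should do), and the extra __array__ parameters are only there to accept the arguments different numpy versions may pass:

```python
import numpy as np

class A(object):
    # Assumption: a priority above ndarray's default (0.0) makes
    # ndarray.__add__ give up and let Python try A.__radd__.
    __array_priority__ = 1000

    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # so that numpy's array() returns values
        if dtype is None:
            return self.values
        return self.values.astype(dtype)

    def __repr__(self):
        return "A object: " + repr(self.values)

a = A(np.arange(5))
left = a + np.ones(5)    # handled by A.__add__
right = np.ones(5) + a   # ndarray.__add__ defers, then A.__radd__ runs
print(type(left).__name__, type(right).__name__)
```

On numpy versions that ignore __array_priority__ for this purpose (the question reports 1.7.1), the second result would still be a plain array.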

3 answers
对你真心纯属浪费
#2 · 2020-03-24 03:36

Make A a subclass of np.ndarray and Python will invoke your A.__radd__ method first.

From the object.__radd__ documentation:

Note: If the right operand’s type is a subclass of the left operand’s type and that subclass provides the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations.
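
The rule quoted above can be seen without numpy at all. The following is a small pure-Python sketch; the class names Base, Child and Unrelated are made up for illustration:

```python
class Base(object):
    def __add__(self, other):
        return "Base.__add__"

    def __radd__(self, other):
        return "Base.__radd__"

class Child(Base):
    # Overrides the reflected method of its ancestor.
    def __radd__(self, other):
        return "Child.__radd__"

class Unrelated(object):
    def __radd__(self, other):
        return "Unrelated.__radd__"

# Right operand is a subclass of the left operand's type and provides
# its own __radd__, so Python tries Child.__radd__ first.
subclass_result = Base() + Child()

# Right operand is not a subclass, so Base.__add__ is tried first and,
# since it does not return NotImplemented, Unrelated.__radd__ never runs.
unrelated_result = Base() + Unrelated()

print(subclass_result)
print(unrelated_result)
```

This is why a class that does not inherit from np.ndarray never gets its __radd__ called as long as ndarray.__add__ succeeds on its own.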

By subclassing, your A object is indeed able to intercept the addition:

>>> import numpy as np
>>> class A(np.ndarray):
...     """ class overloading a numpy array for addition
...     """
...     def __init__(self, values):
...         self.values = values
...     def __add__(self, x):
...         """ addition
...         """
...         x = np.array(x) # make sure input is numpy compatible
...         return A(self.values + x)
...     def __radd__(self, x):
...         """ reversed-order (LHS <-> RHS) addition
...         """
...         x = np.array(x) # make sure input is numpy compatible
...         return A(x + self.values)
...     def __array__(self):
...         """ so that numpy's array() returns values
...         """
...         return self.values
...     def __repr__(self):
...         return "A object: "+repr(self.values)
... 
>>> a = A(np.arange(5))
>>> a + np.ones(5)  
A object: array([ 1.,  2.,  3.,  4.,  5.])
>>> np.ones(5) + a
A object: array([ 1.,  2.,  3.,  4.,  5.])

Do study the Subclassing ndarray documentation for caveats and implications.

做自己的国王
#3 · 2020-03-24 03:38

@Martijn Pieters' answer does not seem to work, as there are special rules for subclassing an ndarray (see here), including using __new__ instead of __init__ and implementing __array_finalize__.

Here is the code that works for me:

import numpy as np

class Abstract_Array(np.ndarray):
    """ class overloading a numpy array for addition
    """
    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)
        return obj

    def __array_finalize__(self, obj):
        return None

    def __add__(self, x):
        """ addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return Abstract_Array(addfunc(self,x)) # define your own add function

    def __radd__(self, x):
        """ reversed-order (LHS <-> RHS) addition
        """
        x = np.array(x) # make sure input is numpy compatible
        return Abstract_Array(raddfunc(self,x)) # define your own reversed add function

    def __array__(self):
        """ so that numpy's array() returns values
        """
        return self

    def __repr__(self):
        return "Abstract_Array object of shape %s: \n %s" % (str(self.shape), str(self)[:100])
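
Since addfunc and raddfunc are left undefined above, here is a self-contained sketch of the same idea that can be run as-is. The class name AddArray is hypothetical, and np.add on base-class views stands in for the user-defined add functions (an assumption, chosen so that the operators do not call themselves recursively):

```python
import numpy as np

class AddArray(np.ndarray):
    """ndarray subclass created with __new__ rather than __init__."""

    def __new__(cls, input_array):
        # View the input data as an instance of this subclass.
        return np.asarray(input_array).view(cls)

    def __array_finalize__(self, obj):
        # No extra attributes to copy over in this sketch.
        pass

    def __add__(self, x):
        # np.asarray strips the subclass, so np.add works on plain arrays.
        result = np.add(np.asarray(self), np.asarray(x))
        return result.view(type(self))

    def __radd__(self, x):
        result = np.add(np.asarray(x), np.asarray(self))
        return result.view(type(self))

a = AddArray(np.arange(5))
forward = a + np.ones(5)
reflected = np.ones(5) + a  # AddArray.__radd__ is tried first (subclass rule)
print(type(forward).__name__, type(reflected).__name__)
```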
倾城 Initia
#4 · 2020-03-24 03:48

Thanks to @M4rtini and @seberg, it seems that adding __array_wrap__ does solve the question:

class A(object):
    ...
    def __array_wrap__(self, result):
        return A(result)  # other attributes of self can be passed to the constructor here

It appears to be called at the end of any ufunc operation (which includes array addition). This is also how pandas does it (in 0.12.0, pandas/core/frame.py, line 6020).
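
For completeness, a runnable sketch of the full class with __array_wrap__ added. The optional context and return_scalar parameters, and the dtype and copy parameters of __array__, are assumptions made so the methods accept whatever arguments different numpy versions pass; only result is actually used:

```python
import numpy as np

class A(object):
    def __init__(self, values):
        self.values = np.asarray(values)

    def __add__(self, x):
        return A(self.values + np.asarray(x))

    def __radd__(self, x):
        return A(np.asarray(x) + self.values)

    def __array__(self, dtype=None, copy=None):
        # so that numpy's array() returns values
        if dtype is None:
            return self.values
        return self.values.astype(dtype)

    def __array_wrap__(self, result, context=None, return_scalar=False):
        # Called by numpy on the raw ufunc result; re-wrap it in A.
        return A(result)

    def __repr__(self):
        return "A object: " + repr(self.values)

a = A(np.arange(5))
via_operator = np.ones(5) + a      # ndarray.__add__ -> np.add -> __array_wrap__
via_ufunc = np.add(np.ones(5), a)  # same path, called explicitly
print(type(via_operator).__name__, type(via_ufunc).__name__)
```

Note that here __radd__ is still not called for np.ones(5) + a; the result is an A only because numpy hands its output to __array_wrap__.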
