Why can't sympy's lambdify function identify numpy functions?

Posted 2020-05-03 13:14

Question:

I want to use sympy and numpy to learn machine learning, because sympy provides very convenient partial derivative calculation. But in the process, I found that the sympy lambdify function can't identify the numpy sum and multiply functions.


Take the following example:

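import numpy as np
from sympy import symbols, lambdify

w, x, b = symbols('w x b')  # scalar symbols, consistent with the printed "b + w*x"
y_ = np.sum(np.dot(w,x)+b)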
print(y_)
y_f = lambdify((w,x,b),y_,"numpy")
w_l = np.mat([1,1,1,1,1])
x_l= np.mat([1,1,1,1,1]).T
b_l = np.mat([0,0,0,0,0]).T
y_l = np.mat([6,6,6,6,6]).T
print(y_f(w_l,x_l,b_l))

b + w*x
[[5]
 [5]
 [5]
 [5]
 [5]]

Process finished with exit code 0

y_ = np.multiply(w,x)+b
print(y_)
y_f = lambdify((w,x,b),y_,"numpy")
w_l = np.mat([1,1,1,1,1]).T
x_l= np.mat([1,1,1,1,1]).T
b_l = np.mat([0,0,0,0,0]).T
y_l = np.mat([6,6,6,6,6]).T
print(y_f(w_l,x_l,b_l))

b + w*x
Traceback (most recent call last):
  File "G:/lijie/PycharmProjects/hw3/test.py", line 24, in <module>
    print(y_f(w_l,x_l,b_l))
  File "<lambdifygenerated-1>", line 2, in _lambdifygenerated
  File "C:\Users\lijie\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\matrixlib\defmatrix.py", line 220, in __mul__
    return N.dot(self, asmatrix(other))
ValueError: shapes (5,1) and (5,1) not aligned: 1 (dim 1) != 5 (dim 0)

As you can see, lambdify simply accepts the lambda expression without checking the operation notation. How can I solve this problem? Thank you for your help.

Answer 1:

Mixing numpy and sympy can be tricky; add to that the potential confusions caused by np.mat instead of the base array type, ndarray.

In summary

y_ = np.sum(np.dot(w,x)+b)

evaluates a python/numpy expression on sympy objects. The result is a sympy expression w*x+b. The sympy objects are scalars, so this doesn't encode any sort of matrix multiplication, or array summation. The multiply expression evaluates the same way.

The lambdify expressions then translate the same y_ to the same Python function. And that evaluation depends on the dimensions and class of the np.mat arguments.
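A minimal sketch of that collapse, assuming w, x and b are plain sympy Symbols (the question does not show how they were defined):

```python
import numpy as np
from sympy import symbols

# Assumption: plain scalar Symbols, as the printed "b + w*x" suggests.
w, x, b = symbols("w x b")

expr_sum = np.sum(np.dot(w, x) + b)   # numpy functions applied to sympy objects
expr_mul = np.multiply(w, x) + b
expr_plain = w * x + b                # ordinary sympy arithmetic

# All three should compare equal: they are the same scalar sympy expression.
print(expr_sum, expr_mul, expr_plain)
print(expr_sum == expr_plain, expr_mul == expr_plain)
```

By the time lambdify sees y_, the information about sum, dot or multiply is already gone.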

details

Ignoring the sympy part for now:

In [310]: w = np.mat([1,1,1,1,1]) 
     ...: x= np.mat([1,1,1,1,1]).T 
     ...: b = np.mat([0,0,0,0,0]).T 
     ...: y = np.mat([6,6,6,6,6]).T                                             
In [311]: np.sum(np.dot(w,x)+b)                                                 
Out[311]: 25
In [312]: np.multiply(w,x)+b                                                    
Out[312]: 
matrix([[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]])

Because they are np.mat, both w and x are 2d:

In [316]: w.shape                                                               
Out[316]: (1, 5)
In [317]: x.shape                                                               
Out[317]: (5, 1)

np.dot of (1,5) with (5,1) is a (1,1) result:

In [313]: np.dot(w,x)                                                           
Out[313]: matrix([[5]])

and for np.matrix, * is defined as the dot:

In [314]: w*x                                                                   
Out[314]: matrix([[5]])

Elementwise:

In [315]: np.multiply(w,x)         # elementwise produces (5,5)                                   
Out[315]: 
matrix([[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]])

np.sum(np.dot(w,x)+b) does the dot, then adds b, and ends with a sum over all elements.

np.multiply(w,x)+b does this multiply, adds b. There's no sum.

correction

Using the w.T that I missed the first time:

In [322]: np.multiply(w.T,x)                                                    
Out[322]: 
matrix([[1],
        [1],
        [1],
        [1],
        [1]])
In [323]: w.T*x                                                                 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-323-11ad839cfa88> in <module>
----> 1 w.T*x

/usr/local/lib/python3.6/dist-packages/numpy/matrixlib/defmatrix.py in __mul__(self, other)
    218         if isinstance(other, (N.ndarray, list, tuple)) :
    219             # This promotes 1-D vectors to row vectors
--> 220             return N.dot(self, asmatrix(other))
    221         if isscalar(other) or not hasattr(other, '__rmul__') :
    222             return N.dot(self, other)

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (5,1) and (5,1) not aligned: 1 (dim 1) != 5 (dim 0)

np.multiply of (5,1) and (5,1) produces (5,1), elementwise multiplication.

w.T*x is matrix multiplication for np.mat, hence the np.dot error.

The use of np.mat is discouraged (if not formally deprecated). In numpy, the addition of matmul/@ eliminates its notational advantages. Life is simpler in numpy if you stick with the base array class, ndarray. I realize that sympy still uses a 2d matrix concept, with * as matrix multiplication.
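As a rough illustration of the ndarray alternative, the sketch below rebuilds the same shapes with plain arrays (np.ones and np.zeros stand in for the np.mat values above) and uses @ for the matrix product and * for the elementwise one:

```python
import numpy as np

# Plain ndarrays with the same shapes as the np.mat objects above.
w = np.ones((1, 5))
x = np.ones((5, 1))
b = np.zeros((5, 1))

dot_result = w @ x        # matrix product, shape (1, 1)
elementwise = w.T * x     # elementwise product, shape (5, 1)
broadcast = w * x         # broadcasting (1, 5) with (5, 1) gives (5, 5)

print(dot_result.shape, elementwise.shape, broadcast.shape)
total = np.sum(w @ x + b)  # b is added by broadcasting, then all elements are summed
print(total)
```

With ndarray, * never means matrix multiplication, so the shape error from w.T*x does not arise.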

with sympy

In an isympy session, I find that I need to define w,x,b as symbols:

y_ = np.sum(np.dot(w,x)+b)

If w,x,b are just Symbols, they are scalars, not matrices or arrays. Your np.sum(np.dot(1,2)+4), np.multiply(1,2)+4 and 1*2+4 all produce the same thing. It's only when the variables are arrays, or np.mat, or maybe sympy.Matrix that the expressions are different.
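One way to keep the matrix meaning on the sympy side is MatrixSymbol. This is only a sketch of that option, not something the question used; the names W and X and their shapes are assumptions chosen to match the (1,5) and (5,1) arrays above:

```python
import numpy as np
from sympy import MatrixSymbol, lambdify

# Assumption: symbolic matrices with explicit shapes instead of scalar Symbols.
W = MatrixSymbol("W", 1, 5)
X = MatrixSymbol("X", 5, 1)

# Here W * X is a symbolic matrix product, not a scalar product.
f_mat = lambdify((W, X), W * X, "numpy")

result = f_mat(np.ones((1, 5)), np.ones((5, 1)))
print(result.shape, result)
```

Because the expression itself records a matrix product, the generated function performs a dot product even when given plain ndarrays.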

The problem isn't with lambdify. In both cases it is given the same y_ (as verified by print(y_)). You get the error because the arguments are np.mat, and * is matrix multiplication for np.mat.
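A hedged sketch of the same lambdified function called with ndarray arguments instead of np.mat (the (5, 1) arrays here are stand-ins for the question's column vectors):

```python
import numpy as np
from sympy import symbols, lambdify

w, x, b = symbols("w x b")
y_f = lambdify((w, x, b), w * x + b, "numpy")

# Assumption: (5, 1) ndarrays replace the np.mat column vectors from the question.
w_l = np.ones((5, 1))
x_l = np.ones((5, 1))
b_l = np.zeros((5, 1))

elementwise = y_f(w_l, x_l, b_l)   # * is elementwise for ndarray, so no shape error
total = np.sum(elementwise)        # apply the sum on the numeric side if it is needed
print(elementwise.shape, total)
```

The generated function is unchanged; only the meaning of * differs between np.mat and ndarray.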

With x,y,z symbols:

In [55]: f = lambdify((x,y,z),x*y+z, 'numpy')                                   

Using isympy introspection:

In [56]: f??                                                                    
Signature: f(x, y, z)
Docstring:
Created with lambdify. Signature:

func(x, y, z)

Expression:

x*y + z

Source code:

def _lambdifygenerated(x, y, z):
    return (x*y + z)


Imported modules:
Source:   
def _lambdifygenerated(x, y, z):
    return (x*y + z)
File:      ~/mypy/<lambdifygenerated-4>
Type:      function
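Outside IPython, the generated source can usually be retrieved with the standard inspect module, as the lambdify documentation also demonstrates. A small sketch:

```python
import inspect
from sympy import symbols, lambdify

x, y, z = symbols("x y z")
f = lambdify((x, y, z), x * y + z, "numpy")

# lambdify registers the generated source so that inspect can retrieve it.
source = inspect.getsource(f)
print(source)

# The function is ordinary Python: * and + mean whatever the arguments define.
print(f(2, 3, 4))
```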

Read the full documentation for lambdify. Note that it is basically a lexical substitution:

https://docs.sympy.org/latest/modules/utilities/lambdify.html

This documentation warns:

As a general rule, NumPy functions do not know how to operate on SymPy expressions, and SymPy functions do not know how to operate on NumPy arrays. This is why lambdify exists: to provide a bridge between SymPy and NumPy.
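For the partial-derivative use case from the question, the bridge might look like the sketch below. The squared-error loss and the symbol y are hypothetical, chosen only for illustration; the derivative is taken symbolically first and converted to a numpy function afterwards:

```python
import numpy as np
from sympy import symbols, diff, lambdify

w, x, b, y = symbols("w x b y")

# Hypothetical per-sample loss, used only to illustrate the workflow.
loss = (w * x + b - y) ** 2
dloss_dw = diff(loss, w)          # symbolic partial derivative with respect to w

# Convert the symbolic derivative into a function of numpy arrays.
grad_w = lambdify((w, x, b, y), dloss_dw, "numpy")

values = grad_w(np.ones(5), np.ones(5), np.zeros(5), np.full(5, 6.0))
print(dloss_dw)
print(values)                     # one gradient value per sample, elementwise
```

Any summation over samples is then done with numpy on the returned array, not inside the sympy expression.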

sympify

https://docs.sympy.org/latest/modules/core.html#module-sympy.core.sympify

says it uses eval. With x,y,z defined as symbols:

In [66]: eval('np.dot(x,y)+z')                                                  
Out[66]: x⋅y + z

In [67]: eval('np.sum(np.dot(x,y)+z)')                                          
Out[67]: x⋅y + z

In [68]: eval('np.multiply(x,y)+z')                                             
Out[68]: x⋅y + z

In other words, it just passes the symbols to the numpy functions (and/or operators):

In [69]: np.dot(x,y)                                                            
Out[69]: x⋅y

dot turns its inputs into arrays:

In [70]: np.array(x)                                                            
Out[70]: array(x, dtype=object)

In [71]: np.dot(np.array(x), np.array(y))                                       
Out[71]: x⋅y

This works because symbols have '*' and '+' defined.
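The same mechanism extends to object-dtype arrays that hold several symbols. In this sketch (the names w1, w2, x1, x2 are made up for illustration), np.dot really does produce a sum of products, because it only needs * and + on the elements:

```python
import numpy as np
from sympy import symbols

w1, w2, x1, x2 = symbols("w1 w2 x1 x2")

w_vec = np.array([w1, w2])   # object-dtype array of Symbols
x_vec = np.array([x1, x2])

dot_expr = np.dot(w_vec, x_vec)
print(w_vec.dtype, dot_expr)
```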

The sympy docs warn that numpy does not 'know' anything about sympy objects. It treats them as object dtype arrays, which may or may not work:

In [72]: sin(x)       # sympy sin                                                          
Out[72]: sin(x)

In [73]: np.sin(x)        # numpy sin                                                      
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'Symbol' object has no attribute 'sin'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-73-92f2c2d0df9d> in <module>
----> 1 np.sin(x)

TypeError: loop of ufunc does not support argument 0 of type Symbol which has no callable sin method

np.sin does np.sin(np.array(x)) and then delegates the action to a sin method of x, which does not exist.
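A short sketch of the contrast: calling np.sin on a Symbol fails, while building the expression with the sympy sin and then using lambdify yields a function that uses numpy's sin on numeric input. The exact exception type may differ between numpy versions, so both are caught here:

```python
import numpy as np
from sympy import symbols, sin, lambdify

x = symbols("x")

# Applying numpy's sin to a Symbol is expected to fail.
try:
    np.sin(x)
    numpy_sin_failed = False
except (TypeError, AttributeError):
    numpy_sin_failed = True

# Build the expression with sympy, then translate it for numeric arrays.
f_sin = lambdify(x, sin(x), "numpy")
values = f_sin(np.array([0.0, np.pi / 2]))
print(numpy_sin_failed, values)
```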