
mxnet gradient descent for linear regression, variable types error

Posted 2020-05-08 07:50

Question:

I'm trying to implement a simple gradient descent for linear regression.

It works fine when I compute the gradient manually (using the analytical expression), but now I am trying to implement it with autograd from the mxnet module.

This is the code:


import pandas as pd
from mxnet import autograd, np, npx
npx.set_np()

def main():
    # learning algorithm parameters
    nr_epochs = 1000
    alpha = 0.01

    # read data, insert column of ones (to include bias with other parameters)
    data = pd.read_csv("dataset.txt", header=0, index_col=None, sep="\s+")
    data.insert(0, "x_0", 1, True)  # insert column of "1"s as x_0
    m = data.shape[0]  # number of samples
    n = data.shape[1] - 1  # number of features
    X = data.iloc[:, 0:n].values  # array with x values
    Y = data.iloc[:, -1].values  # array with y values

    theta = np.zeros(n)  # initial parameters array
    theta.attach_grad()

    theta, J = GradientDescent(X, Y, theta, alpha, nr_epochs)



#-------------------#
#   loss function   #
#-------------------#
def LossFunction(X, Y, theta):
    m = X.shape[0]                  # number of training samples
    loss = 0

    for i in range(X.shape[0]):
        loss = loss + (1 / (2 * m)) * (H(X[i, :], theta) - Y[i]) ** 2
    return loss


#----------------#
#   hypothesis   #
#----------------#
def H(x, theta):
    return np.dot(x, theta)


#----------------------#
#   gradient descent   #
#----------------------#
def GradientDescent(X, Y, theta, alpha, nr_epochs):

    m = X.shape[0]
    n = X.shape[1]
    grad = np.zeros(n)   

    Loss = np.zeros(nr_epochs)          


    for epoch in range(nr_epochs):
        with autograd.record():
            Loss[epoch] = LossFunction(X, Y, theta)

        Loss[epoch].backward()

        for j in range(n):
            theta[j] = theta[j] - alpha * theta.grad[j]
    return theta, Loss


if __name__ == "__main__":
    main()

The problem is that I get an error when the code computes the hypothesis, in the dot product between X and theta:

return np.dot(x, theta)

The error message says: Argument a must have NDArray type, but got [ 1. -5.05358]

So, I'm assuming there must be some incompatibility between the types of x and theta. I checked them, and I got:

X -> <class 'numpy.ndarray'>
theta -> <class 'mxnet.numpy.ndarray'>

theta is created with np.zeros, so it is an mxnet array, while X is converted from the dataset with the .values method... Is this the source of the problem? Thanks!
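
For reference, here is a minimal sketch of what I think is going on (the values are just an illustration of one row of my X): mixing a plain NumPy array with an mxnet array in np.dot reproduces the same error:

import numpy
from mxnet import np, npx
npx.set_np()

x_row = numpy.array([1.0, -5.05358])  # one row of X: a plain NumPy array from .values
theta = np.zeros(2)                   # mxnet array created with mxnet's np.zeros

np.dot(x_row, theta)  # raises: Argument a must have NDArray type, but got [ 1. -5.05358]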

Answer 1:

MXNet doesn't use the NumPy ndarray, but its own mxnet NDArray, which has very similar functionality and API but a different backend; mxnet NDArray is written in C++, uses asynchronous execution, is GPU-compatible, and supports automatic differentiation. It also works on CPU, where it is usually faster than default (OpenBLAS-backed) NumPy.
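
To make the asynchronous-execution point concrete, here is a small sketch (purely illustrative, not taken from your code): operations on NDArrays return immediately, and Python only blocks when you pull a result back:

from mxnet import nd

a = nd.ones((1000, 1000))
b = nd.dot(a, a)           # returns right away; the matrix multiply runs asynchronously in the backend
print(b[0, 0].asscalar())  # copying the value into Python waits until the result is actually computed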

So to fix your error, I recommend making sure you don't use NumPy arrays in your code, but mxnet NDArrays everywhere. It is actually a very easy change because the API is very similar to NumPy's. And if need be, you can convert to and from NumPy, for example:

from mxnet import nd

# Assuming A is a NumPy ndarray and B is an mxnet NDArray

# from numpy to mxnet
mxnet_array = nd.array(A)


# from mxnet to numpy
np_array = B.asnumpy()
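
Since your own code uses the newer mxnet.numpy front end (np/npx), the equivalent conversion there is np.array(...) and .asnumpy(). Roughly, applied to your loading code (reusing your own file name and columns), it would look something like this:

import pandas as pd
from mxnet import np, npx
npx.set_np()

data = pd.read_csv("dataset.txt", header=0, index_col=None, sep="\s+")
data.insert(0, "x_0", 1, True)
n = data.shape[1] - 1

X = np.array(data.iloc[:, 0:n].values)  # convert the NumPy arrays coming out of pandas
Y = np.array(data.iloc[:, -1].values)   # into mxnet arrays right after loading

theta = np.zeros(n)
theta.attach_grad()
# np.dot(X[i, :], theta) now mixes only mxnet arrays, so the NDArray type error goes away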

Regarding your specific interest in linear regression, here are two MXNet demos in Python:

  • Linear regression in MXNet from scratch
  • Linear regression in MXNet with gluon (gluon is the name of the Python imperative frontend, a bit like what Keras is to TensorFlow; a minimal sketch of this approach is included below)
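
For illustration, here is a minimal sketch of the gluon approach (the toy data and hyperparameters below are made up, not taken from the linked demo):

from mxnet import autograd, gluon, nd

# toy data: y = 2*x0 - 3.4*x1 + 4.2 + noise, made up for illustration
X = nd.random.normal(shape=(1000, 2))
y = 2 * X[:, 0] - 3.4 * X[:, 1] + 4.2 + 0.01 * nd.random.normal(shape=(1000,))

dataset = gluon.data.ArrayDataset(X, y)
loader = gluon.data.DataLoader(dataset, batch_size=10, shuffle=True)

net = gluon.nn.Dense(1)                  # a single dense layer is exactly a linear model
net.initialize()
loss_fn = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})

for epoch in range(5):
    for data, label in loader:
        with autograd.record():
            loss = loss_fn(net(data), label)
        loss.backward()
        trainer.step(data.shape[0])      # normalizes the gradient by the batch size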

Using those NDArrays is one of the reasons MXNet is so fast: it makes your code fully asynchronous and lets the engine find optimizations. Those NDArrays are one of the things that make MXNet so great; try them and you'll love them :)