I am trying to find the fastest and most efficient way to calculate slopes using Numpy and Scipy. I have a data set of three Y variables and one X variable and I need to calculate their individual slopes. For example, I can easily do this one row at a time, as shown below, but I was hoping there was a more efficient way of doing this. I also don't think linregress is the best way to go because I don't need any of the auxiliary variables like intercept, standard error, etc. in my results. Any help is greatly appreciated.
import numpy as np
from scipy import stats
Y = np.array([[2.62710000e+11, 3.14454000e+11, 3.63609000e+11, 4.03196000e+11,
               4.21725000e+11, 2.86698000e+11, 3.32909000e+11, 4.01480000e+11,
               4.21215000e+11, 4.81202000e+11],
              [3.11612352e+03, 3.65968334e+03, 4.15442691e+03, 4.52470938e+03,
               4.65011423e+03, 3.10707392e+03, 3.54692896e+03, 4.20656404e+03,
               4.34233412e+03, 4.88462501e+03],
              [2.21536396e+01, 2.59098311e+01, 2.97401268e+01, 3.04784552e+01,
               3.13667639e+01, 2.76377113e+01, 3.27846013e+01, 3.73223417e+01,
               3.51249997e+01, 4.42563658e+01]])
X = np.array([1990., 1991., 1992., 1993., 1994., 1995., 1996., 1997., 1998., 1999.])

slope_0, intercept, r_value, p_value, std_err = stats.linregress(X, Y[0, :])
slope_1, intercept, r_value, p_value, std_err = stats.linregress(X, Y[1, :])
slope_2, intercept, r_value, p_value, std_err = stats.linregress(X, Y[2, :])

slope_0 = slope_0 / Y[0, :][0]
slope_1 = slope_1 / Y[1, :][0]
slope_2 = slope_2 / Y[2, :][0]

b, a = np.polyfit(X, Y[1, :], 1)
slope_1_a = b / Y[1, :][0]
I built upon the other answers and the original regression formula to build a function which works for any tensor. It will calculate the slopes of the data along the given axis. So, if you have arbitrary tensors
X[i,j,k,l], Y[i,j,k,l]
and you want to know the slopes for all other axes along the data in the third axis, you can call it with calcSlopes(X, Y, axis=2). As a bonus, it also works when only equally spaced y data is given; see the sketch below.
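A minimal sketch of such a function, assuming the standard least-squares slope formula applied along the chosen axis (the name calcSlopes follows the text above; the body is illustrative):

import numpy as np

def calcSlopes(x=None, y=None, axis=-1):
    # illustrative sketch: least-squares slope along one axis of a tensor
    y = np.asarray(y, dtype=float)
    if x is None:
        # only y given: assume equally spaced x values (0, 1, 2, ...)
        x = np.arange(y.shape[axis], dtype=float)
        shape = [1] * y.ndim
        shape[axis] = -1
        x = x.reshape(shape)          # broadcastable against y
    else:
        x = np.asarray(x, dtype=float)
    n = y.shape[axis]
    sx, sy = x.sum(axis=axis), y.sum(axis=axis)
    sxy = (x * y).sum(axis=axis)
    sxx = (x * x).sum(axis=axis)
    # slope = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - Sum(x)^2), per axis
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

# e.g. calcSlopes(X, Y, axis=2) for 4-D tensors X[i,j,k,l], Y[i,j,k,l]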
The fastest and the most efficient way would be to use the native scipy function linregress, which calculates everything:
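The call looks like this (the variable names are illustrative):

from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)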
And here is an example; the call will return the slope, intercept, r value, p value and standard error:
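A sketch with made-up, perfectly linear data, so the expected slope is known in advance:

import numpy as np
from scipy import stats

x = np.arange(10, dtype=float)
y = 2.0 * x + 3.0                      # made-up data with a known slope of 2
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
print(slope, intercept)                # ~2.0 and ~3.0 (up to floating point)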
P.S. Just a mathematical formula for slope:
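In the usual least-squares form, with n data points:

slope = (n * Σ(xy) - Σx * Σy) / (n * Σ(x^2) - (Σx)^2)

or, equivalently, slope = cov(x, y) / var(x).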
With X and Y defined the same way as in your question, you can use:
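A sketch of the idea, assuming X and Y are the numpy arrays from the question:

import numpy as np

# shift every column one step to the left and subtract, then drop the
# wrap-around column (first observation minus last observation)
dY = (np.roll(Y, -1, axis=1) - Y)[:, :-1]
dX = (np.roll(X, -1) - X)[:-1]
slopes = dY / dX            # one slope per consecutive pair of observations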
numpy.roll() helps you align the next observation with the current one; you just need to remove the last column, which is the unwanted wrap-around difference between the first and last observations. Then you can calculate all slopes at once, without scipy.
In your example, dX is always 1, so you can save more time by computing slopes = dY.

A representation that's simpler than the accepted answer:
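A minimal sketch of that representation, assuming 1-D x and y (center both vectors, then take the ratio of dot products):

import numpy as np

def slope(x, y):
    # slope of the least-squares line through (x, y) in vector notation
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm = x - x.mean()
    ym = y - y.mean()
    return xm.dot(ym) / xm.dot(xm)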
The equation for the slope comes from Vector notation for the slope of a line using simple regression.
As said before, you can use scipy's linregress. Here is how to get just the slope out:
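For instance (indexing with [0] picks the slope out of the returned tuple; newer scipy versions also expose a .slope attribute):

from scipy import stats

slope = stats.linregress(X, Y[0, :])[0]      # or stats.linregress(X, Y[0, :]).slope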
Keep in mind that doing it this way, since you are computing extra values like r_value and p_value, will take longer than calculating only the slope manually. However, linregress is pretty quick.
Source: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html
The way I did it is using the np.diff() function:
dx = np.diff(xvals)              # differences between consecutive x values
dy = np.diff(yvals)              # differences between consecutive y values
slopes = dy / dx                 # point-to-point slopes, one per interval