I have one list of 100 numbers as height for Y axis, and as length for X axis: 1 to 100 with a constant step of 5. I need to calculate the Area that it is included by the curve of the (x,y) points, and the X axis, using rectangles and Scipy. Do I have to find the function of this curve? or not? ... almost all the examples I have read are about a specific equation for the Y axis. In my case there is no equation, just data from a list. The classic solution is to add or the Y points and multiple by the step X distance... using Scipy any idea?
Please, can anyone recommend any book which focusing on numerical (finite elementary) methods, using Scipy and Numpy? ...
The numpy and scipy libraries include the composite trapezoidal (numpy.trapz) and Simpson's (scipy.integrate.simps) rules.
Here's a simple example. In both trapz
and simps
, the argument dx=5
indicates that the spacing of the data along the x axis is 5 units.
from __future__ import print_function
import numpy as np
from scipy.integrate import simps
from numpy import trapz
# The y values. A numpy array is used here,
# but a python list could also be used.
y = np.array([5, 20, 4, 18, 19, 18, 7, 4])
# Compute the area using the composite trapezoidal rule.
area = trapz(y, dx=5)
print("area =", area)
# Compute the area using the composite Simpson's rule.
area = simps(y, dx=5)
print("area =", area)
Output:
area = 452.5
area = 460.0
You can use Simpsons rule or the Trapezium rule to calculate the area under a graph given a table of y-values at a regular interval.
Python script that calculates Simpsons rule:
def integrate(y_vals, h):
i = 1
total = y_vals[0] + y_vals[-1]
for y in y_vals[1:-1]:
if i % 2 == 0:
total += 2 * y
else:
total += 4 * y
i += 1
return total * (h / 3.0)
h
is the offset (or gap) between y values, and y_vals
is an array of well, y values.
Example (In same file as above function):
y_values = [13, 45.3, 12, 1, 476, 0]
interval = 1.2
area = integrate(y_values, interval)
print("The area is", area)
If you have sklearn isntalled, a simple alternative is to use sklearn.metrics.auc
This computes the area under the curve using the trapezoidal rule given arbitrary x, and y array
import numpy as np
from sklearn.metrics import auc
dx = 5
xx = np.arange(1,100,dx)
yy = np.arange(1,100,dx)
print('computed AUC using sklearn.metrics.auc: {}'.format(auc(xx,yy)))
print('computed AUC using np.trapz: {}'.format(np.trapz(yy, dx = dx)))
both output the same area: 4607.5
the advantage of sklearn.metrics.auc is that it can accept arbitrarily-spaced 'x' array, just make sure it is ascending otherwise the results will be incorrect