Fitting binned lognormal data in Python

I have a range of particle size distribution data arranged by percentage volume fraction, like so:

size   %
6.68   0.05
9.92   1.15
etc.

I need to fit this data to a lognormal distribution, which I planned to do using SciPy's stats.lognorm.fit function. From what I've read, though, it seems to expect the input as an array of variates rather than binned data.

I was planning to use a for loop to iterate through the data and .extend each size entry onto a placeholder array the required number of times, producing an array of variates that corresponds to the binned data.

This seems really ugly and inefficient though, and it's the kind of thing there's probably an easy way to do. Is there a way to input binned data into the stats.lognorm.fit function?

Tags: python curve-fitting curve

1 Answer

Anthone

#2 · 2020-03-07 10:30

I guess one possible workaround is to fit a pdf to your binned data manually: take the midpoint of each interval as the x value and the corresponding bin frequency (normalized to a density) as the y value, then fit a curve to those points using scipy.optimize.curve_fit. I think the accuracy of the results will depend on the number of bins you have. An example is shown below:

import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np

def pdf(x, mu, sigma):
    """pdf of lognormal distribution"""

    return (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)))

mu, sigma = 3., 1.                              # actual parameter value

data = np.random.lognormal(mu, sigma, size=1000)       # data generation
h = plt.hist(data, bins=30, density=True)       # density=True replaces normed=True, removed in Matplotlib 3.1

y = h[0]                                        # frequencies for each bin, this is y value to fit 
xs = h[1]                                       # boundaries for each bin
delta = xs[1] - xs[0]                           # width of bins
x = xs[:-1] + delta / 2                         # midpoints of bins, this is x value to fit

popt, pcov = curve_fit(pdf, x, y, p0=[1, 1])    # data fitting, popt contains the fitted parameters
print(popt)
# [ 3.13048122  1.01360758]                       fitting results

fig, ax = plt.subplots()
ax.hist(data, bins=30, density=True, align='mid', label='Histogram')
xr = np.linspace(min(xs), max(xs), 10000)
yr = pdf(xr, mu, sigma)
yf = pdf(xr, *popt)
ax.plot(xr, yr, label="Actual")
ax.plot(xr, yf, linestyle = 'dashed', label="Fitted")
ax.legend()
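
For data that is already binned as (size, %) pairs, as in the question, a simpler alternative may be a weighted estimate on the log of the sizes. This is only a sketch, not part of the answer above: it assumes each listed size stands for its whole bin, that the percentages act as weights, and that the distribution has no location offset. The function name fit_lognormal_binned is hypothetical, and only the first two data rows come from the question; the rest are made up for illustration.

```python
import numpy as np

# Binned data in the question's layout: size, volume percentage.
# Only the first two rows are from the question; the others are invented.
sizes = np.array([6.68, 9.92, 14.7, 21.8, 32.4, 48.1, 71.4])
percent = np.array([0.05, 1.15, 9.8, 30.0, 38.0, 18.0, 3.0])


def fit_lognormal_binned(sizes, percent):
    """Estimate (mu, sigma) of a lognormal from binned data.

    Each bin is treated as a point mass at its size, weighted by its
    percentage. This gives the same numbers as repeating every size in
    proportion to its percentage and taking the mean and standard
    deviation of the logs, without building the large array.
    """
    w = np.asarray(percent, dtype=float)
    w = w / w.sum()                                   # normalize weights to sum to 1
    log_x = np.log(np.asarray(sizes, dtype=float))
    mu = np.sum(w * log_x)                            # weighted mean of log(size)
    sigma = np.sqrt(np.sum(w * (log_x - mu) ** 2))    # weighted std of log(size)
    return mu, sigma


mu_hat, sigma_hat = fit_lognormal_binned(sizes, percent)
print(mu_hat, sigma_hat)
```

In scipy.stats.lognorm terms these correspond to the shape s = sigma and scale = exp(mu), so the result should agree closely with stats.lognorm.fit applied to the expanded array with floc=0. Because the weights are volume fractions, the fitted parameters describe the volume-weighted distribution rather than a number-weighted one.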
