I have a range of particle size distribution data arranged by percentage volume fraction, like so:
size %
6.68 0.05
9.92 1.15
etc.
I need to fit this data to a lognormal distribution, which I planned to do using SciPy's scipy.stats.lognorm.fit
function, but from what I've read it seems to expect an array of variates as input rather than binned data.
I was planning to use a for loop to iterate through the data and .extend
a placeholder list with each size entry the required number of times, producing an array of variates that corresponds to the binned data.
This seems really ugly and inefficient though, and like the kind of thing there's probably an easy way to do. Is there a way to pass binned data directly to the stats.lognorm.fit
function?
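For reference, the expansion described above can be written without an explicit loop. This is only a minimal sketch: the binned data is synthetic (generated from a known lognormal so the result can be checked, not the real measurements), and the 0.01 % resolution used to turn percentages into integer counts is an arbitrary assumption.

```python
import numpy as np
from scipy import stats

# Synthetic binned data (an assumption for illustration only):
# log-spaced sizes with percentages taken from a lognormal with sigma=0.5, median=20.
sizes = np.geomspace(2.0, 200.0, 40)
# For log-spaced bins the bin width is proportional to x, so fraction ~ pdf(x) * x.
weights = stats.lognorm.pdf(sizes, s=0.5, scale=20.0) * sizes
percent = 100.0 * weights / weights.sum()

# Convert percentages to integer repeat counts (0.01 % resolution, arbitrary choice).
counts = np.rint(percent * 100).astype(int)

# Repeat each size by its count; this replaces the loop-and-extend approach.
samples = np.repeat(sizes, counts)

# Fit with the location fixed at zero (the usual two-parameter lognormal).
shape, loc, scale = stats.lognorm.fit(samples, floc=0)
print(f"sigma = {shape:.3f}, median = {scale:.2f}")
```

The rounding step discards any bin below half the chosen resolution, so a finer resolution gives a more faithful (but longer) sample array.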
One possible workaround is to fit a PDF to the binned data manually: take the x values as the midpoint of each interval and the y values as the corresponding bin frequency, then fit a curve to those x and y values using
scipy.optimize.curve_fit
. I think the accuracy of the results will depend on the number of bins you have. An example is shown below:
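A minimal sketch of this approach, not a definitive implementation. The bin edges and percentages are synthetic assumptions, generated from a known lognormal so the fit can be checked, and `lognorm_pdf` is a helper name introduced here. Because the bins are unequal in width, each percentage is divided by its bin width to turn it into a density before fitting.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def lognorm_pdf(x, sigma, scale):
    """Two-parameter lognormal PDF (location fixed at zero)."""
    return stats.lognorm.pdf(x, s=sigma, loc=0, scale=scale)

# Synthetic binned data (an assumption for illustration only):
# 40 log-spaced bins filled from a lognormal with sigma=0.5, median=20.
edges = np.geomspace(2.0, 200.0, 41)
percent = 100.0 * np.diff(stats.lognorm.cdf(edges, s=0.5, scale=20.0))

# x: arithmetic midpoint of each bin; y: fraction per unit size (a density).
mids = 0.5 * (edges[:-1] + edges[1:])
density = (percent / 100.0) / np.diff(edges)

# Start from a generic sigma and the midpoint of the fullest bin.
p0 = [1.0, mids[np.argmax(percent)]]
popt, pcov = curve_fit(
    lognorm_pdf, mids, density, p0=p0,
    bounds=([1e-6, 1e-6], [np.inf, np.inf]),  # keep both parameters positive
)
sigma_fit, scale_fit = popt
print(f"sigma = {sigma_fit:.3f}, median = {scale_fit:.2f}")
```

If only bin centres are available, as in the two-column data in the question, the bin widths would have to be estimated from the spacing between neighbouring sizes, which is a further assumption.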