Python: Histogram with area normalized to something

Posted 2019-02-09 11:11

Question:

Is there a way to tell matplotlib to "normalize" a histogram such that its area equals a specified value (other than 1)?

The option "normed = 0" in

n, bins, patches = plt.hist(x, 50, normed=0, histtype='stepfilled')

just brings it back to a frequency distribution.

Answer 1:

Compute the histogram yourself with np.histogram, scale it to whatever value you like, then plot it with bar.

On a side note, this will normalize things such that the area of all the bars is normed_value. The raw sum will not be normed_value (though it's easy to have that be the case, if you'd like).

E.g.

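import numpy as np
import matplotlib.pyplot as plt

x = np.random.random(100)
normed_value = 2

# density=True scales the bar heights so that the total area is 1
hist, bins = np.histogram(x, bins=20, density=True)
widths = np.diff(bins)
hist *= normed_value  # rescale so that the total area is normed_value

# align='edge' places each bar's left edge at the left edge of its bin
plt.bar(bins[:-1], hist, widths, align='edge')
plt.show()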

So in this case, integrating over the bins (summing each height multiplied by its width) gives 2.0 instead of 1.0; that is, (hist * widths).sum() yields 2.0.
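To make the side note above concrete, here is a minimal sketch that computes both normalizations from the same raw counts: one where the area equals normed_value and one where the raw sum of the bar heights does. The names area_heights and sum_heights and the fixed random seed are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, just to make the sketch repeatable
x = rng.random(100)
normed_value = 2

# Raw counts per bin, with no normalization applied yet
counts, bins = np.histogram(x, bins=20)
widths = np.diff(bins)

# Area-normalized: sum(height * width) equals normed_value
area_heights = counts / (counts.sum() * widths) * normed_value

# Sum-normalized: sum(height) equals normed_value
sum_heights = counts / counts.sum() * normed_value

area_total = (area_heights * widths).sum()
sum_total = sum_heights.sum()
```

Either array can be drawn the same way as before, for example with plt.bar(bins[:-1], sum_heights, widths, align='edge').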



Answer 2:

You can pass a weights argument to hist instead of using normed. For example, if your bins cover the interval [minval, maxval], you have n bins, and you want to normalize the area to A, then I think

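# Same weight for every sample: A / (x.size * bin_width), where bin_width = (maxval - minval) / n.
# np.full yields float weights even if x holds integers.
weights = np.full(x.shape, A * n / (maxval - minval) / x.size)
plt.hist(x, bins=n, range=(minval, maxval), weights=weights)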

should do the trick.

EDIT: The weights argument must be the same size as x, and its effect is to make each value in x contribute the corresponding value in weights towards the bin count, instead of 1.
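As a rough check of the weights recipe, the sketch below runs it through np.histogram, which accepts the same bins, range, and weights arguments, and then sums height times width. The sample data and the values chosen for A, n, minval, and maxval are assumptions made for the example; every sample lies inside [minval, maxval].

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000)           # illustrative data, all inside [0, 1)
A = 5.0                        # target area (arbitrary choice)
n = 25                         # number of bins (arbitrary choice)
minval, maxval = 0.0, 1.0

# Each sample contributes A / (x.size * bin_width) to its bin
bin_width = (maxval - minval) / n
weights = np.full(x.shape, A / (x.size * bin_width))

heights, edges = np.histogram(x, bins=n, range=(minval, maxval), weights=weights)

# Integrate the histogram: sum of height * width over all bins
area = (heights * np.diff(edges)).sum()
```

With all samples inside the range, area comes out equal to A up to floating-point rounding.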

I think the hist function could probably do with a greater ability to control normalization, though. For example, I think as it stands, values outside the binned range are ignored when normalizing, which isn't generally what you want.
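The point about out-of-range values can be seen in a small sketch, again using np.histogram with made-up data in which one of five samples falls outside the binned range. With density=True the outlier is ignored and the in-range bars still integrate to 1, whereas the weights recipe above divides by x.size (which counts the outlier) and so ends up with a smaller area.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.3, 0.4, 5.0])  # 5.0 lies outside the binned range
minval, maxval, n = 0.0, 1.0, 4

# density=True normalizes using only the samples that fall inside the range
density, edges = np.histogram(x, bins=n, range=(minval, maxval), density=True)
widths = np.diff(edges)
density_area = (density * widths).sum()    # 1.0: the outlier is ignored

# The weights recipe divides by x.size, so the outlier still counts in the denominator
A = 1.0
weights = np.full(x.shape, A * n / (maxval - minval) / x.size)
weighted, _ = np.histogram(x, bins=n, range=(minval, maxval), weights=weights)
weighted_area = (weighted * widths).sum()  # 0.8: four of the five samples are in range
```

Which of the two behaviours is right depends on whether the out-of-range samples should count toward the total; replacing x.size with the number of in-range samples would make the weights version match density=True.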