I know how to create a histogram (just use "with boxes") in gnuplot if my .dat file already has properly binned data. Is there a way to take a list of numbers and have gnuplot provide a histogram based on ranges and bin sizes the user provides?
I have found this discussion extremely useful, but I have experienced some "rounding off" problems.
More precisely, using a binwidth of 0.05, I have noticed that, with the techniques presented above, data points which read 0.1 and 0.15 fall in the same bin. This obviously unwanted behaviour is most likely due to the "floor" function, which receives a floating-point quotient that can be slightly smaller than the exact result.
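The rounding effect described above can be checked directly at the gnuplot prompt. This is only a minimal sketch to illustrate the problem, not the fix proposed in this answer; the binwidth of 0.05 and the two data values are the ones quoted in the text.

```gnuplot
# Illustration of the rounding effect; binwidth = 0.05 is taken from the text.
binwidth = 0.05

# Print the quotients with full precision to see what floor() actually receives.
print sprintf("0.10/binwidth = %.17g", 0.10/binwidth)
print sprintf("0.15/binwidth = %.17g", 0.15/binwidth)

# Bin indices as computed by floor(). If the second quotient is stored
# slightly below 3, both values end up with the same bin index.
print floor(0.10/binwidth)
print floor(0.15/binwidth)
```

If the two floor() results are equal, the two data points are being counted in the same bin even though they are one binwidth apart.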
Hereafter is my small contribution to try to circumvent this.
This recursive method is for x >=0; one could generalise this with more conditional statements to obtain something even more general.
Do you want to plot a graph like this one? If so, have a look at my blog article: http://gnuplot-surprising.blogspot.com/2011/09/statistic-analysis-and-histogram.html
Key lines from the code:
Be very careful: all of the answers on this page are implicitly taking the decision of where the binning starts - the left-hand edge of the left-most bin, if you like - out of the user's hands. If the user is combining any of these functions for binning data with his/her own decision about where binning starts (as is done on the blog which is linked to above) the functions above are all incorrect. With an arbitrary starting point for binning 'Min', the correct function is:
You can see why this is correct sequentially (it helps to draw a few bins and a point somewhere in one of them). Subtract Min from your data point to see how far into the binning range it is. Then divide by binwidth so that you're effectively working in units of 'bins'. Then 'floor' the result to go to the left-hand edge of that bin, add 0.5 to go to the middle of the bin, multiply by the width so that you're no longer working in units of bins but in an absolute scale again, then finally add back on the Min offset you subtracted at the start.
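The steps described above can be written as a single gnuplot function. The following is a sketch under stated assumptions: the values of Min and binwidth and the file name 'data.dat' are placeholders chosen for illustration, and the plot line relies on gnuplot's "smooth freq" option to do the counting.

```gnuplot
# Hypothetical settings: adjust Min, binwidth and the file name to your data.
Min = 0.0          # left-hand edge of the left-most bin
binwidth = 0.5     # width of each bin

# Steps, in order: shift by Min, convert to units of bins, floor to the
# left-hand edge, move to the bin centre, rescale, shift back by Min.
bin(x) = binwidth*(floor((x - Min)/binwidth) + 0.5) + Min

set boxwidth binwidth
set style fill solid 0.5
plot 'data.dat' using (bin($1)):(1.0) smooth freq with boxes notitle
```

Because bin(x) returns the centre of each bin, the boxes drawn by "with boxes" line up with the actual bin edges at Min + n*binwidth.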
Consider this function in action:
e.g. the value 1.1 truly falls in the left bin:
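As a small worked example, the function can be evaluated by hand with print. The values of Min, Max and n below are assumptions picked for illustration (two bins of width 1 starting at 0.25); they are not taken from the text.

```gnuplot
# Assumed example values, chosen only for illustration.
Min = 0.25              # binning starts here
Max = 2.25              # binning ends here
n = 2                   # number of bins
width = (Max - Min)/n   # bin width, evaluates to 1.0

bin(x) = width*(floor((x - Min)/width) + 0.5) + Min

print bin(1.1)   # centre of the left bin [0.25, 1.25): 0.75
print bin(1.3)   # centre of the right bin [1.25, 2.25): 1.75

# For comparison, bin functions that ignore the Min offset place 1.1 elsewhere:
print width*floor(1.1/width)               # 1.0
print width*floor(1.1/width) + width/2.0   # 1.5
```

With these settings the left bin covers 0.25 to 1.25, so 1.1 belongs at its centre, 0.75; the two offset-free variants return values that do not correspond to the centre of either bin.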
Born2Smile's answer is only correct if the bin boundaries occur at (n+0.5)*binwidth (where n runs over integers). mas90's answer is only correct if the bin boundaries occur at n*binwidth.
As usual, Gnuplot is a fantastic tool for plotting sweet looking graphs and it can be made to perform all sorts of calculations. However, it is intended to plot data rather than to serve as a calculator, and it is often easier to use an external programme (e.g. Octave) to do the more "complicated" calculations, save this data in a file, then use Gnuplot to produce the graph. For the above problem, check out the "hist" function in Octave using [freq,bins]=hist(data), then plot this in Gnuplot.
We do not need to use a recursive method; it may be slow. My solution is to use a user-defined function rint instead of the intrinsic function int or floor.
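One way such a function could be written is sketched below. This is an assumed definition, not necessarily the one used in this answer: the name rint comes from the text, but the tolerance 0.9999 is an illustrative choice and the function is only meant for x >= 0.

```gnuplot
# Assumed definition: the tolerance 0.9999 is an illustrative choice.
# Valid for x >= 0: if x is just below the next integer, round up,
# otherwise truncate like int().
rint(x) = (x - int(x) > 0.9999) ? int(x) + 1 : int(x)

# Hypothetical bin function built on rint instead of floor.
binwidth = 0.0001
bin(x) = binwidth*rint(x/binwidth)

# Compare the tolerant version with plain truncation.
print rint(0.0003/0.0001)
print int(0.0003/0.0001)
```

The trade-off of a fixed tolerance is that a value lying genuinely within 0.0001 bin widths of the next edge is also moved up a bin, so the tolerance should be small compared with the precision of the data.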
This function will give rint(0.0003/0.0001)=3, while int(0.0003/0.0001)=floor(0.0003/0.0001)=2. Why? The quotient is computed in floating point and comes out slightly below 3; please look at "Perl int function and padding zeros".
Yes, and it's quick and simple, though very hidden:
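A minimal sketch of the idea, assuming a file 'datafile.dat' with one value per line in its first column; the file name, the binwidth of 5 and this particular bin function (bin edges at multiples of the width) are assumptions made for illustration.

```gnuplot
# Hypothetical bin width and data file.
binwidth = 5

# Map each value to the centre of its bin (bin edges at n*width).
bin(x, width) = width*floor(x/width) + width/2.0

set boxwidth binwidth

# Every data point contributes y = 1.0; "smooth freq" sums the y-values of
# all points that share the same x, which yields the count per bin.
plot 'datafile.dat' using (bin($1, binwidth)):(1.0) smooth freq with boxes
```

The counting is done entirely by "smooth freq", so the data file does not need to be pre-binned.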
Check out "help smooth freq" to see why the above makes a histogram.
To deal with ranges, just set the xrange variable.