Histogram using gnuplot?

2019-01-01 14:35发布

I know how to create a histogram (just use "with boxes") in gnuplot if my .dat file already has properly binned data. Is there a way to take a list of numbers and have gnuplot provide a histogram based on ranges and bin sizes the user provides?

9条回答
看风景的人
2楼-- · 2019-01-01 14:55

With respect to binning functions, I didn't expect the result of the functions offered so far. Namely, if my binwidth is 0.001, these functions were centering the bins on 0.0005 points, whereas I feel it's more intuitive to have the bins centered on 0.001 boundaries.

In other words, I'd like to have

Bin 0.001 contain data from 0.0005 to 0.0014
Bin 0.002 contain data from 0.0015 to 0.0024
...

The binning function I came up with is

my_bin(x,width)     = width*(floor(x/width+0.5))

Here's a script to compare some of the offered bin functions to this one:

rint(x) = (x-int(x)>0.9999)?int(x)+1:int(x)
bin(x,width)        = width*rint(x/width) + width/2.0
binc(x,width)       = width*(int(x/width)+0.5)
mitar_bin(x,width)  = width*floor(x/width) + width/2.0
my_bin(x,width)     = width*(floor(x/width+0.5))

binwidth = 0.001

data_list = "-0.1386 -0.1383 -0.1375 -0.0015 -0.0005 0.0005 0.0015 0.1375 0.1383 0.1386"

my_line = sprintf("%7s  %7s  %7s  %7s  %7s","data","bin()","binc()","mitar()","my_bin()")
print my_line
do for [i in data_list] {
    iN = i + 0
    my_line = sprintf("%+.4f  %+.4f  %+.4f  %+.4f  %+.4f",iN,bin(iN,binwidth),binc(iN,binwidth),mitar_bin(iN,binwidth),my_bin(iN,binwidth))
    print my_line
}

and here's the output

   data    bin()   binc()  mitar()  my_bin()
-0.1386  -0.1375  -0.1375  -0.1385  -0.1390
-0.1383  -0.1375  -0.1375  -0.1385  -0.1380
-0.1375  -0.1365  -0.1365  -0.1375  -0.1380
-0.0015  -0.0005  -0.0005  -0.0015  -0.0010
-0.0005  +0.0005  +0.0005  -0.0005  +0.0000
+0.0005  +0.0005  +0.0005  +0.0005  +0.0010
+0.0015  +0.0015  +0.0015  +0.0015  +0.0020
+0.1375  +0.1375  +0.1375  +0.1375  +0.1380
+0.1383  +0.1385  +0.1385  +0.1385  +0.1380
+0.1386  +0.1385  +0.1385  +0.1385  +0.1390
查看更多
若你有天会懂
3楼-- · 2019-01-01 14:59

I have a couple corrections/additions to Born2Smile's very useful answer:

  1. Empty bins caused the box for the adjacent bin to incorrectly extend into its space; avoid this using set boxwidth binwidth
  2. In Born2Smile's version, bins are rendered as centered on their lower bound. Strictly they ought to extend from the lower bound to the upper bound. This can be corrected by modifying the bin function: bin(x,width)=width*floor(x/width) + width/2.0
查看更多
只若初见
4楼-- · 2019-01-01 15:00

I have a little modification to Born2Smile's solution.

I know that doesn't make much sense, but you may want it just in case. If your data is integer and you need a float bin size (maybe for comparison with another set of data, or plot density in finer grid), you will need to add a random number between 0 and 1 inside floor. Otherwise, there will be spikes due to round up error. floor(x/width+0.5) will not do because it will create pattern that's not true to original data.

binwidth=0.3
bin(x,width)=width*floor(x/width+rand(0))
查看更多
登录 后发表回答