I spent more than two months with RRDTOOL to find out how to store and visualize data on graph. I'm very close now to my goal, but for some reason I don't understand why it is happening that some data are considered to be NaN in my case.
I counting lines in gigabytes sized of log files and have feeding the result to an rrd database to visualize events occurrence. The stepping of the database is 60 seconds, the data is inserted in seconds base whenever it is available, so no guarantee the the next timestamp will be withing the heartbeat or within the stepping. Sometimes no data for minutes.
If have such big distance mostly my data is considered to be NaN.
b1_5D.rrd 1420068436:1 1420069461:1 1420073558:1 1420074583:1 1420076632:1 1420077656:1 1420079707:1 1420080732:1 1420082782:1 1420083807:1 1420086881:1 1420087907:1 1420089959:1 1420090983:1 1420094055:1 1420095080:1 1420097132:1 1420098158:1 1420103284:1 1420104308:1 1420107380:1 1420108403:1 1420117622:1 1420118646:1 1420121717:1 1420122743:1 1420124792:1 1420125815:1 1420131960:1 1420134007:1 1420147326:1 1420148352:1
rrdtool create b1_4A.rrd --start 1420066799 --step 60 DS:Value:GAUGE:120:0:U RRA:AVERAGE:0.5:1:1440 RRA:AVERAGE:0.5:10:1008 RRA:AVERAGE:0.5:30:1440 RRA:AVERAGE:0.5:360:1460
The above gives me an empty graph for the input above. If I extend the heart beat, than it will fill the time gaps with the same data. I've tried to insert zero values, but that will average out the counts and bring results in mils.
Maybe I taking something wrong regarding RRDTool. It would be great if someone could explain what I doing wrong.
Thank you.
It sounds as if your data - which is event-based at irregular timings - is not suitable for an RRD structure. RRD prefers to have its data at constant, regular intervals, and will coerce the incoming data to match its requirements.
Your RRD is defined to have a 60s step, and a 120s heartbeat. This means that it expects one sample every 60s, and no further apart than 120s.
Your DS is a gauge, and so the values you enter (all of them '1' in your example) will be the values stored, after any time normalisation.
If you increase the heartbeat, then a value received within this time will be used to make a linear approximation to fill in all samples since the last one. This is why doing so fills the gaps with the same data.
Since your step is 60s, the smallest sample time sidth will be 1 minute.
Since you are always storing '1's, your graph will therefore either show '1' (when the sample was received in the heartbeart window) or Unknown (when the heartbeat expired).
In other words, your graph is showing exactly what you gave it. You data are being coerced into a regular set of numerical values at a 1-minute step, each being 1 or Unknown.