My SQL is a bit rusty and I'm having quite a bit of difficulty with this problem. Suppose I have a table with a Timestamp column and a Number column. The goal is to return a result set containing the average value for some arbitrarily chosen regular interval.
So, for example, if I had the following initial data, the resulting output with a 5 minute interval would be as follows:
time value
------------------------------- -----
06-JUN-12 12.40.00.000000000 PM 2
06-JUN-12 12.41.35.000000000 PM 3
06-JUN-12 12.43.22.000000000 PM 4
06-JUN-12 12.47.55.000000000 PM 5
06-JUN-12 12.52.00.000000000 PM 2
06-JUN-12 12.54.59.000000000 PM 3
06-JUN-12 12.56.01.000000000 PM 4
OUTPUT:
start_time avg_value
------------------------------- ---------
06-JUN-12 12.40.00.000000000 PM 3
06-JUN-12 12.45.00.000000000 PM 5
06-JUN-12 12.50.00.000000000 PM 2.5
06-JUN-12 12.55.00.000000000 PM 4
Note that this is an Oracle database, so Oracle-specific solutions would work fine. This could, of course, be done with a stored procedure but I was hoping to accomplish the task in a single query.
Justin's and Sebas' answers can be extended with a LEFT JOIN to eliminate "gaps", which is often desirable.
If that's not necessary, as an alternative, we can go old school Oracle DATE arithmetic...
Let's unpack that a bit. We can separate the date and time components, using TRUNC to get the date portion, and using a TO_CHAR to return the number of seconds since midnight. We know 5 minutes is 300 seconds, and we know there are 86400 seconds in a day. So we can divide the number of seconds by 300, and take the FLOOR of that (just the integer portion), which rounds us down to the nearest 5 minute boundary. We multiply that back (by 300), to get seconds again, and then divide that by the number of seconds in a day (86400), and we can add that back to the (truncated) date portion.
Painful, yes. But blazingly fast.
NOTE: this returns the rounded time value as a
DATE
, this could be cast back to a timestamp if needed, but for even 5 minute boundaries, aDATE
has sufficient resolution.As a benefit of this approach, for a large table, we can boost performance of the query by adding a covering index for this query:
ADDENDUM:
MiMo provided an answer for SQL Server, suggesting that it would be adaptable for Oracle. Here is an adaptation of that approach in Oracle. Note that Oracle does not provide equivalents for the DATEDIFF and DATEADD functions. Oracle uses simple arithmetic instead.
The choice of Jan 1, 0001 A.D. as a base date is arbitrary, but I didn't want to mess with negative values, and figuring out if FLOOR would be right, or whether we would need to use CEIL with negative numbers. (The magic number 288 is a result of 1440 minutes in a day divided by 5). In this case, we are taking the fractional day, multiplying by 1440 and dividing by 5, and taking integer portion of that, and then putting it back to fractional days.
It is tempting to pull that "base date" from a PL/SQL package, or get it from a subquery, but doing either of those might prevent this expression from being deterministic. And we'd really like to keep open the option of creating a function based index.
My preference is to avoid the need for including a "base date" in the calculation.
This is a solution for SQL Server:
The start date is arbitrary. The idea is that you compute the number of minutes from the start date, and then group by this number divided by the interval.
It should be adaptable to Oracle easily using the equivalent for
DATEADD
andDATEDIFF
fiddle: http://sqlfiddle.com/#!4/9e314/11
edit: beated by Justin, as usual... :-)
Something like
should work. Rather than generating 10 intervals and hard-coding the lowest interval, you can enhance the query to derive the starting point and the number of rows from the
MIN(time)
andMAX(time)
in the table.