I am trying to return data in fifteen minute intervals. The first thing I thought to do was this:
select * from myTable where DATEPART(minute, Timestamp) % 15 = 0
But there are two problems with this approach. The first is that there will not necessarily always be data with a timestamp at a given minute, the other is that sometimes there are multiple data points at a given minute with different second values. I want to have exactly one row for each fifteen minute group, at :00, :15, :30, etc.
This data is only recorded when something changes, so if I don't have a data point at 12:30, for example, I could take the closest data point before that and use that value for 12:30 and it would be correct.
So basically I need to be able to return timestamps at exactly :00, :30, etc along with the data from the record closest to that time.
The data could span years but is more likely to be a shorter amount of time, days or weeks. This is what the expected output would look like:
Timestamp Value
1/1/2015 12:30:00 25
1/1/2015 12:45:00 41
1/1/2015 1:00:00 45
I'm having trouble thinking of a way to do this in SQL. Is it possible?
Like Shnugo mention, you can use a tally table to get your data in an interval of 15 minutes, something like this.
I am creating a dynamic tally table using CTE however you can even use a physical calendar table as per your needs.
OUTPUT
You can use next query to grouping data by 15 min intervals:
With account of date and hour:
Given a fixed start time, all you would need is a table of numbers to add your intervals to. If you don't already have a table of numbers (which are useful) then a quick way to generate one on the fly is
This simply generates a sequence from 1 to 10,000. For more reading on this see the following series:
Then once you have your numbers you can generate your intervals:
Which gives something like:
Then you just need to find the closest value on or before each interval using
APPLY
andTOP
:'Edit
One point to note is that in the case of having two equal timestamps that are both on or closest to an interval, I have applied a secondary level of ordering by
Value
:This is arbitrary and could be anything you chose, it would be advisable to ensure that you order by enough items to ensure the results are deterministic, that is to say, if you ran the query on the same data many times the same results would be returned because there is only one row that satisfies the criteria. If you had two rows like this:
If you just order by timestamp, for the interval
2015-07-14 14:00
, you don't know whether you will get a value of 50 or 100, and it could be different between executions depending on statistics and the execution plan. Similarly if you order byTimestamp
andValue
, then you don't know whetherField1
will be 1 or 2.Now you really have enough ways to create your tally table :-)
Very tricky, but something along these lines may work:
Of course this will not work if there are two readings with the exact same timestamp, in that case you need yet another group by. Messy querying no matter what.
The question is missing original data and schema information, so I'll address the question mainly in general form.
You're looking for results in a range that won't have any missing records, covering data that can have missing records. Given that requirement, the normal solution is to create a projection for just the values you need on the left hand side, using a source like a Numbers table that has nothing to do with your actual data. The Numbers table will be guaranteed not to be missing any records in your range. For date projections, you just add the appropriate number of days or minutes to your starting value, for the number of records you expect in the results.
Once you have the projection, you make an OUTER JOIN from the projection against your actual data. In this case, the JOIN is complicated by the fact that you have some date values extra records. I know of two ways to address this problem. One way is to GROUP BY the values in the projection. The other is to use an
OUTER APPLY
instead of a join. With an OUTER APPLY, you can just use TOP 1 filter on the applied query to limit results to one item.In summary, here is some psuedo-code that should help you get to where you need to be: