I have some phone call data in a mssql 2008 database and would like to split it in to 15 (or X) minute intervals to be used in some Erlang calculations.
call logg:
call start end
1 2011-01-01 12:00:01 2011-01-01 12:16:00
2 2011-01-01 12:14:00 2011-01-01 12:17:30
3 2011-01-01 12:29:30 2011-01-01 12:46:20
Would be shown as
call start end
1 2011-01-01 12:00:01 2011-01-01 12:15:00
1 2011-01-01 12:15:00 2011-01-01 12:16:00
2 2011-01-01 12:14:00 2011-01-01 12:15:00
2 2011-01-01 12:15:00 2011-01-01 12:17:30
3 2011-01-01 12:29:30 2011-01-01 12:30:00
3 2011-01-01 12:30:00 2011-01-01 12:45:00
3 2011-01-01 12:45:00 2011-01-01 12:46:20
Does anyone have any good suggestions on how to do this?
Thanks in advance
Sample table
create table CallLogTable (call int, start datetime, [end] datetime)
insert CallLogTable select
1, '2011-01-01 12:00:01', '2011-01-01 12:16:00' union all select
2, '2011-01-01 12:14:00', '2011-01-01 12:17:30' union all select
3, '2011-01-01 12:29:30', '2011-01-01 12:46:20'
The query
select
call,
case when st < start then start else st end [start],
case when et > [end] then [end] else et end [end]
from (select *,
xstart = dateadd(mi, 15*(datediff(mi, 0, d.start)/15), 0),
blocks = datediff(mi, d.[start], d.[end])/15+2
from CallLogTable d) d
cross apply (
select
st = dateadd(mi,v.number*15,xstart),
et = dateadd(mi,v.number*15+15,xstart)
from master..spt_values v
where v.type='P' and v.number <= d.blocks
and d.[end] > dateadd(mi,v.number*15,xstart)) v
order by call, start
If creating a view off this query, drop the last [order by] line
Notes
- the expression (xstart)
dateadd(mi, 15*(datediff(mi, 0, d.start)/15), 0)
calculates the 15-minute border on which the call started
- blocks is pre-calculated as a quick cutoff so as not to process more rows than is necessary from spt_values
- cross apply allows each row in the prior table to be used in the subquery. the subquery builds every 15-minute block covering the period
- the case statements align the start and end time to the actuals, if those are inside the 15-minute borders
Richard is right, this query splits the calls into 15 minute intervals:
Try this:
With CallData ([call],start,[end]) as
(
select [call],start,case when [end]<=dateadd(minute,15,start) then [end] else dateadd(minute,15,start) end as [end] from CallLogTable
union all
select CallData.[call],CallData.[end],case when CallLogTable.[end]<=dateadd(minute,15,CallData.[end]) then CallLogTable.[end] else dateadd(minute,15,CallData.[end]) end as [end] from CallLogTable join CallData on CallLogTable.[call]=CallData.[call]
where CallData.[end]<case when CallLogTable.[end]<=dateadd(minute,15,CallData.[end]) then CallLogTable.[end] else dateadd(m,15,CallData.[end]) end
)
select * from CallData
Unfortunatelly I do not have a SQL at hand so I cannot test it. This is the idea however to make it so you will probably manage to adjust it in case it fails somewhere.
I put the aliases and the mistake was using m instead of minute. Can you try it to see if it works. TX. (that happens when not testing is done)
To split it at 15 minutes (00/15/30/45) u can use this:
With CallData ([call],start,[end]) as
(
select [call],start,case when [end]<=dateadd(minute,15*((datediff(minute,0,start)/15)+1),0) then [end] else dateadd(minute,15*((datediff(minute,0,start)/15)+1),0) end as [end] from CallLogTable
union all
select CallData.[call],CallData.[end],case when CallLogTable.[end]<=dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) then CallLogTable.[end] else dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) end as [end] from CallLogTable join CallData on CallLogTable.[call]=CallData.[call]
where CallData.[end]<case when CallLogTable.[end]<=dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) then CallLogTable.[end] else dateadd(minute,15*((datediff(minute,0,CallData.[End])/15)+1),0) end
)
select * from CallData order by [call],start
Fascinating problem!
Just for kicks, here's a PostgreSQL approach, using generate_sequence() to fill out the interior 15 minute intervals. There's undoubtedly a way to coalesce the first two unions that build the first and last intervals, but that is left as an exercise for the reader.
select
c.call
,c.dt_start - date_trunc('day', c.dt_start) as "begin"
,(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60) as "end"
from
call c
where
(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60)
<= date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60)
union select
c.call
,greatest(
c.dt_start - date_trunc('day', c.dt_start),
date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60)
) as "t_last_q"
,c.dt_end - date_trunc('day', c.dt_end) as "t_end"
from
call c
union select TQ.call, TQ.t_next_q, SEQ.SLICE
from
(select cast(g || ' seconds' as interval) as SLICE
from generate_series(0, 86400, 15*60) g) SEQ,
(select
c.call
,(date_trunc('second', (cast (c.dt_start - date_trunc('day', c.dt_start) as interval)
/ (15*60) + interval '1 second'))) * (15*60) as "t_next_q"
,date_trunc('second', (cast (c.dt_end - date_trunc('day', c.dt_end) as interval)
/ (15*60))) * (15*60) as "t_last_q"
from
call c
) TQ
where
SEQ.SLICE > TQ.t_next_q
and SEQ.SLICE <= TQ.t_last_q
There are only 1440 minutes in the day. So you could create and populate with 1440 rows a MINUTES table whose primary key is a four-digit number representing hour-minute in 24-hour format (e.g. 9:13PM would be 2113). Then you could have as many columns in that table as you needed to characterize any minute in the day: what quarter-hour it falls into, whether it is considered off-peak or peak, what its billing rate is under Plan A, and so on. You just keep adding columns as your use cases require. Totally extensible.
In your example, the first column MINUTES.QuarterHour would indicate which quarter-hour the minute fell into. 02:17 is in quarter-hour 6, for example. Once the table is populated, all you need to do then is use the HHMM chunk of your datetime value from your phone-calls table to pull-back the quarter-hour to which that time belongs, using a simple join on HHMMChunk = MINUTES.id. The advantages: queries are much simpler and easier to write and maintain and they are probably not as computationally intensive.
EDIT: the approach is also generic and portable (i.e. not implementation-specific).