Allen's Interval Algebra operations in SQL

2020-05-23 20:01发布

问题:

I've been struggling to solve a few tricky problems in SQL where I need to infer asset utilisation from event intervals, and have just learned about Allen's Interval Algebra, which seems to be the key to solving these problems.

The algebra describes 13 kinds of relationships between intervals, and the image below shows the first seven, with the rest being the inverse (i.e. y before x, y meets x, etc)

But I'm having trouble finding out how to implement the relevant operations.

Given my sample data, how can I go about getting results from the following three types of operations in SQL or PLSQL?

  1. Disjoin
  2. Reduce
  3. Find Gaps

Please see my SQLFiddle link: http://sqlfiddle.com/#!4/cf0cc


Original Data

   start end width
[1]     1  12    12
[2]     8  13     6
[3]    14  19     6
[4]    15  29    15
[5]    19  24     6
[6]    34  35     2
[7]    40  46     7


Operation 1 - Disjoined Result

I'd like a query to return the disjoint set from the data above, where all overlapping intervals have been broken into rows such that no overlaps exist.

How do I go about this SQL?

     start end width
[1]      1   7     7
[2]      8  12     5
[3]     13  13     1
[4]     14  14     1
[5]     15  18     4
[6]     19  19     1
[7]     20  24     5
[8]     25  29     5
[9]     34  35     2
[10]    40  46     7


Operation 2 - Reduced Result

How do I go about reducing/flattening the intervals, such that they are:

  • not empty (i.e. they have a non-null width);
  • not overlapping;
  • ordered from left to right;
  • not even adjacent (i.e. there must be a non empty gap between 2 consecutive ranges)

For my example, this would look like:

    start end width
[1]     1  29    29
[2]    34  35     2
[3]    40  46     7


Operation 3 - Gap Result

Also, how would I find the gaps?

   start end width
[1]    30  33     4
[2]    36  39     4

回答1:

Here is a SQLFiddle demo First of all create temp tables to simplify queries though you can put these creation queries into final queries and do it without temp tables:

create table t as select * from
(
select null s ,"start"-1 as e  from data
union all
select "start" s,null e  from data
union all
select "end"+1 s ,null e  from data
union all
select null s ,"end" e  from data
) d where exists (select "start" 
                  from data where d.s between data."start" and data."end"
                               or d.e between data."start" and data."end"
                                );
--Operation 1 - Disjoined Result   
create table t1 as select s,e,e-s+1 width from
(
select distinct s,(select min(e) from t where t.e>=t1.s) e from t t1
) t2 where t2.s is not null and t2.e is not null;

--Operation 2 - Reduced Result
create table t2 as 
select s,e,e-s+1 width from
(
select s,(select min(d2.e) from t1 d2 where d2.s>=d.s and not exists
          (select s from t1 where t1.s=d2.e+1) ) e
from
t1 d where not exists(select s from t1 where t1.e=d.s-1) 
) t2;

--Temp table for Operation 3 - Gaps
create table t3 as 
select null s, s-1 e from t2
union all
select e+1 s, null e from t2;

Now here are queries:

--Operation 1 - Disjoined Result
select * from t1 order by s;

--Operation 2 - Reduced Result


select * from t2 order by s;

--Operation 3 - Gaps

select s,e,e-s+1 width 
from
(
select s,(select min(e) from t3 where t3.e>=d.s) e from t3 d
) t4 where s is not null and e is not null
order by s;