Determine if a range is completely covered by a se

2019-05-23 12:36发布

问题:

How can I check if a range is completely covered by a set of ranges. In the following example:

WITH ranges(id, a, b) AS (
    SELECT 1,  0, 40 UNION
    SELECT 2, 40, 60 UNION
    SELECT 3, 80, 100 UNION
    SELECT 4, 10, 30
), tests(id, a, b) AS (
    SELECT 1, 10, 90 UNION
    SELECT 2, 10, 60
)
SELECT *
FROM tests
WHERE -- ?
  • I want to select 10, 60 because all of it is covered by 0, 40 and 40, 60 (and 10, 30)
  • I want to exclude 10, 90 because it is exposed between 60, 80

Assume that a is inclusive and b is exclusive i.e. the value 40 belongs to [40, 60) and not [0, 40). The ranges can contain gaps and all kind of overlaps.

The actual problem involves date+time data but dates are just numbers. I am using SQL server but generic solution is preferred.

回答1:

You want a recursive query finding the real ranges (0 to 60 and 80 to 100 in your case). We'd start with the ranges given and look for ranges extending these. At last we stick with the most extended ranges (e.g. the range 10 to 30 can be extended to 0 to 40 and then to 0 to 60, so we keep the widest range 0 to 60).

with wider_ranges(a, b, grp) as
(
  select a, b, id from ranges
  union all
  select
    case when r1.a < r2.a then r1.a else r2.a end,
    case when r1.b > r2.b then r1.b else r2.b end,
    r1.grp
  from wider_ranges r1
  join ranges r2 on (r2.a < r1.a and r2.b >= r1.a)
                 or (r2.b > r1.b and r2.a <= r1.b)
)
, real_ranges(a, b) as
(
  select distinct min(a), max(b)
  from wider_ranges
  group by grp
)
select * 
from tests
where exists
(
  select *
  from real_ranges
  where tests.a >= real_ranges.a and tests.b <= real_ranges.b
);

Rextester demo: http://rextester.com/BDJA16583

As requested this works in SQL Server, but is standard SQL, so it should work in about every DBMS featuring recursive queries.



回答2:

This is a recursive solution similar to Thorsten's. Just providing another example to have.

WITH ranges(id, a, b) AS (
    SELECT 1,  0, 40 UNION
    SELECT 2, 40, 60 UNION
    SELECT 3, 80, 100 UNION
    SELECT 4, 10, 30 
), tests(id, a, b) AS
(   
        SELECT 1 as id, 10 as a, 90 as b
        UNION
        SELECT 2, 10, 60
), rangeFinder(a, b, ra, rfb) AS
(
    SELECT a, b, 0 AS ra, 0 AS rfb 
    FROM ranges AS r
    UNION ALL
    SELECT rangeFinder.a, ranges.b, ranges.a, rangeFinder.b 
    FROM ranges 
    JOIN rangeFinder
        ON ranges.b > rangeFinder.b
        AND ranges.a <=rangeFinder.b
), islands(a, b) AS
(
    SELECT a, b 
    FROM rangeFinder
    WHERE a NOT IN (SELECT ra FROM rangeFinder)
        AND b NOT IN (SELECT rfb FROM rangeFinder)
)
SELECT t.id, t.a, t.b FROM 
tests t
JOIN islands i
ON t.a >= i.a
AND t.b <= i.b

Demo here: http://rextester.com/HDQ52126



回答3:

This is a general form of the solution. The idea is to do the following:

  • Get a list of all points that could not be in a range. This is all range starts and range ends.
  • Check if any of them are not in a range.

This is based on the operation that a point not in a range will be one of the numbers included in either table:

with tc as (
      select t.test, r.candidate
      from tests t join
           (select r.a as candidate from ranges union all
            select r.b from ranges
           ) r
           on r.candiate >= t.a and r.candidate < t.b
      union all
      select t.test, t.a
      from tests t
      union all
      select t.test, t.b
      from tests t
     )
select distinct tc.test
from tc
where not exists (select 1
                  from ranges r
                  where tc.candidate >= r.a and tc.candidate < r.b
                 );

Because ranges include the first item, you don't actually have to check that. So the list of candidates can be reduced:

with tc as (
      select t.test, r.candidate
      from tests t join
           (select r.b as candidate from ranges
           ) r
           on r.candidate >= t.a and r.r < t.b
      union all
      select t.test, t.a
      from tests t
      union all
      select t.test, t.b
      from tests t
     )


回答4:

As described in the accepted answer, the solution is to merge overlapping ranges together, then determine if test range is present inside one of the merged ranges.

Besides join and/or recursion, you can use the sorting approach with window functions to merge overlapping ranges:

WITH ranges(id, a, b) AS (
    SELECT 1,  0, 40 UNION
    SELECT 2, 40, 60 UNION
    SELECT 3, 80, 100 UNION
    SELECT 4, 10, 30
), tests(id, a, b) AS (
    SELECT 1, 10, 90 UNION
    SELECT 2, 10, 60
), ranges_chg AS (
    SELECT *, CASE WHEN MAX(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) >= a THEN 0 ELSE 1 END AS chg
    FROM ranges
), ranges_grp AS(
    SELECT *, SUM(chg) OVER (ORDER BY a) AS grp
    FROM ranges_chg
), merged_ranges AS (
    SELECT MIN(a) AS a, MAX(b) AS b
    FROM ranges_grp
    GROUP BY grp
)
SELECT *
FROM tests
WHERE EXISTS (
    SELECT 1
    FROM merged_ranges
    WHERE merged_ranges.a <= tests.a AND tests.b <= merged_ranges.b
)

Result and Fiddle.

| id | a  | b  |
|----|----|----|
| 2  | 10 | 60 |

The data inside range_grp CTE will give you an idea of how it works:

| id | a  | b   | max b over... | chg | grp |
|----|----|-----|---------------|-----|-----|
| 1  | 0  | 40  | NULL          | 1   | 1   |
| 4  | 10 | 30  | 40            | 0   | 1   |
| 2  | 40 | 60  | 40            | 0   | 1   |
| 3  | 80 | 100 | 60            | 1   | 2   |