Compare multiple date ranges

2019-04-16 03:24发布

问题:

I am using iReport 3.0.0, PostgreSQL 9.1. For a report I need to compare date ranges from invoices with date ranges in filters and print for every invoice code if a filter range is covered, partially covered, etc. To complicate things, there can be multiple date ranges per invoice code.

Table Invoices

ID  Code    StartDate   EndDate
1   111     1.5.2012    31.5.2012
2   111     1.7.2012    20.7.2012
3   111     25.7.2012   31.7.2012
4   222     1.4.2012    15.4.2012
5   222     18.4.2012   30.4.2012

Examples

Filter: 1.5.2012. - 5.6.2012.
Result that I need to get is:

code 111 - partialy covered 
code 222 - invoice missing

Filter: 1.5.2012. - 31.5.2012.

code 111 - fully covered
code 222 - invoice missing

Filter: 1.6.2012. - 30.6.2012.

code 111 -  invoice missing
code 222 -  invoice missing

回答1:

After clarification in comment.

Your task as I understand it:

Check for all supplied individual date ranges (filter) whether they are are covered by the combined date ranges of sets of codes in your table (invoice).

It can be done with plain SQL, but it is not a trivial task. The steps could be:

  1. Supply date ranges as filters.

  2. Combine date ranges in invoice table per code. Can result in one or more ranges per code.

  3. Look for overlaps between filters and combined invoices

  4. Classify: fully covered / partially covered. Can result in one full coverage, one or two partial coverages or no coverage. Reduce to maximum level of coverage.

  5. Display one row for every combination of (filter, code) with the resulting coverage, in a sensible sort order

Ad hoc filter ranges

WITH filter(filter_id, startdate, enddate) AS (
    VALUES
      (1, '2012-05-01'::date, '2012-06-05'::date) -- list filters here.
     ,(2, '2012-05-01', '2012-05-31')
     ,(3, '2012-06-01', '2012-06-30')
    )
SELECT * FROM filter;

Or put them in a (temporary) table and use the table instead.

Combine overlapping / adjacent date ranges per code

WITH a AS (
    SELECT code, startdate, enddate
          ,max(enddate) OVER (PARTITION BY code ORDER BY startdate) AS max_end
-- Calculate the cumulative maximum end of the ranges sorted by start
    FROM   invoice
    ), b AS (
    SELECT *
          ,CASE WHEN lag(max_end) OVER (PARTITION BY code
                                        ORDER BY startdate) + 2 > startdate
-- Compare to the cumulative maximum end of the last row.
-- Only if there is a gap, start a new group. Therefore the + 2.
           THEN 0 ELSE 1 END AS step
    FROM   a
    ), c AS (
    SELECT code, startdate, enddate, max_end
          ,sum(step) OVER (PARTITION BY code ORDER BY startdate) AS grp
-- Members of the same date range end up in the same grp
-- If there is a gap, the grp number is incremented one step
    FROM   b
    )
SELECT code, grp
      ,min(startdate) AS startdate
      ,max(enddate) AS enddate
FROM   c
GROUP  BY 1, 2
ORDER  BY 1, 2

Alternative final SELECT (may be faster or not, you'll have to test):

SELECT DISTINCT code, grp
          ,first_value(startdate) OVER w AS startdate
          ,last_value(enddate) OVER w AS enddate
FROM   c
WINDOW W AS (PARTITION BY code, grp ORDER BY startdate
             RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) 
ORDER  BY 1, 2;

Combine to one query

WITH 
    -- supply one or more filter values
    filter(filter_id, startdate, enddate) AS (
    VALUES
      (1, '2012-05-01'::date, '2012-06-05'::date) -- cast values in first row
     ,(2, '2012-05-01', '2012-05-31')
     ,(3, '2012-06-01', '2012-06-30')
    )
    -- combine date ranges per code
    ,a AS (
    SELECT code, startdate, enddate
          ,max(enddate) OVER (PARTITION BY code ORDER BY startdate) AS max_end
    FROM   invoice
    ), b AS (
    SELECT *
          ,CASE WHEN (lag(max_end) OVER (PARTITION BY code ORDER BY startdate)
                      + 2) > startdate THEN 0 ELSE 1 END AS step
    FROM   a
    ), c AS (
    SELECT code, startdate, enddate, max_end
          ,sum(step) OVER (PARTITION BY code ORDER BY startdate) AS grp
    FROM   b
    ), i AS ( -- substitutes original invoice table
    SELECT code, grp
          ,min(startdate) AS startdate
          ,max(enddate) AS enddate
    FROM   c
    GROUP  BY 1, 2
    )
    -- match filters
    , x AS (
    SELECT f.filter_id, i.code
            ,bool_or(f.startdate >= i.startdate
              AND f.enddate   <= i.enddate) AS full_cover
    FROM   filter f
    JOIN   i ON i.enddate >= f.startdate
            AND i.startdate <= f.enddate -- only overlapping
    GROUP  BY 1,2
    )
SELECT f.*, i.code
      ,CASE x.full_cover
        WHEN TRUE  THEN 'fully covered'
        WHEN FALSE THEN 'partially covered'
        ELSE            'invoice missing'
       END AS covered
FROM   (SELECT DISTINCT code FROM i) i
CROSS  JOIN filter f -- all combinations of filter and code
LEFT   JOIN x USING (filter_id, code)    -- join in overlapping
ORDER  BY filter_id, code;

Tested and works for me on PostgreSQL 9.1.