Coalescing date ranges in postgres to eliminate ov

2019-09-07 04:06发布

If I have a postgres table a:

  member |    start   |    end
---------+------------+------------
    1    | 2015-01-01 | 2015-05-01
---------+------------+------------
    1    | 2015-03-01 | 2015-06-01
---------+------------+------------
    2    | 2015-01-01 | 2015-05-01
---------+------------+------------
    2    | 2015-06-01 | 2015-08-01

How would I coalesce dates to eliminate overlapping ranges like this:

  member |    start   |    end
---------+------------+------------
    1    | 2015-01-01 | 2015-06-01
---------+------------+------------
    2    | 2015-01-01 | 2015-05-01
---------+------------+------------
    2    | 2015-06-01 | 2015-08-01

1条回答
趁早两清
2楼-- · 2019-09-07 04:58

In the chop CTE original ranges are "chopped" into smaller, non-intersecting (but possibly adjacent) ranges. They are constructed from all the end points of the original ranges, both start and finish.

Main select works as follows (read it from the inside out):

  1. Adjacent flag for a range is zero when it adjoins the previous range (assuming that ranges are ordered by their start dates).
  2. A cumulative sum of adjacent flags gives us a grouping value: all the adjacent ranges will have the same sum.
  3. Outermost block simply calculates the bounding values for the adjacent groups of ranges.

Black magic of window functions...

with chop as (
  select member,
         pt as start,
         lead(pt) over (partition by member order by pt) finish,
         (
           select count(*)
           from   a
           where  b.member = a.member
           and    b.pt >= a.start
           and    b.pt < a.finish
         ) need_it
  from   (
           select member, start pt from a
           union
           select member, finish pt from a
         ) b
)
-- 3
select member,
       min(start),
       max(finish)
from   (
         -- 2
         select member,
                start,
                finish,
                sum(adjacent) over (partition by member order by start) grp
         from   (
                  -- 1
                  select member,
                         start,
                         finish,
                         case
                           when start <= lag(finish) over (partition by member order by start)
                           then 0
                           else 1
                         end adjacent
                  from   chop
                  where  need_it > 0
                ) t
       ) q
group by member,
         grp
order by member,
         min(start);

I renamed end to finish because end is a keyword.

查看更多
登录 后发表回答