Summing range of dates without counting overlaps i

2020-05-06 21:05发布

I have simple table with start_date and end_date columns in it. These date values may overlap

id    start_date    end_date
1     2011-01-01    2012-04-01
2     2012-05-01    2013-10-01
3     2013-09-01    2014-09-01
4     2013-10-01    2014-08-01
5     2013-12-01    2014-11-01
6     2013-09-01    2014-09-01
7     2015-01-01    2015-11-01

Problem is to find sum in months. Example: id: 2,3,4,5,6 overlap so idea is to take MAX end_date and MIN start_date of 2,3,4,5,6 and add date difference of 1, and of 7.

At this time: I've found how to estimate date difference in months:

PERIOD_DIFF( DATE_FORMAT(end_date, '%Y%m') , DATE_FORMAT(start_date, '%Y%m') )

I know that idea here is to:

  1. Understand whether two dates overlap or not. And if yes, then merge dates accordingly(adjust end and start dates if needed)
  2. Loop through all dates, estimate date difference in months, sum and return final result.

I've been looking for similar questions and couldn't resolve and issue, would be nice if you could help me. I know it is possible to do using some programming language and estimate it there, but wanted to write it using MySQL query.

Thanks

标签: mysql
3条回答
聊天终结者
2楼-- · 2020-05-06 21:24

If you need the total periods for a number of records including the overlaps, then simply sum the period differences per record:

SELECT SUM(PERIOD_DIFF( DATE_FORMAT(end_date, '%Y%m') , DATE_FORMAT(start_date, '%Y%m') )) AS total_periods
FROM table WHERE ...
查看更多
家丑人穷心不美
3楼-- · 2020-05-06 21:34

I did it my own way checking other answers here on Stackoverflow, it should work:

select sum(months)
from (select t.*, 
         @time := if(@sum = 0, 0, period_diff(date_format(start_date, '%Y%m'), date_format(@prevtime, '%Y%m'))) as months,
         @prevtime := start_date,
         @sum := @sum + isstart
  from ((select start_date, 1 as isstart
         from position t
        ) union all
        (select end_date, -1
         from position t
        )
       ) t cross join
       (select @sum := 0, @time := 0, @prevtime := 0) vars 
  order by 1, 2
 ) t
查看更多
Animai°情兽
4楼-- · 2020-05-06 21:38

This is hectic as anything but should get you what you need:

SELECT SUM(PERIOD_DIFF(EXTRACT(YEAR_MONTH FROM a.end_date), EXTRACT(YEAR_MONTH FROM a.start_date))) months
  FROM (
    SELECT MIN(g.start_date) start_date, MAX(g.end_date) end_date 
      FROM (
        SELECT @group_id := @group_id + (@end_date IS NULL OR o.start_date > @end_date) group_id,
               start_date,
               @end_date := DATE(CASE 
                 WHEN (@end_date IS NULL OR o.start_date > @end_date) THEN o.end_date
                 ELSE GREATEST(o.end_date, @end_date)
               END) end_date  
          FROM overlap o
          JOIN (SELECT @group_id := 0, @end_date := NULL) init
      ORDER BY o.start_date ASC  
            ) g
  GROUP BY  g.group_id  
        ) a

The inner-most query groups together your periods in overlapping groups stretching the end_date where appropriate. The end_date flexes as I assumed that there could be periods completely enclosed by the previous.

The next wrapping query extracts the full range from each group.

The outer query sums up the full month diffs for each group. All group diffs are rounded down to the nearest full month by PERIOD_DIFF.

Unfortunately I couldn't test this as SQLFiddle has died on me.

查看更多
登录 后发表回答