How to find most overlapping time period with date

2020-06-19 10:04发布

Suppose you have a table with an identifier, a start time and an end time. These start and end times can be any length of time. The start time is always before the end time. Assume there are no nulls.

What kind of query would tell me the most "popular" time, i.e. where the two ranges in each row overlaps with the most other rows?

The real life application of this is that it's a table recording users' sign in and sign out times. I want to write a query that will tell me when the most concurrent users were logged in and see what period of time this was.

Thank you.

2条回答
Luminary・发光体
2楼-- · 2020-06-19 10:26

There are several approaches to this. One uses correlated subqueries. That isn't much fun. Instead, let's use the cumulative sum method because you have Oracle.

The key is to start with a list of timestamps with a value of +1 for a start and -1 for an end. This is easy:

select t.*
from ((select starttime as thetime, 1 as value from table t) union all
      (select endtime, -1 as value from table t)
     ) t

Now, the cumulative sum of the value tells you the number of active overlaps at any given time:

select t.*, sum(value) over (order by thetime) as numactives
from ((select starttime as thetime, 1 as value from table t) union all
      (select endtime, -1 as value from table t)
     ) t

This solves your problem. You probably want to add an order by numactives desc for the specific times.

查看更多
三岁会撩人
3楼-- · 2020-06-19 10:37

Here's an example solution using a simple self-join and a GROUP BY:

WITH d(id, t1, t2) AS (
    SELECT 1, date '2010-01-01', date '2010-03-01' FROM DUAL UNION ALL
    SELECT 2, date '2010-02-01', date '2010-04-01' FROM DUAL UNION ALL
    SELECT 3, date '2010-02-01', date '2010-04-01' FROM DUAL UNION ALL
    SELECT 4, date '2010-01-01', date '2010-01-03' FROM DUAL UNION ALL
    SELECT 5, date '2011-01-01', date '2011-02-15' FROM DUAL
)
SELECT d1.id, d1.t1, d1.t2, 
       COUNT(*) "Overlap count", 
       LISTAGG('[' || d2.t1 || ', ' || d2.t2 || ']', ', ')
       WITHIN GROUP (ORDER BY d2.id) "Overlapping intervals"
FROM d d1 
LEFT OUTER JOIN d d2 
ON d2.t1 <= d1.t2 AND d1.t1 <= d2.t2
GROUP BY d1.id, d1.t1, d1.t2
ORDER BY COUNT(*) DESC

The "Overlapping intervals" aggregation is for illustration only.

SQLFiddle

... with output:

| ID | OVERLAP COUNT |                                                                          OVERLAPPING INTERVALS |
|----|---------------|------------------------------------------------------------------------------------------------|
|  1 |             4 | [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10], [01-JAN-10, 03-JAN-10] |
|  2 |             3 |                         [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10] |
|  3 |             3 |                         [01-JAN-10, 01-MAR-10], [01-FEB-10, 01-APR-10], [01-FEB-10, 01-APR-10] |
|  4 |             2 |                                                 [01-JAN-10, 01-MAR-10], [01-JAN-10, 03-JAN-10] |
|  5 |             1 |                                                                         [01-JAN-11, 15-FEB-11] |
查看更多
登录 后发表回答