SQL JOIN - retrieve MAX DateTime from second table

2019-09-19 08:37发布

问题:

I have issue with creating a proper SQL expression.

I have table TICKET with column TICKETID

TICKETID    
1000
1001

I then have table STATUSHISTORY from where I need to retrieve what was the last time (maximum time) when that ticket entered VENDOR status (last VENDOR status) and when it exited VENDOR status (by exiting VENDOR status I mean the first next INPROG status, but only first INPROG after the VENDOR status, it's always INPROG the next status after VENDOR status). Also it is also possible that VENDOR status for ID does not exist at all in STATUSHISOTRY (then nulls should be returned), but INPROG exists always - it can be before but also and after VENDOR status, if ID is not anymore in VENDOR status. Here is the example of STATUSHISTORY.

ID  TICKETID    STATUS    DATETIME
1   1000        INPROG    01.01.2017 10:00
2   1000        VENDOR    02.01.2017 10:00
3   1000        INPROG    03.01.2017 10:00
4   1000        VENDOR    04.01.2017 10:00
5   1000        INPROG    05.01.2017 10:00
6   1000        HOLD      06.01.2017 10:00
7   1000        INPROG    07.01.2017 10:00
8   1001        INPROG    02.02.2017 10:00
9   1001        VENDOR    03.02.2017 10:00
10  1001        INPROG    04.02.2017 10:00
11  1001        VENDOR    05.02.2017 10:00

So the result when doing the query from TICKET table and doing the JOIN with table STATUSHISTORY should be:

ID     VENDOR_ENTERED      VENDOR_EXITED
1000   04.01.2017 10:00    05.01.2017 10:00
1001   05.02.2017 10:00    null

Because for ID 1000 last VENDOR status was at 04.01.2017 and the first INPROG status after the VENDOR status for that ID was at 05.01.2017 while for ID 1001 the last VENDOR status was at 05.02.2017 and after that INPROG status did not happen yet. If VENDOR did not exist then both columns should be null in result. I am really stuck with this, trying different JOINs but without any progress. Thank you in advance if you can help me.

回答1:

You can do this with window functions. First, assign a "vendor" group to the tickets. You can do this using a cumulative sum counting the number of "vendor" records on or before each record.

Then, aggregate the records to get one record per "vendor" group. And use row numbers to get the most recent records. So:

with vg as (
      select ticket,
             min(datetime) as vendor_entered,
             min(case when status = 'INPROG' then datetime end) as vendor_exitied
      from (select sh.*,
                   sum(case when status = 'VENDOR' then 1 else 0 end) over (partition by ticketid order by datetime) as grp
            from statushistory sh
           ) sh
      group by ticket, grp
     )
select vg.tiketid, vg.vendor_entered, vg.vendor_exited
from (select vg.*,
             row_number() over (partition by ticket order by vendor_entered desc) as seqnum
      from vg
     ) vg
where seqnum = 1;


回答2:

You can aggregate to get max time, then join onto all of the date values higher than that time, and then re-aggregate:

select  a.TicketID,
        a.VENDOR_ENTERED,
        min( EXIT_TIME ) as VENDOR_EXITED
from    (
          select  TicketID,
                  max( DATETIME ) as VENDOR_ENTERED
          from    StatusHistory
          where   Status = 'VENDOR'
          group by TicketID
        ) as a
        left join
        (
          select  TicketID,
                  DATETIME as EXIT_TIME
          from    StatusHistory
          where   Status = 'INPROG'
        ) as b
        on a.TicketID = b.TicketID
        and EXIT_TIME >= a.VENDOR_ENTERED
group by a.TicketID,
        a.VENDOR_ENTERED

DB2 is not supported in SQLfiddle, but a standard SQL example can be found here.



标签: sql db2 max