Oracle SQL - DENSE_RANK

2019-02-24 07:14发布

问题:

I have a Client data table, selected columns of which are shown below:

Row_ID  Client_ID  Status_ID  From_date           To_date
  1     123456       4        20/12/2007 18:02    20/12/2007 18:07
  2     789087       4        20/12/2007 18:02    20/12/2007 18:07
  3     789087       4        20/12/2007 18:07    20/12/2007 18:50
  4     789087       4        20/12/2007 18:50    21/12/2007 10:38
  5     123456       4        20/12/2007 18:07    20/12/2007 18:50
  6     123456       4        20/12/2007 18:50    21/12/2007 10:38
  7     123456       4        21/12/2007 10:38    21/12/2007 16:39
  8     789087       4        21/12/2007 10:38    21/12/2007 17:54
  9     789087       4        21/12/2007 17:54    21/12/2007 18:32
 10     789087       4        21/12/2007 18:32    22/12/2007 06:48
 11     123456       5        21/12/2007 16:39
 12     789087       5        22/12/2007 06:48    22/12/2007 10:53
 13     789087       4        22/12/2007 10:53    22/12/2007 11:51
 14     789087       5        22/12/2007 11:51  

After putting the data into ascending order by Client_ID and then by From_date, my objective is to add a calculated Rank_ID every time there is a change in the status for that client when comparing the status to the previous line. The desired values I want for the Rank_ID are shown below:

Row_ID  Client_ID  Status_ID  From_date           To_date            Rank_ID
  1     123456       4        20/12/2007 18:02    20/12/2007 18:07    1
  5     123456       4        20/12/2007 18:07    20/12/2007 18:50    1
  6     123456       4        20/12/2007 18:50    21/12/2007 10:38    1
  7     123456       4        21/12/2007 10:38    21/12/2007 16:39    1
 11     123456       5        21/12/2007 16:39                        2
  2     789087       4        20/12/2007 18:02    20/12/2007 18:07    3
  3     789087       4        20/12/2007 18:07    20/12/2007 18:50    3
  4     789087       4        20/12/2007 18:50    21/12/2007 10:38    3
  8     789087       4        21/12/2007 10:38    21/12/2007 17:54    3
  9     789087       4        21/12/2007 17:54    21/12/2007 18:32    3
 10     789087       4        21/12/2007 18:32    22/12/2007 06:48    3
 12     789087       5        22/12/2007 06:48    22/12/2007 10:53    4
 13     789087       4        22/12/2007 10:53    22/12/2007 11:51    5
 14     789087       5        22/12/2007 11:51                        6

I am trying to use DENSE_RANK as an analytical function, my "incorrect" SQL code being below

SELECT t1.*, DENSE_RANK () OVER (ORDER BY t1.client_id, t1.status_id) rank_id
FROM (SELECT c.client_ID, c.status_id, c.from_date, c.to_date
      FROM client c
      ORDER BY c.client_id, c.from_date) t1
ORDER BY t1.client_id, t1.from_date

However, the problem I am encountering is that it given the SQL code as written it calculates the Rank_ID as follows:

Row_ID  Client_ID  Status_ID  From_date           To_date            Rank_ID
  1     123456       4        20/12/2007 18:02    20/12/2007 18:07    1
  5     123456       4        20/12/2007 18:07    20/12/2007 18:50    1
  6     123456       4        20/12/2007 18:50    21/12/2007 10:38    1
  7     123456       4        21/12/2007 10:38    21/12/2007 16:39    1
 11     123456       5        21/12/2007 16:39                        2
  2     789087       4        20/12/2007 18:02    20/12/2007 18:07    3
  3     789087       4        20/12/2007 18:07    20/12/2007 18:50    3
  4     789087       4        20/12/2007 18:50    21/12/2007 10:38    3
  8     789087       4        21/12/2007 10:38    21/12/2007 17:54    3
  9     789087       4        21/12/2007 17:54    21/12/2007 18:32    3
 10     789087       4        21/12/2007 18:32    22/12/2007 06:48    3
 12     789087       5        22/12/2007 06:48    22/12/2007 10:53    4
 13     789087       4        22/12/2007 10:53    22/12/2007 11:51    3
 14     789087       5        22/12/2007 11:51                        4

For record 13, the returned Rank_ID is 3 (whereas I want 5 at it is a change in status for that client when compared to the status on the previous record for that client) and for record 14 the returned Rank_ID is 4, whereas I want 6 as it is again a change in status for that client when compared to the previous line.

I guess the problem is that my SQL orders the data by Client_ID and then by Status_ID, and so I can see why it produces the answers it gives. The problem is no matter what changes I make to the DENSE_RANK line, I can't obtain the answer I want.

Any help would be appreciated.

回答1:

As I understood, this is what you need:

select client_ID, status_id, from_date, to_date, 
       sum(start_of_group) over (order by client_ID, from_date) + 1 rank
  from (SELECT c.client_ID, c.status_id, c.from_date, c.to_date,
               case when lag(c.client_ID, 1, c.client_ID) over (order by c.client_ID, c.from_date) = c.client_ID 
                     and lag(c.status_id, 1, c.status_id) over (order by c.client_ID, c.from_date) = c.status_id
                    then 0 else 1 end start_of_group
          FROM client c)
 order by client_ID, from_date

SQLFiddle



回答2:

the thing is that you need to partition your ranking over the CHANGE of the status, not the VALUE of the status. I'm leaving some extra columns in the output so that you can see how it is all derived:

WITH dat as (
 SELECT 1 row_id,     123456 client_id,      4 status,       to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') frdate,    to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   2 row_id,     789087  client_id,       4 status,        to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') frdate,    to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   3 row_id,     789087 client_id,        4  status,       to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') frdate,    to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   4 row_id,     789087 client_id,        4 status,        to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') frdate,    to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   5  row_id,    123456 client_id,        4 status,        to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') frdate,    to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   6 row_id,     123456 client_id,        4 status,        to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') frdate,    to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   7  row_id,    123456 client_id,        4 status,        to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') frdate,    to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   8 row_id,     789087  client_id,       4  status,       to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') frdate,    to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT   9 row_id,     789087  client_id,       4 status,        to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') frdate,    to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT  10 row_id,     789087 client_id,        4 status,        to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') frdate,    to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT  11 row_id,     123456  client_id,       5 status,        to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') frdate,    null from dual union all
 SELECT  12 row_id,     789087 client_id,        5 status,        to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') frdate,    to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT  13 row_id,     789087 client_id,        4  status,       to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') frdate,    to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') todate from dual union all
 SELECT  14 row_id,     789087 client_id,        5 status,        to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') frdate,    null from dual)
SELECT t1.*, DENSE_RANK () OVER (ORDER BY t1.client_id, t1.chg_status) rank_id
FROM (select client_id, status, prev_status, sum(case when nvl(prev_status,-1) != status then 1 else 0 end) over (partition by client_id order by frdate) chg_status, frdate, todate
      from (
      SELECT c.client_ID
           , c.status
           , lag(status) over (partition by client_id order by frdate) as prev_status
           , c.frdate
           , c.todate
      FROM dat c
      ORDER BY c.client_id, c.frdate)) t1
ORDER BY t1.client_id, t1.frdate

Returns:

CLIENT_ID, STATUS, PREV_STATUS, CHG_STATUS,  FRDATE,                 TODATE,              RANK_ID
123456,    4,      ,            1,           20/12/2007 6:02:00 PM, 20/12/2007 6:07:00 PM, 1
123456,    4,      4,            1,          20/12/2007 6:07:00 PM, 20/12/2007 6:50:00 PM, 1
123456,    4,      4,            1,          20/12/2007 6:50:00 PM, 21/12/2007 10:38:00 AM, 1
123456,    4,      4,            1,          21/12/2007 10:38:00 AM, 21/12/2007 4:39:00 PM, 1
123456,    5,      4,            2,          21/12/2007 4:39:00 PM,,                       2
789087,    4,      ,            1,           20/12/2007 6:02:00 PM,20/12/2007 6:07:00 PM, 3
789087,    4,      4,            1,          20/12/2007 6:07:00 PM,20/12/2007 6:50:00 PM, 3
789087,    4,      4,            1,          20/12/2007 6:50:00 PM, 21/12/2007 10:38:00 AM, 3
789087,    4,      4,            1,          21/12/2007 10:38:00 AM, 21/12/2007 5:54:00 PM, 3
789087,    4,      4,            1,          21/12/2007 5:54:00 PM, 21/12/2007 6:32:00 PM, 3
789087,    4,      4,            1,          21/12/2007 6:32:00 PM,22/12/2007 6:48:00 AM, 3
789087,    5,      4,            2,          22/12/2007 6:48:00 AM, 22/12/2007 10:53:00 AM, 4
789087,    4,      5,            3,          22/12/2007 10:53:00 AM, 22/12/2007 11:51:00 AM, 5
789087,    5,      4,            4,          22/12/2007 11:51:00 AM,,                      6


回答3:

Simple mark all status changes for each client with 1 (column GRP below). Than add those number using analytic SUM function:

with tab1 as (
select client_id,from_date, status,
 nvl(lag(status) over (partition by client_id  order by from_date),-1) status_lag,
 case when (nvl(lag(status) over (partition by client_id  order by from_date),-1) <> status) then 
 1 end grp
from tst
)
, tab2 as (
select client_id,from_date, status,status_lag, grp,
sum(grp) over (partition by client_id  order by from_date) as RANK
from tab1
)
select * from tab2;

gives as expected

 CLIENT_ID FROM_DATE               STATUS STATUS_LAG        GRP       RANK
---------- ------------------- ---------- ---------- ---------- ----------
      1001 01.10.2015 00:00:00          1         -1          1          1 
      1001 02.10.2015 00:00:00          1          1                     1 
      1001 03.10.2015 00:00:00          2          1          1          2 
      1001 04.10.2015 00:00:00          2          2                     2 
      1001 05.10.2015 00:00:00          3          2          1          3 
      1001 09.10.2015 00:00:00          1          3          1          4 
      1002 12.10.2015 00:00:00          1         -1          1          1 
      1002 13.10.2015 00:00:00          3          1          1          2 
      1002 15.10.2015 00:00:00          3          3                     2

my setup

 create table tst 
 (client_id number,
 from_date date,
 status number);

 insert into tst values (1001, to_date('01-10-15','dd-mm-rr'),1);
 insert into tst values (1001, to_date('02-10-15','dd-mm-rr'),1);
 insert into tst values (1001, to_date('03-10-15','dd-mm-rr'),2);
 insert into tst values (1001, to_date('04-10-15','dd-mm-rr'),2);
 insert into tst values (1001, to_date('05-10-15','dd-mm-rr'),3);
 insert into tst values (1001, to_date('09-10-15','dd-mm-rr'),1);
 insert into tst values (1002, to_date('12-10-15','dd-mm-rr'),1);
 insert into tst values (1002, to_date('13-10-15','dd-mm-rr'),3); 
 insert into tst values (1002, to_date('15-10-15','dd-mm-rr'),3);
 commit;


回答4:

Here's a solution using Tabibitosan:

with client as (select 1 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
                select 2 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:02','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') todate from dual union all
                select 3 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
                select 4 row_id, 789087 client_id, 4 status_id, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
                select 5 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:07','dd/mm/yyyy hh24:mi') from_date, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') todate from dual union all
                select 6 row_id, 123456 client_id, 4 status_id, to_date('20/12/2007 18:50','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') todate from dual union all
                select 7 row_id, 123456 client_id, 4 status_id, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') todate from dual union all
                select 8 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 10:38','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') todate from dual union all
                select 9 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 17:54','dd/mm/yyyy hh24:mi') from_date, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') todate from dual union all
                select 10 row_id, 789087 client_id, 4 status_id, to_date('21/12/2007 18:32','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') todate from dual union all
                select 11 row_id, 123456 client_id, 5 status_id, to_date('21/12/2007 16:39','dd/mm/yyyy hh24:mi') from_date, null from dual union all
                select 12 row_id, 789087 client_id, 5 status_id, to_date('22/12/2007 06:48','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') todate from dual union all
                select 13 row_id, 789087 client_id, 4 status_id, to_date('22/12/2007 10:53','dd/mm/yyyy hh24:mi') from_date, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') todate from dual union all
                select 14 row_id, 789087 client_id, 5 status_id, to_date('22/12/2007 11:51','dd/mm/yyyy hh24:mi') from_date, null from dual)
select row_id,
       client_id,
       status_id,
       from_date,
       todate,
       dense_rank() over (order by client_id, status_id, grp) rank_id
from   (select row_id,
               client_id,
               status_id,
               from_date,
               todate,
               row_number() over (order by client_id, from_date) - row_number() over (partition by status_id order by client_id, from_date) grp -- this is the tabibitosan step
        from   client) t1
order by client_id, from_date, status_id;

    ROW_ID  CLIENT_ID  STATUS_ID FROM_DATE        TODATE              RANK_ID
---------- ---------- ---------- ---------------- ---------------- ----------
         1     123456          4 20/12/2007 06:02 20/12/2007 06:07          1
         5     123456          4 20/12/2007 06:07 20/12/2007 06:50          1
         6     123456          4 20/12/2007 06:50 21/12/2007 10:38          1
         7     123456          4 21/12/2007 10:38 21/12/2007 04:39          1
        11     123456          5 21/12/2007 04:39                           2
         2     789087          4 20/12/2007 06:02 20/12/2007 06:07          3
         3     789087          4 20/12/2007 06:07 20/12/2007 06:50          3
         4     789087          4 20/12/2007 06:50 21/12/2007 10:38          3
         8     789087          4 21/12/2007 10:38 21/12/2007 05:54          3
         9     789087          4 21/12/2007 05:54 21/12/2007 06:32          3
        10     789087          4 21/12/2007 06:32 22/12/2007 06:48          3
        12     789087          5 22/12/2007 06:48 22/12/2007 10:53          5
        13     789087          4 22/12/2007 10:53 22/12/2007 11:51          4
        14     789087          5 22/12/2007 11:51                           6