creating multiple custom aggregation functions

2019-09-09 02:06发布

问题:

I have this table named Account.

A simplified view of this table is as follows:

acct_num | ssn       | branch | open_date    |close_date  | has_product1 |  has_product2      
----------------------------------------------------------------------------------------
0123456  | 123456789 | 01     | 01/01/2000   | NULL       | 1            |  0
0123457  | 123456789 | 02     | 03/05/2004   | NULL       | 0            |  1
1234405  | 322145678 | 04     | 04/16/2016   | 05/01/2016 | 1            |  1
...

Notice how the ssn 123456789 has 2 accounts.

I need to create a new data set that groups the table by acct_num and displays new columns that are calculations based on the rows in each group.

These calculations are diverse in nature though.

The table I need (in this simplified example) is as follows:

ssn       |  home_branch    | date_of_first_membership   |   eligibility_indicator
-----------------------------------------------------------------------------------

Obviously ssn is easy, but the rest are beyond me at the moment.

  • home branch is the value of branch from the row that has the earliest open_date and a non-null close_date.

  • open_dateis just the minimum open_date value in the group.

  • eligibility_status is a 1 if at least 1 open account has_product1 and at least 1 (possibly different) open account has_product2

So the result set that I am expecting from the example above is:

ssn       | home_branch     | date_of_first_membership   | eligibility_indicator
-----------------------------------------------------------------------------------
123456789 | 01              | 01/01/2000                 | 1
322145678 | 04              | 04/16/2016                 | 0

EDIT:

the comments pointed out a contradiction. To resolve this contradiction, I now want to filter out all ssn's that don't have any open accounts.

So, the new expected result set is:

ssn       | home_branch     | date_of_first_membership   | eligibility_indicator
-----------------------------------------------------------------------------------
123456789 | 01              | 01/01/2000                 | 1

回答1:

You can do this with conditional aggregation. The first calculation needs a bit of a trick -- getting the minimum date for a row with no close date:

select ssn,
       max(case when open_date = min_open_date then branch end) as home_branch,
       min(open_date) as date_of_first_membership,
       (case when max(has_product1) > 0 and max(has_product2) > 0
             then 1 else 0
        end) as eligibility_indicator
from (select a.*,
             min(case when close_date is null then open_date end)  over (partition by ssn ) as min_opendate
      from account a
     ) a
group by ssn;


回答2:

tested in 2008 sql server

create table account (
    acct_num varchar(15), 
    ssn int, 
    branch varchar(10), 
    open_date Date, 
    close_date Date, 
    has_product1 int, 
    has_product2 int, 
)
insert into account
values (0123456,123456789,01,'01/01/2000',null, 1,0),
(0123457,123456789,02,'03/05/2004',null, 0,1),
(1234405,322145678 ,04,'04/16/2016','05/01/2016', 1,1)


select *, (select branch from account where open_date = x.date_of_first_membership and ssn = x.ssn) home_branch from (
select ssn, MIN(open_date) date_of_first_membership,
case when close_date is not null then 0 ELSE
case when MAX(has_product1) > 0 and MAX(has_product2) >0 then 1 ELSE 0 end end eligibility_indicator
  from account
  where close_date is null
   group by ssn, close_date
  ) x