Full outer join not returning all rows?

2020-05-09 19:00发布

问题:

I have a table that contains multiple records for multiple dates.

I am trying to see the difference between "date 1" and "date 2" and my full outer join is not returning the data I was expecting.

I know there are 13278 rows on date 1 and 13282 on date 2 - therefore I would expect to see at least 13282 rows, but I get back 13195...which is an INNER JOIN (I tested this).

I am hoping for results like:

001     000123    009    NULL    1000
001     000124    009    1000    1000
001     000125    009    1000    1000
001     000126    009    1000    NULL

but this don't get any of the null rows from either side?

SELECT 
    COALESCE(c.AccountBranch, p.AccountBranch)
    , COALESCE(c.AccountNumber, p.AccountNumber)
    , COALESCE(c.AccountSuffix, p.AccountSuffix)
    , c.PrincipleAmount
    , p.PrincipleAmount
    FROM ADStaging..cb_account_extension_principle_dpd c
    FULL OUTER JOIN ADStaging..cb_account_extension_principle_dpd p
        ON p.AccountBranch = c.AccountBranch
        AND p.AccountNumber = c.AccountNumber
        AND p.AccountSuffix = c.AccountSuffix
WHERE 
    (c.BusinessDataDate IS NULL OR c.BusinessDataDate = @CurrentBusinessDataDate)
    AND
    (p.BusinessDataDate IS NULL OR p.BusinessDataDate = @PreviousBusinessDataDate)

This works - combining the "key" for the join in two separate select statements?

SELECT
      COALESCE(C.Account, P.Account) AS Account
    , COALESCE(C.AccountBranch, P.AccountBranch) as AccountBranch
    , COALESCE(C.AccountNumber, P.AccountNumber) as AccountNumber
    , COALESCE(C.AccountSuffix, P.AccountSuffix) as AccountSuffix
    , P.PrincipleAmount AS PreviousAmount
    , C.PrincipleAmount AS CurrentAmount
    , ISNULL(C.PrincipleAmount, P.PrincipleAmount) - ISNULL(P.PrincipleAmount,0)
FROM 
(SELECT 
    (pd.AccountBranch + pd.AccountNumber + pd.AccountSuffix) AS Account
    , pd.AccountBranch
    , pd.AccountNumber
    , pd.AccountSuffix
    , pd.PrincipleAmount
FROM ADStaging..cb_account_extension_principle_dpd pd
WHERE pd.BusinessDataDate = @CurrentBusinessDataDate) C
FULL OUTER JOIN 
(SELECT 
    (pd.AccountBranch + pd.AccountNumber + pd.AccountSuffix) AS Account
    , pd.AccountBranch
    , pd.AccountNumber
    , pd.AccountSuffix
    , pd.PrincipleAmount
FROM ADStaging..cb_account_extension_principle_dpd pd
WHERE pd.BusinessDataDate = @PreviousBusinessDataDate) P
    ON P.Account = C.Account
WHERE 
    (P.PrincipleAmount IS NULL OR C.PrincipleAmount IS NULL)
    OR
    P.PrincipleAmount <> C.PrincipleAmount

But this doesn't - joining on the combined values - only when they are separate tables?

SELECT 
    COALESCE(c.AccountBranch, p.AccountBranch)
    , COALESCE(c.AccountNumber, p.AccountNumber)
    , COALESCE(c.AccountSuffix, p.AccountSuffix)
    , c.PrincipleAmount
    , p.PrincipleAmount
    FROM ADStaging..cb_account_extension_principle_dpd c
    FULL OUTER JOIN ADStaging..cb_account_extension_principle_dpd p
        ON (p.AccountBranch + p.AccountNumber + p.AccountSuffix)
        = (c.AccountBranch + c.AccountNumber + c.AccountSuffix)
WHERE 
    (c.BusinessDataDate = @CurrentBusinessDataDate)
    AND
    (p.BusinessDataDate = @PreviousBusinessDataDate)

回答1:

The where is killing the outer join.
A column cannot be both null and = to a value.
Put the conditions in the join.

SELECT COALESCE(c.AccountBranch, p.AccountBranch)
     , COALESCE(c.AccountNumber, p.AccountNumber)
     , COALESCE(c.AccountSuffix, p.AccountSuffix)
     , c.PrincipleAmount, p.PrincipleAmount
FROM            cb_account_extension_principle_dpd c
FULL OUTER JOIN cb_account_extension_principle_dpd p
  ON p.AccountBranch = c.AccountBranch
 AND p.AccountNumber = c.AccountNumber 
 AND p.AccountSuffix = c.AccountSuffix
 AND c.BusinessDataDate = @CurrentBusinessDataDate
 AND p.BusinessDataDate = @PreviousBusinessDataDate


回答2:

I think your problem is that in the first query the date isn't part of the match. In your second query the two sets you are joining are filtered on date already, so it mimics what would happen if you matched on date. Say you have the same accountbranch,number,suffix twice but for different dates! That would mean that the full outer join finds a match since you don't have the date in the ON-clause.

Try this out:

use ADStaging;
--your dates go here:
declare @CurrentBusinessDataDate date = '2013-04-21'
, @PreviousBusinessDataDate date = '2013-04-23';
SELECT 
COALESCE(c.AccountBranch, p.AccountBranch)
, COALESCE(c.AccountNumber, p.AccountNumber)
, COALESCE(c.AccountSuffix, p.AccountSuffix)
, c.PrincipleAmount
, p.PrincipleAmount

FROM cb_account_extension_principle_dpd AS c
FULL OUTER JOIN cb_account_extension_principle_dpd AS p
    ON p.AccountBranch = c.AccountBranch
    AND p.AccountNumber = c.AccountNumber
    AND p.AccountSuffix = c.AccountSuffix
    AND p.BusinessDataDate = @PreviousBusinessDataDate
    AND c.BusinessDataDate = @CurrentBusinessDataDate;