Aggregate functions on multiple tables not giving

2019-08-19 14:07发布

I have following query which works perfectly when join is done on two tables(customers and orders). All calculations are right.

   SELECT
         Customers.EmailAddress
         ,COUNT(Orders.OrderID)   AS 'overall NumOrders'
         ,SUM(Orders.PaymentAmount)  AS 'overall TotalOrdered'
           ,COUNT(case when Orders.OrderDate >= '20170101' then Orders.OrderID  end) AS '2017 NumOrders'
        ,SUM(  case when Orders.OrderDate >= '20170101' then Orders.PaymentAmount end) AS '2017 TotalOrdered'
       ,COUNT(case when Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' THEN Orders.OrderID   end) AS '2015 NumOrders'
        ,SUM(case when Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' then Orders.PaymentAmount end) AS '2015 TotalOrdered'
         FROM Customers
     JOIN Orders ON Customers.Customerid = Orders.Customerid
    WHERE
    Orders.OrderStatus NOT IN ('Cancelled','Payment Declined')
    AND Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND getdate()
    GROUP BY 
          Customers.EmailAddress

correct Results:

Emailaddress|overallnumorders|overalltotalordered|2017numorder|2017totalordered| 2015numorder|2015totalordered
xyz@gmail.com 1                    23.99            0             0                      1               23.99

But when i add 3 tables to the above query (customers,orders,orderdetails) I am getting doubled up values

SELECT
     Customers.EmailAddress
     ,COUNT(Orders.OrderID)   AS 'overall NumOrders'
     ,SUM(Orders.PaymentAmount)  AS 'overall TotalOrdered'
SUM((OrderDetails.Vendor_Price) * (OrderDetails.Quantity) ) AS TotalCost,
       ,COUNT(case when Orders.OrderDate >= '20170101' then Orders.OrderID  end) AS '2017 NumOrders'
    ,SUM(  case when Orders.OrderDate >= '20170101' then Orders.PaymentAmount end) AS '2017 TotalOrdered'
   ,COUNT(case when Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' THEN Orders.OrderID   end) AS '2015 NumOrders'
    ,SUM(case when Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' then Orders.PaymentAmount end) AS '2015 TotalOrdered'
     FROM Customers
 JOIN Orders ON Customers.Customerid = Orders.Customerid
JOIN Orderdetails ON Orders.Orderid=Orderdetails.Orderid
WHERE
Orders.OrderStatus NOT IN ('Cancelled','Payment Declined')
AND Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND getdate()
GROUP BY 
      Customers.EmailAddress

incorrect Results:

    Emailaddress|overallnumorders|overalltotalordered|totalcost|2017numorder|2017totalordered| 2015numorder|2015totalordered
 xyz@gmail.com          2           47.98             11.99          0             0                      2               47.98

why doing the join on third table is changing calculations? where I want correct results as :

Emailaddress|overallnumorders|overalltotalordered|totalcost|2017numorder|2017totalordered| 2015numorder|2015totalordered
    xyz@gmail.com 1                    23.99      11.99      0             0                      1               23.99

1条回答
够拽才男人
2楼-- · 2019-08-19 14:39

When you add another table you can affect the number of rows and when that happens aggregations will be affected too. To avoid this aggregate the detail table so that there can only be one row per order, then the other aggregations will remain consistent.

SELECT
      Customers.EmailAddress
    , COUNT(Orders.OrderID)                                                                                            AS 'overall NumOrders'
    , SUM(Orders.PaymentAmount)                                                                                        AS 'overall TotalOrdered'
    , SUM(od.totalcost) AS totalcost
    , COUNT(CASE WHEN Orders.OrderDate >= '20170101' THEN Orders.OrderID END)                                          AS '2017 NumOrders'
    , SUM(CASE WHEN Orders.OrderDate >= '20170101' THEN Orders.PaymentAmount END)                                      AS '2017 TotalOrdered'
    , COUNT(CASE WHEN Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' THEN Orders.OrderID END)     AS '2015 NumOrders'
    , SUM(CASE WHEN Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND '12/31/2015 23:59' THEN Orders.PaymentAmount END) AS '2015 TotalOrdered'
FROM Customers
JOIN Orders ON Customers.Customerid = Orders.Customerid
JOIN (
      SELECT
            Orderid
          , SUM((Vendor_Price) * (Quantity)) AS totalcost
      FROM OrderDetails
      GROUP BY
            Orderid
) od ON Orders.Orderid = od.Orderid
WHERE Orders.OrderStatus NOT IN ('Cancelled', 'Payment Declined')
AND Orders.OrderDate BETWEEN ' 01/01/2015 00:00' AND GETDATE()
GROUP BY
      Customers.EmailAddress

EDIT

Please don't use "23:59" as an end point for a date range, that isn't accurate and can lead to incorrect results. There is a vry simple and more accurate alternative which just requires you to stop using "between". Additionally '12/31/2015 23:59' is NOT a safe way to specify a date/time value. Use '20160101' which IS the safest ate literal format in SQL Server YYYYMMDD .

    , COUNT(CASE WHEN Orders.OrderDate >= '20150101' AND Orders.OrderDate < '20160101' THEN Orders.OrderID END)     AS '2015 NumOrders'
    , SUM(CASE WHEN Orders.OrderDate >='20150101' AND Orders.OrderDate < '20160101' THEN Orders.PaymentAmount END) AS '2015 TotalOrdered'
查看更多
登录 后发表回答