sql double count in join statement

2019-02-27 22:13发布

问题:

I have these two SQL statements and what I need to do is join them into one statement, but when I do it, result is inappropriate. :/

select 
    nieruchomosci.nieruchomoscnr, count(wynajecia.nieruchomoscNr) as wynajecia 
from 
    nieruchomosci, wynajecia
where 
    nieruchomosci.nieruchomoscnr = wynajecia.nieruchomoscNr
GROUP BY 
    nieruchomosci.nieruchomoscnr;

select 
    nieruchomosci.nieruchomoscnr, count(wizyty.nieruchomoscnr) as wizyty 
from 
    nieruchomosci, wizyty
where 
    wizyty.nieruchomoscnr = nieruchomosci.nieruchomoscnr
GROUP BY 
    nieruchomosci.nieruchomoscnr;

This is how I have joined them:

select 
    nieruchomosci.nieruchomoscnr, 
    count(wynajecia.nieruchomoscNr) as wynajecia, 
    count(wizyty.nieruchomoscnr) as wizyty 
from 
    nieruchomosci, wynajecia, wizyty
where 
    nieruchomosci.nieruchomoscnr = wynajecia.nieruchomoscNr
    and wizyty.nieruchomoscnr = nieruchomosci.nieruchomoscNr
GROUP BY 
    nieruchomosci.nieruchomoscnr;

With this, numbers in 'wynajecia' and 'wizyty' are the same, which is wrong. :/

EDIT:

With this code I get:

A14 8   8
B16 6   6
B17 4   4
B18 4   4
B21 4   4
G01 6   6
L94 10  10

Correct output should be this:

A14 2   4
B16 3   2
B17 2   2
B18 2   2
B21 2   2
G01 3   2
L94 2   5

I managed to get currect one by this code:

select nieruchomosci.nieruchomoscnr,
(select count(wynajecia.nieruchomoscNr) from wynajecia where wynajecia.nieruchomoscNr = nieruchomosci.nieruchomoscnr) as wynajecia,
(select count(wizyty.nieruchomoscNr) from wizyty where wizyty.nieruchomoscNr = nieruchomosci.nieruchomoscnr) as wizyty
from nieruchomosci

But I don't this, that this is proper way to handle the problem.

回答1:

Consider the following scenes:

TableA's record's ID:{1, 2, 3, 4, 5},

TableB's record's ID:{1, 2, 3, 5},

TableC's record's ID:{1, 4, 5}

The result of select * from TableA a join TableB b on (a.ID = b.ID) is {1, 2, 3, 5}

The result of select * from TableA a join TableC c on (a.ID = c.ID) is {1, 4, 5}

The result of select * from TableA a join TableB b on (a.ID = b.ID) join TableC c on (a.ID = c.ID) is {1, 5}

So, if you count Or count after Group By, the record is different.

AS your code, You could used left join and sum(case end) AS following Code:

select 
    nieruchomosci.nieruchomoscnr, 
    sum(case when wynajecia.nieruchomoscNr is null then 0 else 1 end) as wynajecia, 
    sum(case when wizyty.nieruchomoscnr is null then 0 else 1 end) as wizyty 
from 
    nieruchomosci
left join 
    wynajecia on (nieruchomosci.nieruchomoscnr = wynajecia.nieruchomoscNr)
left join 
    wizyty on (wizyty.nieruchomoscnr = nieruchomosci.nieruchomoscNr)
GROUP BY 
    nieruchomosci.nieruchomoscnr;

Addition for your question update:

Now you have TableA with records' ID:{1, 2, 3, 4}

Table B is

TableAID Value
1        B1-1
1        B1-2
2        B2-1
2        B2-2
4        B4-1
4        B4-2

Table C is

TableAID Value
1        C1-1
1        C1-2
1        C1-3
3        C3-1
3        C3-2
4        C4-1

The result of three table join

select * from TableA a join TableB b on (a.ID = b.ID) join TableC c on (a.ID = c.ID)

is

TableAID  TableB_Value TableC_Value
1         B1-1         C1-1
1         B1-1         C1-2
1         B1-1         C1-3
1         B1-2         C1-1
1         B1-2         C1-2
1         B1-2         C1-3
4         B4-1         C4-1
4         B4-2         C4-1

So you get a Count(B.Value):{1->6,4->2}, Count(C.Value) is the same. What you want should be Count(B.Value):{1->2, 2->2, 4->2}, and Count(C.Value):{1->3, 3->2, 4->1}

The code you updated is right. But for better performance, you could use this code:

select nieruchomosci.nieruchomoscnr, ifnull(wynajecia.wynajecia_count, 0) as wynajecia_count, ifnull(wizyty.wizyty_count, 0)
from nieruchomosci
left join
(select wynajecia.nieruchomoscNr, count(*) as wynajecia_count from wynajecia group by  wynajecia.nieruchomoscNr) as wynajecia
on (wynajecia.nieruchomoscNr = nieruchomosci.nieruchomoscnr)
left join
(select wizyty.nieruchomoscNr, count(*) as wizyty_count from wizyty group by wizyty.nieruchomoscNr) as wizyty
on (wizyty.nieruchomoscNr = nieruchomosci.nieruchomoscnr)

The "ifnull" function is for mysql. It should be "nvl" for Oracle and "isnull" for MS SQL.



回答2:

The reason for incorrect count values is that if you join 2 child tables with the same parent table there is a possibility of partial cartesian product. In case you do not want to use sub query use the following structure of sql

SELECT
PARENT_COL,
COUNT(DISTINCT CHILD1_PRIMARY_KEY) AS CHILD1_COUNT,
COUNT(DISTINCT CHILD2_PRIMARY_KEY) AS CHILD2_COUNT
FROM
......

Word of caution: In case there are too many child records for the same parent record, it could result in memory issues.



标签: sql count double