I am trying to retrieve data from 2 table with some conditions. When I just do the inner join with the conditions , I get a huge value (200000 data). But when I group by I get a very less value like (8000 data).
SELECT Tcg.SK_tID, Tcg.SK_ServiceProviderID
INTO #CHDetails
FROM #ClientGroup Tcg
INNER JOIN dbo.Component AS chd ON chd.SK_PID = Tcg.SK_PID
AND chd.SK_ServiceProviderID = Tcg.SK_ServiceProviderID
AND chd.SK_CompID = @CHD
AND chd.ReportDate < @ReportDate
GROUP BY Tcg.SK_PID ,Tcg.SK_ServiceProviderID
Can you please let me know the cause for this. Inner join always takes the common data.
The data in the #ClientGroup table is around 70000 , while data in the dbo.Component is very huge. When I query for common PID and Service provider logically it shoul give me the records equal to or less than #ClientGroup. How is it giving more ?
When I do group by i get 8000. But why should I do group by in a inner join for 2 tables.
The group by is essentially performing a distinct on the result. The reason you have to do this is likely because you have duplicates in both tables.
See this sqlfiddle: http://sqlfiddle.com/#!3/cbdca/2
In it, table1 has 3 rows and table2 has 3 rows. When joined together, they return 9 rows.
if a join is giving you more records than expected that means that your join criterea is not complete. The extreme case for this would be a cartesian join.
check your joining criteria.
Group by combines rows together. Usually it is done to produce aggregate data or produce a list of unique values. In your example, the #ClientGroup table is grouped together and the total number of rows you get returned shrinks. You should be getting less rows than are in #ClientGroup.
It will not reduce the number of rows because The INNER JOIN does not perform DISTINCT instead INNER JOIN finds all matching data from both tables.
Let’s say you have table A and table B. both has column ID.
Table A has total 3 rows in it and all values are unique. values are 1,2,3.
TABLE B has total 300 rows, but only 3 unique values. Values are 1,2,3.
(table B has 100 rows for each unique value)
Now in this example if I do INNER JOIN between Table A and Table B
then every row from A will get join Every matching row in table B.
So
value 1 from Table A will match 100 rows in table B
value 2 from Table A will match 100 rows in table B
value 3 from Table A will match 100 rows in table B
so after doing INNER join I will get total 300 rows.
On the contrary if Table B has completely different values (other than 1,2,3) then INNER JOIN would has produced no result.