Is there any difference between these two performance-wise?
-- eliminate duplicates using UNION
SELECT col1,col2,col3 FROM Table1
UNION SELECT col1,col2,col3 FROM Table2
UNION SELECT col1,col2,col3 FROM Table3
UNION SELECT col1,col2,col3 FROM Table4
UNION SELECT col1,col2,col3 FROM Table5
UNION SELECT col1,col2,col3 FROM Table6
UNION SELECT col1,col2,col3 FROM Table7
UNION SELECT col1,col2,col3 FROM Table8
-- eliminate duplicates using DISTINCT
SELECT DISTINCT * FROM
(
SELECT col1,col2,col3 FROM Table1
UNION ALL SELECT col1,col2,col3 FROM Table2
UNION ALL SELECT col1,col2,col3 FROM Table3
UNION ALL SELECT col1,col2,col3 FROM Table4
UNION ALL SELECT col1,col2,col3 FROM Table5
UNION ALL SELECT col1,col2,col3 FROM Table6
UNION ALL SELECT col1,col2,col3 FROM Table7
UNION ALL SELECT col1,col2,col3 FROM Table8
) x
The difference between UNION and UNION ALL is that UNION ALL does not eliminate duplicate rows; it simply returns every row from every query and combines them into one result set.
A UNION effectively performs a SELECT DISTINCT on the combined result set.
So if you SELECT DISTINCT from a UNION ALL result set, the output is the same as the UNION result set.
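One way to check that equivalence yourself is to compare the two forms with EXCEPT and a row count. This is only a sketch: the table variables @a and @b and their sample rows are made up for illustration and are not from the question.

```sql
-- Hypothetical sample data (names and values are illustrative only).
DECLARE @a TABLE (col1 int, col2 varchar(10));
DECLARE @b TABLE (col1 int, col2 varchar(10));
INSERT INTO @a VALUES (1, 'x'), (1, 'x'), (2, 'y');
INSERT INTO @b VALUES (2, 'y'), (3, 'z');

-- Rows returned by UNION but not by DISTINCT over UNION ALL.
-- An empty result means no row is missing from the second form.
SELECT col1, col2
FROM (SELECT col1, col2 FROM @a UNION SELECT col1, col2 FROM @b) u
EXCEPT
SELECT col1, col2
FROM (SELECT DISTINCT * FROM (SELECT col1, col2 FROM @a
                              UNION ALL
                              SELECT col1, col2 FROM @b) x) d;

-- EXCEPT compares distinct rows only, so also compare the row counts.
-- For this sample data both counts are 3.
SELECT
    (SELECT COUNT(*)
     FROM (SELECT col1, col2 FROM @a UNION SELECT col1, col2 FROM @b) u) AS union_rows,
    (SELECT COUNT(*)
     FROM (SELECT DISTINCT * FROM (SELECT col1, col2 FROM @a
                                   UNION ALL
                                   SELECT col1, col2 FROM @b) x) d) AS distinct_union_all_rows;
```

Swapping the two sides of the EXCEPT checks the other direction.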
Edit:
Performance on CPU cost:
Let me explain with an example.
I have two queries: the first uses UNION ALL with DISTINCT, the second uses UNION.
SET STATISTICS TIME ON
GO
select distinct * from (select * from dbo.user_LogTime
union all
select * from dbo.user_LogTime) X
GO
SET STATISTICS TIME OFF
SET STATISTICS TIME ON
GO
select * from dbo.user_LogTime
union
select * from dbo.user_LogTime
GO
SET STATISTICS TIME OFF
I ran both in the same query window in SSMS.
Let's look at the execution plan in SSMS:
According to the plan, the query with UNION ALL and DISTINCT has a higher estimated CPU cost than the query with UNION.
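To compare the two plans without the graphical view, one option is SET SHOWPLAN_TEXT, which returns the estimated plan as text instead of executing the statements. A minimal sketch against the same dbo.user_LogTime table used above (SET SHOWPLAN_TEXT has to be the only statement in its batch, hence the GO separators):

```sql
SET SHOWPLAN_TEXT ON;
GO
-- Plan for the UNION form
select * from dbo.user_LogTime
union
select * from dbo.user_LogTime;
GO
-- Plan for the UNION ALL + DISTINCT form
select distinct * from (select * from dbo.user_LogTime
union all
select * from dbo.user_LogTime) X;
GO
SET SHOWPLAN_TEXT OFF;
GO
```

The operator that removes the duplicates may or may not differ between the two plans; that depends on the optimizer, the data, and the SQL Server version, so it is worth checking on your own tables.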
Performance on Time:
UNION ALL with DISTINCT:
(1172 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 39 ms.
UNION:
(1172 row(s) affected)
SQL Server Execution Times:
CPU time = 10 ms, elapsed time = 25 ms.
So in this test UNION came out ahead of UNION ALL with DISTINCT on elapsed time (25 ms versus 39 ms), although its reported CPU time was slightly higher (10 ms versus 0 ms).
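With only 1,172 rows the timings are a few milliseconds apart, so it may be worth repeating the comparison on a larger set before drawing conclusions. A rough sketch, assuming a throwaway temp table #nums (a hypothetical name, not part of the test above) filled from sys.all_objects:

```sql
-- Build a hypothetical test table with many duplicate values.
-- Cross-joining sys.all_objects with itself is just a convenient row source.
SELECT TOP (1000000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) % 50000 AS n
INTO #nums
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;

SET STATISTICS TIME ON;

-- Form 1: UNION
SELECT n FROM #nums
UNION
SELECT n FROM #nums;

-- Form 2: UNION ALL + DISTINCT
SELECT DISTINCT * FROM (SELECT n FROM #nums
                        UNION ALL
                        SELECT n FROM #nums) x;

SET STATISTICS TIME OFF;

DROP TABLE #nums;
```

Running each form several times and comparing the CPU and elapsed times in the Messages tab gives a steadier picture than a single run.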
Another on-point example that illustrates the four possible cases:
/* with each case we should expect a return set:
(1) DISTINCT UNION {1,2,3,4,5} - is redundant with case (2)
(2) UNION {1,2,3,4,5} - more efficient?
(3) DISTINCT UNION ALL {1,2,2,3,3,4,4,5}
(4) UNION ALL {1,1,2,2,2,3,3,4,4,5}
*/
declare @t1 table (c1 varchar(15));
declare @t2 table (c2 varchar(15));
insert into @t1 values ('1'),('1'),('2'),('3'),('4');
insert into @t2 values ('2'),('2'),('3'),('4'),('5');
select DISTINCT * from @t1 --case (1)
UNION
select DISTINCT * from @t2 order by c1;
select * from @t1 --case (2)
UNION
select * from @t2 order by c1;
select DISTINCT * from @t1 --case (3)
UNION ALL
select DISTINCT * from @t2 order by c1;
select * from @t1 --case (4)
UNION ALL
select * from @t2 order by c1;