sql query to return differences between two tables

2019-01-04 05:09发布

I am trying to compare two tables, SQL Server, to verify some data. I want to return all the rows from both tables where data is either in one or the other. In essence, I want to show all the discrepancies. I need to check three pieces of data in doing so, FirstName, LastName and Product.

I'm fairly new to SQL and it seems like a lot of the solutions I'm finding are over complicating things. I don't have to worry about NULLs.

I started by trying something like this:

SELECT DISTINCT [First Name], [Last Name], [Product Name] FROM [Temp Test Data]
WHERE ([First Name] NOT IN (SELECT [First Name] 
FROM [Real Data]))

I'm having trouble taking this further though.

Thanks!

EDIT:

Based on the answer by @treaschf I have been trying to use a variation of the following query:

SELECT td.[First Name], td.[Last Name], td.[Product Name]
FROM [Temp Test Data] td FULL OUTER JOIN [Data] AS d 
ON td.[First Name] = d.[First Name] AND td.[Last Name] = d.[Last Name] 
WHERE (d.[First Name] = NULL) AND (d.[Last Name] = NULL)

But I keep getting 0 results back, when I know that there is at least 1 row in td that is not in d.

EDIT:

Ok, I think I figured it out. At least in my few minutes of testing it seems to work good enough.

SELECT [First Name], [Last Name]
FROM [Temp Test Data] AS td
WHERE (NOT EXISTS
        (SELECT [First Name], [Last Name]
         FROM [Data] AS d
         WHERE ([First Name] = td.[First Name]) OR ([Last Name] = td.[Last Name])))

This is basically going to tell me what is in my test data that is not in my real data. Which is completely fine for what I need to do.

10条回答
我只想做你的唯一
2楼-- · 2019-01-04 05:41

This will do the trick, similar with Tiago's solution, return "source" table as well.

select [First name], [Last name], max(_tabloc) as _tabloc
from (
  select [First Name], [Last name], 't1' as _tabloc from table1
  union all
  select [First name], [Last name], 't2' as _tabloc from table2
) v
group by [Fist Name], [Last name]
having count(1)=1

result will contain differences between tables, in column _tabloc you will have table reference.

查看更多
何必那么认真
3楼-- · 2019-01-04 05:43

IF you have tables A and B, both with colum C, here are the records, which are present in table A but not in B:

SELECT A.*
FROM A
    LEFT JOIN B ON (A.C = B.C)
WHERE B.C IS NULL

To get all the differences with a single query, a full join must be used, like this:

SELECT A.*, B.*
FROM A
    FULL JOIN B ON (A.C = B.C)
WHERE A.C IS NULL OR B.C IS NULL

What you need to know in this case is, that when a record can be found in A, but not in B, than the columns which come from B will be NULL, and similarly for those, which are present in B and not in A, the columns from A will be null.

查看更多
狗以群分
4楼-- · 2019-01-04 05:44

Simple variation on @erikkallen answer that shows which table the row is present in:

(   SELECT 'table1' as source, * FROM table1
    EXCEPT
    SELECT * FROM table2)  
UNION ALL
(   SELECT 'table2' as source, * FROM table2
    EXCEPT
    SELECT * FROM table1) 

If you get an error

All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.

then it may help to add

(   SELECT 'table1' as source, * FROM table1
    EXCEPT
    SELECT 'table1' as source, * FROM table2)  
UNION ALL
(   SELECT 'table2' as source, * FROM table2
    EXCEPT
    SELECT 'table2' as source, * FROM table1) 
查看更多
我欲成王,谁敢阻挡
5楼-- · 2019-01-04 05:50

There is a performance issue related with the left join as well as full join with large data.

In my opinion this is the best solution:

select [First Name], count(1) e from (select * from [Temp Test Data] union all select * from [Temp Test Data 2]) a group by [First Name] having e = 1
查看更多
【Aperson】
6楼-- · 2019-01-04 05:57

If you want to get which column values are different, you could use Entity-Attribute-Value model:

declare @Data1 xml, @Data2 xml

select @Data1 = 
(
    select * 
    from (select * from Test1 except select * from Test2) as a
    for xml raw('Data')
)

select @Data2 = 
(
    select * 
    from (select * from Test2 except select * from Test1) as a
    for xml raw('Data')
)

;with CTE1 as (
    select
        T.C.value('../@ID', 'bigint') as ID,
        T.C.value('local-name(.)', 'nvarchar(128)') as Name,
        T.C.value('.', 'nvarchar(max)') as Value
    from @Data1.nodes('Data/@*') as T(C)    
), CTE2 as (
    select
        T.C.value('../@ID', 'bigint') as ID,
        T.C.value('local-name(.)', 'nvarchar(128)') as Name,
        T.C.value('.', 'nvarchar(max)') as Value
    from @Data2.nodes('Data/@*') as T(C)     
)
select
    isnull(C1.ID, C2.ID) as ID, isnull(C1.Name, C2.Name) as Name, C1.Value as Value1, C2.Value as Value2
from CTE1 as C1
    full outer join CTE2 as C2 on C2.ID = C1.ID and C2.Name = C1.Name
where
not
(
    C1.Value is null and C2.Value is null or
    C1.Value is not null and C2.Value is not null and C1.Value = C2.Value
)

SQL FIDDLE EXAMPLE

查看更多
狗以群分
7楼-- · 2019-01-04 05:58
(   SELECT * FROM table1
    EXCEPT
    SELECT * FROM table2)  
UNION ALL
(   SELECT * FROM table2
    EXCEPT
    SELECT * FROM table1) 
查看更多
登录 后发表回答