How do I do an EXCEPT
clause (like SQL) in Hive QL
I have 2 tables, and each table is a column of unique ids.
I want to find the list of ids that are only in table 1 but not in table 2
Table 1
apple
orange
pear
Table 2
apple
orange
In SQL you can do an EXCEPT clause (http://en.wikipedia.org/wiki/Set_operations_%28SQL%29) but you can't do that in Hive QL
I don't think there's any built-in way to do this but a
LEFT OUTER JOIN
should do the trick.This selects all Ids from
table1
that do not exist intable2
: