Hive QL Except clause

2019-02-12 16:34发布

问题:

How do I do an EXCEPT clause (like SQL) in Hive QL

I have 2 tables, and each table is a column of unique ids.

I want to find the list of ids that are only in table 1 but not in table 2

Table 1

apple
orange
pear

Table 2

apple
orange

In SQL you can do an EXCEPT clause (http://en.wikipedia.org/wiki/Set_operations_%28SQL%29) but you can't do that in Hive QL

回答1:

I don't think there's any built-in way to do this but a LEFT OUTER JOIN should do the trick.

This selects all Ids from table1 that do not exist in table2:

SELECT t1.id FROM table1 t1 LEFT OUTER JOIN table2 t2 ON (t1.id=t2.id) WHERE t2.id IS NULL;


标签: hive