For reasons beyond my control, I need to join two tables and I need null values to match. The best option I could think of was to spit out a UUID and use that as my comparison value but it seems ugly
SELECT * FROM T1 JOIN T2 ON nvl(T1.SOMECOL,'f44087d5935dccbda23f71f3e9beb491') =
nvl(T2.SOMECOL,'f44087d5935dccbda23f71f3e9beb491')
How can I do better? This is on Oracle if it matters, and the context is an application in which a batch of user-uploaded data has to be compared to a batch of existing data to see if any rows match. In retrospect we should have prevented any of the join columns in either data set from containing nulls, but we didn't and now we have to live with it.
Edit: To be clear, I'm not only concerned with nulls. If the columns are not null I want them to match on their actual values.
I believe you could still could use nvl() for join:
But you will need to add function based indexes on columns col1
Indexes should improve the speed of the join on NVL(..) significantly.
For this sort of task Oracle internally uses an undocumented function sys_op_map_nonnull(), where your query would become:
Undocumented, so be careful if you go this route.
Do you really want to be able to join the tables if a value is null? Can't you just exclude the possible null values in the join predicate? I find it hard to grok that rows in two tables can be related by a null value. If you have 100 nulls in table1.col_a and 100 nulls in table2.col_b, you're going to have 10000 rows returned just for the rows with null. It sounds incorrect.
However, you did say you need it. Can I suggest coalescing the null column into a smaller string as character comparisons are relatively expensive. Even better, coalesce the nulls into an integer if the data in the columns is going to be text. Then you have very quick 'comparisons' and you're unlikely to collide with existing data.
Isn't it the same as checking for presence of nulls in both columns?
or
You can also use CASE to replace the null value in Subqueries, then JOIN the results:
Maybe this would work, but I've never actually tried it: