I know that it does consider ' ' as NULL
, but that doesn't do much to tell me why this is the case. As I understand the SQL specifications, ' ' is not the same as NULL
-- one is a valid datum, and the other is indicating the absence of that same information.
Feel free to speculate, but please indicate if that's the case. If there's anyone from Oracle who can comment on it, that'd be fantastic!
Empty string is the same as NULL simply because its the "lesser evil" when compared to the situation when the two (empty string and null) are not the same.
In languages where NULL and empty String are not the same, one has to always check both conditions.
I believe the answer is that Oracle is very, very old.
Back in the olden days before there was a SQL standard, Oracle made the design decision that empty strings in
VARCHAR
/VARCHAR2
columns wereNULL
and that there was only one sense of NULL (there are relational theorists that would differentiate between data that has never been prompted for, data where the answer exists but is not known by the user, data where there is no answer, etc. all of which constitute some sense ofNULL
).By the time that the SQL standard came around and agreed that
NULL
and the empty string were distinct entities, there were already Oracle users that had code that assumed the two were equivalent. So Oracle was basically left with the options of breaking existing code, violating the SQL standard, or introducing some sort of initialization parameter that would change the functionality of potentially large number of queries. Violating the SQL standard (IMHO) was the least disruptive of these three options.Oracle has left open the possibility that the
VARCHAR
data type would change in a future release to adhere to the SQL standard (which is why everyone usesVARCHAR2
in Oracle since that data type's behavior is guaranteed to remain the same going forward).Tom Kyte VP of Oracle:
Oracle documentation alerts developers to this problem, going back at least as far as version 7.
Oracle chose to represent NULLS by the "impossible value" technique. For example, a NULL in a numeric location will be stored as "minus zero", an impossible value. Any minus zeroes that result from computations will be converted to positive zero before being stored.
Oracle also chose, erroneously, to consider the VARCHAR string of length zero (the empty string) to be an impossible value, and a suitable choice for representing NULL. It turns out that the empty string is far from an impossible value. It's even the identity under the operation of string concatenation!
Oracle documentation warns database designers and developers that some future version of Oracle might break this association between the empty string and NULL, and break any code that depends on that association.
There are techniques to flag NULLS other than impossible values, but Oracle didn't use them.
(I'm using the word "location" above to mean the intersection of a row and a column.)
First of all, null and null string were not always treated as the same by Oracle. A null string is, by definition, a string containing no characters. This is not at all the same as a null. NULL is, by definition, the absence of data.
Five or six years or so ago, null string was treated differently from null by Oracle. While, like null, null string was equal to everything and different from everything (which I think is fine for null, but totally WRONG for null string), at least length(null string) would return 0, as it should since null string is a string of zero length.
Currently in Oracle, length(null) returns null which I guess is O.K., but length(null string) also returns null which is totally WRONG.
I do not understand why they decided to start treating these 2 distinct "values" the same. They mean different things and the programmer should have the capability of acting on each in different ways. The fact that they have changed their methodology tells me that they really don't have a clue as to how these values should be treated.
Example from book