There's a school of thought that null values should not be allowed in a relational database. That is, a table's attribute (column) should not allow null values. Coming from a software development background, I really don't understand this. It seems that if null is valid within the context of the attribute, then it should be allowed. This is very common in Java, where object references are often null. Not having extensive database experience, I wonder if I'm missing something here.
According to strict relational algebra, nulls are not needed. However, for any practical project, they are needed.
First, much real-world data is unknown or not applicable, and nulls model that directly. Second, they make views and outer joins much more practical.
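For instance, here is a minimal sketch (the customer and orders tables are hypothetical) of how a left outer join naturally uses NULL to say "there is no matching row":

```sql
-- Customers with no orders still appear in the result;
-- their order columns simply come back as NULL.
SELECT c.customer_id, c.name, o.order_id, o.order_date
FROM customer c
LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id;
```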
I agree with many of the answers above and also believe that NULL can be used, where appropriate, in a normalized schema design - particularly where you may wish to avoid using some kind of "magic number" or default value which, in turn, could be misleading!
Ultimately though, I think usage of null needs to be well thought out (rather than used by default) to avoid some of the assumptions listed in the answers above, particularly where NULL might be assumed to mean 'nothing', 'empty', 'unknown' or 'the value hasn't been entered yet'.
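To illustrate the magic-number point, here is a sketch with a hypothetical person table: a nullable column says "unknown" directly, whereas a sentinel such as -1 quietly corrupts aggregates unless every query remembers to filter it out.

```sql
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      VARCHAR(100) NOT NULL,
    age       INTEGER NULL   -- NULL means "age not known"
);

-- NULL ages are simply excluded from the average.
-- With a magic number like -1 instead, this result would be silently skewed.
SELECT AVG(age) FROM person;
```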
Don't underestimate the complexity you create by making a field NULLable. For example, the following WHERE clause looks like it will match all rows (bits can only be 1 or 0, right?):
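Something along these lines, where sometable and bitfield are hypothetical stand-ins:

```sql
-- Looks exhaustive, but any row where bitfield IS NULL is silently skipped.
SELECT *
FROM sometable
WHERE bitfield IN (1, 0);
```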
But if the bitfield is NULLable, it will miss some. Or take the following query:
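A sketch of the kind of query meant here, again using hypothetical table names:

```sql
SELECT *
FROM sometable
WHERE id NOT IN (SELECT id FROM excludetable);
```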
Now if the excludetable contains a null and a 1, this translates to:
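Effectively something like:

```sql
-- "id <> NULL" is never TRUE, so the whole predicate can never be satisfied.
SELECT *
FROM sometable
WHERE id <> NULL AND id <> 1;
```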
But "id <> NULL" is false for any value of id, so this will never return any rows. This catches even experienced database developers by surpise.
Given that most people can be caught off-guard by NULL, I try to avoid it when I can.
This is a huge can of worms, because NULL can mean so many things: the value is not applicable, the value is not known, the value has not been entered yet, or the value is simply 'empty'.
Some of these can be avoided by normalisation, some of them can be avoided by the presence of a value in that column ("N/A"), some of them can be mitigated by having a separate column to explain the presence of the NULL ("N/K", "N/A" etc).
It's also a can of worms because the SQL syntax needed to find them is different to that of non-null values, it's difficult to join on them, and they are generally not included in index entries.
Because of the former (nulls legitimately represent unknown or inapplicable data), you're going to find cases where a null is unavoidable.
Because of the latter (the extra syntax, awkward joins and indexing behaviour), you should still do your best to minimise the number of them.
Regardless, always use NOT NULL constraints to guard against nulls where a value is required.
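A brief sketch of both points, using a hypothetical employee table: NULL tests need their own IS NULL syntax, and a NOT NULL constraint keeps nulls out of columns that must always have a value.

```sql
CREATE TABLE employee (
    employee_id    INTEGER PRIMARY KEY,
    name           VARCHAR(100) NOT NULL,  -- a value is always required here
    termination_dt DATE NULL               -- legitimately unknown / not applicable
);

-- Wrong: "= NULL" is never TRUE, so this returns no rows.
SELECT * FROM employee WHERE termination_dt = NULL;

-- Right: nulls are found with IS NULL.
SELECT * FROM employee WHERE termination_dt IS NULL;
```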
I would say that nulls should definitely be used. There is no other right way to represent a lack of data. For example, it would be wrong to use an empty string to represent a missing address line, or 0 to represent a missing age, because both an empty string and 0 are data. Null is the best way to represent such a scenario.
The main issue with nulls is that they have special semantics that can produce unexpected results with comparisons, aggregates and joins.
Nothing is ever equal to null, and nothing is ever not equal to, greater than or less than null, so you have to substitute a placeholder value for nulls if you want to do any bulk comparison.
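For example (a sketch comparing two hypothetical address tables), COALESCE can substitute a placeholder so that nullable columns compare the way you probably intend:

```sql
-- Without COALESCE, two NULL phone numbers would not compare as equal,
-- because NULL = NULL evaluates to UNKNOWN rather than TRUE.
SELECT o.customer_id
FROM old_address o
JOIN new_address n ON n.customer_id = o.customer_id
WHERE COALESCE(o.phone, '') <> COALESCE(n.phone, '');  -- genuinely changed numbers only
```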
This is also a problem on composite keys that might be used in a join. Where the natural key includes a nullable column you might want to consider using a synthetic key.
Nulls can drop out of counts, which may not be the semantics you desire.
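A quick illustration, reusing the hypothetical employee table sketched earlier:

```sql
-- COUNT(*) counts every row; COUNT(column) silently skips rows where the column is NULL.
SELECT COUNT(*)              AS all_employees,
       COUNT(termination_dt) AS employees_with_termination_date
FROM employee;
```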
Nulls in a column that you can join against will eliminate rows from an inner join. In general this is probably desired behaviour, but it can lay elephant traps for people doing reporting.
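For instance, assuming the hypothetical employee table also carries a nullable department_id foreign key:

```sql
-- Employees whose department_id IS NULL disappear from this result set.
SELECT e.name, d.department_name
FROM employee e
JOIN department d ON d.department_id = e.department_id;

-- A LEFT JOIN keeps them, with NULL department details.
SELECT e.name, d.department_name
FROM employee e
LEFT JOIN department d ON d.department_id = e.department_id;
```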
There are quite a few other subtleties to nulls. Joe Celko's SQL for Smarties has a whole chapter on the subject; it is a good book and worth reading anyway. Some examples of places where nulls are a good solution are:
- Optional relationships where a joined entity may or may not be present. Null is the only way to represent an optional relationship on a foreign key column.
- Columns where you may wish to use null so that rows drop out of counts.
- Optional numeric (e.g. currency) values that may or may not be present. There is no effective placeholder value for 'not recorded' in number systems (particularly where zero is a legal value), so null is really the only good way to do this.
Some examples of places where you might want to avoid using nulls, because they are likely to cause subtle bugs:
- 'Not Recorded' values on code fields with a FK against a reference table. Use a placeholder value, so you (or some random business analyst down the track) don't inadvertently drop rows out of result sets when querying the database.
- Description fields where nothing has been entered - an empty string ('') works fine for this, and it saves having to treat the nulls as a special case.
- Optional columns on a reporting or data warehouse system. For this situation, make a placeholder row for 'Not Recorded' in the dimension and join against that, as sketched below. This simplifies querying and plays nicely with ad-hoc reporting tools.
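A sketch of that placeholder approach, with hypothetical sales_fact and product_dim tables; because every fact row joins to something, inner joins and grouped reports never silently lose rows:

```sql
-- Reserve a well-known key for 'Not Recorded' in the dimension.
INSERT INTO product_dim (product_key, product_name)
VALUES (-1, 'Not Recorded');

-- Fact rows with no real product point at the placeholder instead of NULL.
SELECT d.product_name, SUM(f.amount) AS total_amount
FROM sales_fact f
JOIN product_dim d ON d.product_key = f.product_key
GROUP BY d.product_name;
```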
Again, Celko's book is a good treatment of the subject.