There's a school of thought that null values should not be allowed in a relational database. That is, a table's attribute (column) should not allow null values. Coming from a software development background, I really don't understand this. It seems that if null is valid within the context of the attribute, then it should be allowed. This is very common in Java where object references are often null. Not having an extensive database experience, I wonder if I'm missing something here.
相关问题
- how to define constructor for Python's new Nam
- Keeping track of variable instances
- What means in Dart static type and why it differs
- Object.create() bug?
- std::vector of objects / pointers / smart pointers
相关文章
- 接口B继承接口A,但是又不添加新的方法。这样有什么意义吗?
- Notice: Undefined property - how do I avoid that m
- NameError: name 'self' is not defined, eve
- Implementation Strategies for Object Orientation
- Check if the Type of an Object is inherited from a
- When to use Interfaces in PHP
- Are default parameters bad practice in OOP?
- How to return new instance of subclass while initi
NULL rocks. If it wasn't necessary in some cases, SQL would not have IS NULL and IS NOT NULL as special-case operators. NULL is the root of the conceptual universal, all else is NOT NULL. Use NULLs freely, whenever it may be possible for a data value to be absent but not missed. Default values can only compensate for NULL if they are absolutely correct all of the time. For example, if i have a single-bit field "IsReady" it may make perfect sense for this field to have a default value of false and NULL not be allowed, but this implicitly asserts that we know that the whatever is not ready, when in fact we may have no such knowledge. Chances are, in a workflow scenario, the person who decides ready-or-not just hasn't had the chance to enter their opinion yet, so a default of false could actually be dangerous, leading them to overlook a decision that appears to have been made but was in fact only defaulted.
as an aside, and in reference to the middle-initial example, my father had no middle name, therefore his middle initial would be NULL - not blank, space, or asterisk - except in the Army where his middle initial was NMI = No Middle Initial. How silly was that?
It depends.
As long as you understand why you are allowing
NULL
s in the database (the choice needs to be made on a per-column basis) AND how you will interpret, ignore or otherwise deal with them, they are fine.For instance, a column like
NUM_CHILDREN
- what do you do if you don't know the answer - it should beNULL
. In my mind, there is no other best option for this column's design (even if you have a flag to determine whether theNUM_CHILDREN
column is valid, you still have to have a value in this column).On the other hand, if you don't allow
NULL
s and have special reserved values for certain cases (instead of flags), like -1 for number of children when it is really unknown, you have to address these in a similar way, in terms of conventions, documentation, etc.So, ultimately, the issues have to be addressed with conventions, documentation and consistency.
The alternative, as apparently espoused by Adam Davis in the above answer, of normalizing the columns out to sparse (or not so sparse, in the case of the
NUM_CHILDREN
example or any example where most of the data has known values) tables, while able to eliminate all NULLs, is non-workable in general practice.In many cases where an attribute is unknown, it makes little sense to join to another table for each and every column which could allow
NULL
s in a simpler design. The overhead of joins, the space requirements for theprimary keys make little sense in the real world.This brings to mind the way duplicate rows can be eliminated by adding a cardinality column, while it theoretically solves the problem of not having a unique key, in practice that is sometimes impossible - for instance, in large scale data. The purists are then quick to suggest a surrogate PK instead, yet the idea that a meaningless surrogate can form part of a tuple (row) in a relation (table) is laughable from the point of view of the relational theory.
It's absolutely fine with null.
My controversial opinion for the day - the default of allowing NULLs in database columns was probably the worst universally accepted design decision in all of RDBMs land. Every vendor does it, and it's wrong. NULLs are fine in certain, specific, well thought out instances, but the idea that you have to explicitly disallow NULLs for every column makes negligent nullability way more common than it should be.
Instead of writing up all the issues of NULL, and tristate vs boolean logic, etc. - I'll offer this pithy advice:
Don't allow NULL in your columns, until you find yourself adding a magic value to represent missing or incomplete data.
Since you're asking this question, you should be very careful in how you approach NULL. There's a lot of nonobvious pitfalls to it. When in doubt, don't use NULL.
There is nothing wrong with using NULL for data fields. You have to be careful when setting keys to null. Primary keys should never be NULL. Foreign keys can be null but you have to be careful not to create orphan records.
If something is "non existent" then you should use NULL instead of an empty string or other kind of flag.