Are nulls in a relational database okay?

2019-01-04 18:55发布

There's a school of thought that null values should not be allowed in a relational database. That is, a table's attribute (column) should not allow null values. Coming from a software development background, I really don't understand this. It seems that if null is valid within the context of the attribute, then it should be allowed. This is very common in Java where object references are often null. Not having an extensive database experience, I wonder if I'm missing something here.

30条回答
别忘想泡老子
2楼-- · 2019-01-04 18:56

NULL rocks. If it wasn't necessary in some cases, SQL would not have IS NULL and IS NOT NULL as special-case operators. NULL is the root of the conceptual universal, all else is NOT NULL. Use NULLs freely, whenever it may be possible for a data value to be absent but not missed. Default values can only compensate for NULL if they are absolutely correct all of the time. For example, if i have a single-bit field "IsReady" it may make perfect sense for this field to have a default value of false and NULL not be allowed, but this implicitly asserts that we know that the whatever is not ready, when in fact we may have no such knowledge. Chances are, in a workflow scenario, the person who decides ready-or-not just hasn't had the chance to enter their opinion yet, so a default of false could actually be dangerous, leading them to overlook a decision that appears to have been made but was in fact only defaulted.

as an aside, and in reference to the middle-initial example, my father had no middle name, therefore his middle initial would be NULL - not blank, space, or asterisk - except in the Army where his middle initial was NMI = No Middle Initial. How silly was that?

查看更多
够拽才男人
3楼-- · 2019-01-04 18:57

It depends.

As long as you understand why you are allowing NULLs in the database (the choice needs to be made on a per-column basis) AND how you will interpret, ignore or otherwise deal with them, they are fine.

For instance, a column like NUM_CHILDREN - what do you do if you don't know the answer - it should be NULL. In my mind, there is no other best option for this column's design (even if you have a flag to determine whether the NUM_CHILDREN column is valid, you still have to have a value in this column).

On the other hand, if you don't allow NULLs and have special reserved values for certain cases (instead of flags), like -1 for number of children when it is really unknown, you have to address these in a similar way, in terms of conventions, documentation, etc.

So, ultimately, the issues have to be addressed with conventions, documentation and consistency.

The alternative, as apparently espoused by Adam Davis in the above answer, of normalizing the columns out to sparse (or not so sparse, in the case of the NUM_CHILDREN example or any example where most of the data has known values) tables, while able to eliminate all NULLs, is non-workable in general practice.

In many cases where an attribute is unknown, it makes little sense to join to another table for each and every column which could allow NULLs in a simpler design. The overhead of joins, the space requirements for theprimary keys make little sense in the real world.

This brings to mind the way duplicate rows can be eliminated by adding a cardinality column, while it theoretically solves the problem of not having a unique key, in practice that is sometimes impossible - for instance, in large scale data. The purists are then quick to suggest a surrogate PK instead, yet the idea that a meaningless surrogate can form part of a tuple (row) in a relation (table) is laughable from the point of view of the relational theory.

查看更多
可以哭但决不认输i
4楼-- · 2019-01-04 18:57

It's absolutely fine with null.

查看更多
可以哭但决不认输i
5楼-- · 2019-01-04 18:58

My controversial opinion for the day - the default of allowing NULLs in database columns was probably the worst universally accepted design decision in all of RDBMs land. Every vendor does it, and it's wrong. NULLs are fine in certain, specific, well thought out instances, but the idea that you have to explicitly disallow NULLs for every column makes negligent nullability way more common than it should be.

查看更多
▲ chillily
6楼-- · 2019-01-04 18:59

Instead of writing up all the issues of NULL, and tristate vs boolean logic, etc. - I'll offer this pithy advice:

  1. Don't allow NULL in your columns, until you find yourself adding a magic value to represent missing or incomplete data.

  2. Since you're asking this question, you should be very careful in how you approach NULL. There's a lot of nonobvious pitfalls to it. When in doubt, don't use NULL.

查看更多
一夜七次
7楼-- · 2019-01-04 19:00

There is nothing wrong with using NULL for data fields. You have to be careful when setting keys to null. Primary keys should never be NULL. Foreign keys can be null but you have to be careful not to create orphan records.

If something is "non existent" then you should use NULL instead of an empty string or other kind of flag.

查看更多
登录 后发表回答