Database FK Constraints vs Programmatic FK Constra

Posted 2020-04-20 08:55

Although I am targeting MySQL/PHP, for the sake of my questions, I'd like to just apply this generally to any relational database that is being used in conjunction with a modern programming language. Another assumption would be that the language is leveraging a modern framework, which, on some level would handle foreign key constraints implicitly or have a means to do so explicitly.

My questions:

  • What are the pros and cons of creating FK constraints in the database itself as opposed to managing them at the application level?

  • From a design standpoint, should they ever both be used together or would that cause conflict?

  • If they should not be used together, what is considered the "best practice" in regards to which approach to use?

Note: This is a design theory question. Because of the wide variety of technology that could be used to satisfy an implementation, I'm not really interested in any specifics regarding an implementation.

3 answers
淡お忘
Reply #2 · 2020-04-20 09:44

Just how familiar are you with database design and the foreign key concept in general? An FK is a column (or set of columns) in one table that identifies a row in another table. (I'm pretty sure you already know this.) So an FK constraint is something that exists in the DB, not in the application. Managing FK constraints in the application means hand-coding functionality that is already available in the DB. So why would you want to do all that manual labor? All that extra code also makes DB/application interaction and development much more difficult.

Best practice IMHO is to use the tools for what they were created to do. The DB takes care of FK referential integrity, and the application doesn't need to concern itself with the DB's inner workings. However, if referential integrity is your main concern and you're, for example, using MySQL with the MyISAM engine, which doesn't support FK constraints, then you have to do some manual checking in the application (or maybe with DB triggers, which I am not familiar with). Just keep in mind that when you do all that checking in the application you still have to access the DB, so you use more resources than would be needed if the DB handled the referential integrity checks itself. (The easy solution, of course, would be to start using the InnoDB engine, but I'll stop here before this answer gets too product oriented.)
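To make "the DB takes care of it" concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL/PHP, purely so it runs anywhere. The `parent`/`child` tables and the function name are made up for illustration. Note that SQLite only enforces FKs once the `foreign_keys` pragma is switched on for the connection.

```python
import sqlite3


def orphan_insert_is_rejected() -> bool:
    """Return True if the database itself refuses a child row with no parent."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is on for the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    conn.execute("INSERT INTO parent (id) VALUES (1)")
    # Valid: parent 1 exists.
    conn.execute("INSERT INTO child (id, parent_id) VALUES (10, 1)")
    try:
        # Invalid: there is no parent 999, so the constraint should fire.
        conn.execute("INSERT INTO child (id, parent_id) VALUES (11, 999)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

The application contains no lookup of the parent row at all; the only FK-related code is the error handling around the insert.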

So some of the pros of letting the DB handle the FK constraints would be:

  1. You don't have to think about it.
  2. You don't have to manually code anything extra.
  3. The application uses fewer resources and contains less code, and thus...
  4. ... maintaining and developing both the DB and the application is a lot easier (for example, the application developers don't need to understand database-oriented concepts and functionality so deeply; let the DB experts do the FK etc. thinking...).
Animai°情兽
Reply #3 · 2020-04-20 09:48

What are the pros and cons of creating FK constraints in the database itself as opposed to managing them at the application level?

In a concurrent environment, it is surprisingly difficult to implement referential integrity in application code in a way that is both correct and performs well.

Unless you very carefully use locking, you are open to race conditions, such as:

  • Imagine there is currently one row in the parent table and no corresponding rows in the child.
  • Transaction T1 inserts a row in the child table, but does not yet commit. It can do that since there is a corresponding row in the parent table.
  • Transaction T2 deletes the parent row. It can do that since there are no child rows from its perspective (T1 hasn't committed yet).
  • T1 and T2 commit.
  • At this point, you have a child row without parent (i.e. broken referential integrity).
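The interleaving above can be replayed with a toy model. This is not a real DBMS: `ToyDB` and `Txn` are hypothetical classes invented for this sketch, and the only assumption is that each transaction checks against the data that was committed when it began and applies its buffered writes on commit.

```python
class ToyDB:
    """Committed state only: one parent row, no child rows."""

    def __init__(self):
        self.parents = {1}
        self.children = {}  # child_id -> parent_id


class Txn:
    """A transaction that sees a snapshot taken when it starts."""

    def __init__(self, db):
        self.db = db
        self.parents_seen = set(db.parents)
        self.children_seen = dict(db.children)
        self.pending = []  # writes buffered until commit

    def insert_child(self, child_id, parent_id):
        # Application-level "FK check", done against the snapshot.
        if parent_id not in self.parents_seen:
            raise ValueError("no such parent")
        self.pending.append(("add_child", child_id, parent_id))

    def delete_parent(self, parent_id):
        # Application-level check that no child references the parent.
        if parent_id in self.children_seen.values():
            raise ValueError("parent still has children")
        self.pending.append(("del_parent", parent_id))

    def commit(self):
        for op in self.pending:
            if op[0] == "add_child":
                self.db.children[op[1]] = op[2]
            else:
                self.db.parents.discard(op[1])


def run_race():
    """Replay the T1/T2 interleaving; return parent ids referenced by orphans."""
    db = ToyDB()
    t1, t2 = Txn(db), Txn(db)
    t1.insert_child(100, 1)  # passes: parent 1 is visible to T1
    t2.delete_parent(1)      # passes: T2 cannot see T1's uncommitted child
    t1.commit()
    t2.commit()
    return sorted(p for p in db.children.values() if p not in db.parents)
```

Both checks pass individually, yet `run_race()` returns `[1]`: a child row pointing at a parent that no longer exists.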

To remedy that, you can lock the parent row from both transactions, but that's likely to be less performant compared to the highly optimized FK implemented in the DBMS itself.

On top of that, all your clients have to adhere to the same "locking protocol" (one misbehaving client is enough to corrupt the data). And the complexity rises rapidly if you have several levels of nested FKs or diamond-shaped FKs. Even if you implement referential integrity in triggers, you are only solving the "one misbehaving client" problem; the rest remains.

Another nice thing about database-level FKs is that they usually support referential actions such as ON DELETE CASCADE. And all that is simple and self-documenting, unlike referential integrity buried inside application code.
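As a small illustration of a referential action, the sketch below declares ON DELETE CASCADE in SQLite through Python's sqlite3 module. The `author`/`book` tables are hypothetical; the point is that the cleanup rule lives in the schema, not in application code.

```python
import sqlite3


def books_left_after_author_delete() -> int:
    """Delete a parent row and count the child rows that survive."""
    conn = sqlite3.connect(":memory:")
    # SQLite needs this pragma for FK enforcement, including cascades.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE book ("
        " id INTEGER PRIMARY KEY,"
        " author_id INTEGER NOT NULL"
        "   REFERENCES author(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO author (id) VALUES (1)")
    conn.executemany(
        "INSERT INTO book (id, author_id) VALUES (?, 1)", [(10,), (11,)]
    )
    # One statement; the database removes the dependent book rows itself.
    conn.execute("DELETE FROM author WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
    conn.close()
    return remaining
```

Here `books_left_after_author_delete()` returns 0, and anyone reading the CREATE TABLE statement can see why.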

From a design standpoint, should they ever both be used together or would that cause conflict?

You should always use database-level FKs. You could also use application level "pre-checks" if that benefits your user experience (i.e. you don't want to wait until the actual INSERT/UPDATE/DELETE to warn the user), but you should always code as if the INSERT/UPDATE/DELETE can fail even if your application-level check has passed.
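A rough sketch of that combination, again with sqlite3 and made-up `customer`/`orders` tables: the pre-check exists only to give a friendly message, and the insert is still wrapped in error handling. The `before_insert` parameter is a hypothetical hook that stands in for another client acting between the check and the insert.

```python
import sqlite3


def add_order(conn, order_id, customer_id, before_insert=None) -> str:
    # Pre-check: lets the UI warn early, but guarantees nothing.
    row = conn.execute(
        "SELECT 1 FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return "unknown customer"
    if before_insert is not None:
        before_insert()  # simulated concurrent activity
    try:
        conn.execute(
            "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
            (order_id, customer_id),
        )
        conn.commit()
    except sqlite3.IntegrityError:
        # The database-level FK has the final say.
        conn.rollback()
        return "rejected by database"
    return "ok"


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customer(id))"
    )
    conn.executemany("INSERT INTO customer (id) VALUES (?)", [(1,), (3,)])
    conn.commit()

    def delete_customer_3():
        # Plays the role of a second client removing the parent row.
        conn.execute("DELETE FROM customer WHERE id = 3")
        conn.commit()

    results = [
        add_order(conn, 100, 1),
        add_order(conn, 101, 2),
        add_order(conn, 102, 3, before_insert=delete_customer_3),
    ]
    conn.close()
    return results
```

The third call passes the pre-check and is still refused, which is exactly the case the application code has to be prepared for.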

If they should not be used together, what is considered the "best practice" in regards to which approach to use?

As I stated, always use database-level FKs. Optionally, you may also use application-level FKs "on top" of them.


See also: Sql - Indirect Foreign Key

家丑人穷心不美
Reply #4 · 2020-04-20 09:50

What are the pros and cons of creating FK constraints in the database itself as opposed to managing them at the application level?

Some of the pros of using db-enforced FKs:

  1. Separation of schema from code.

  2. Makes application code smaller.

  3. No chance for a programmer to mess with the FK rules.

  4. Forces other applications that integrate with the db to follow the FK rules.

Some of the cons of having db-enforced FKs:

  1. Not easy to break if you have a special case

  2. If the data is not valid, errors will be thrown. The application should be coded to handle such errors gracefully (especially in batch jobs).

  3. FKs and their referential integrity rules must be defined and coded carefully. You don't want to cascade delete 1000000 rows online.

  4. They cause an implicit check, even if you don't want that check to occur because you know the parent row must exist. In ordinary use this probably has a trivial impact on performance. It becomes an issue when loading huge data volumes in batch loads and in OLAP/Data Warehouse systems. Special load tools are used there, and constraints such as database-enforced FKs are usually disabled during the load.
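For what "disabled during the load" can look like, here is a small sketch using SQLite's `foreign_keys` and `foreign_key_check` pragmas through Python's sqlite3 module. Other engines and dedicated load tools have their own mechanisms; the tables and the deliberately bad row are invented for the example.

```python
import sqlite3


def bulk_load_then_check():
    """Load with FK enforcement off, then report rows that violate the FK."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER NOT NULL REFERENCES parent(id))"
    )
    # Skip the per-row check while loading.
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executemany("INSERT INTO parent (id) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO child (id, parent_id) VALUES (?, ?)",
        [(10, 1), (11, 2), (12, 99)],  # row 12 points at a missing parent
    )
    conn.commit()
    # Re-enable enforcement for later writes, then audit what was loaded.
    conn.execute("PRAGMA foreign_keys = ON")
    violations = conn.execute("PRAGMA foreign_key_check").fetchall()
    conn.close()
    # Each violation row is (table, rowid, referenced table, fk index).
    return [(table, rowid) for table, rowid, _parent, _fkid in violations]
```

Turning enforcement back on does not validate existing rows, so the explicit check after the load is what catches the bad row (`child`, rowid 12).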

From a design standpoint, should they ever both be used together or would that cause conflict?

You could use them together if you have a reason to. As I mentioned before, you may have special cases in your data that you can't define FKs for. Also, there are certain cases, such as many-to-many self-referencing relationships between tables, that cannot be handled by FKs (for some db engines at least).
