I remember hearing Joel Spolsky mention in podcast 014 that he'd barely ever used a foreign key (if I remember correctly). However, to me they seem pretty vital to avoid duplication and subsequent data integrity problems throughout your database.
Do people have some solid reasons as to why (to avoid a discussion in lines with Stack Overflow principles)?
I also think that foreign keys are a necessity in most databases. The only drawback (besides the performance hit that comes with having enforced consistence) is that having a foreign key allows people to write code that assumes there is a functional foreign key. That should never be allowed.
For example, I've seen people write code that inserts into the referenced table and then attempts inserts into the referencing table without verifying the first insert was successful. If the foreign key is removed at a later time, that results in an inconsistent database.
You also don't have the option of assuming a specific behavior on update or delete. You still need to write your code to do what you want regardless of whether there is a foreign key present. If you assume deletes are cascaded when they are not, your deletes will fail. If you assume updates to the referenced columns are propogated to the referencing rows when they are not, your updates will fail. For the purposes of writing code, you might as well not have those features.
If those features are turned on, then your code will emulate them anyway and you'll lose a little performance.
So, the summary.... Foreign keys are essential if you need a consistent database. Foreign keys should never be assumed to be present or functional in code that you write.
Bigger question is: would you drive with a blindfold on? That’s how it is if you develop a system without referential constraints. Keep in mind, that business requirements change, application design changes, respective logical assumptions in the code changes, logic itself can be refactored, and so on. In general, constraints in databases are put in place under contemporary logical assumptions, seemingly correct for particular set of logical assertions and assumptions.
Through the lifecycle of an application, referential and data checks constraints police data collection via the application, especially when new requirements drive logical application changes.
To the subject of this listing - a foreign key does not by itself "improve performance", nor does it "degrade performance" significantly from a standpoint of real-time transaction processing system. However, there is an aggregated cost for constraint checking in HIGH volume "batch" system. So, here is the difference, real-time vs. batch transaction process; batch processing - where aggreated cost, incured by constraint checks, of a sequentially processed batch poses a performance hit.
In a well designed system, data consistency checks would be done "before" processing a batch through (nevertheless, there is a cost associated here also); therefore, foreign key constraint checks are not required during load time. In fact all constraints, including foreign key, should be temporarily disabled till the batch is processed.
QUERY PERFORMANCE - if tables are joined on foreign keys, be cognizant of the fact that foreign key columns are NOT INDEXED (though the respective primary key is indexed by definition). By indexing a foreign key, for that matter, by indexing any key, and joining tables on indexed helps with better performances, not by joining on non-indexed key with foreign key constraint on it.
Changing subjects, if a database is just supporting website display/rendering content/etc and recording clicks, then a database with full constraints on all tables is over kill for such purposes. Think about it. Most websites don’t even use a database for such. For similar requirements, where data is just being recorded and not referenced per say, use an in-memory database, which does not have constraints. This doesn’t mean that there is no data model, yes logical model, but no physical data model.
How about maintainability and constancy across application life cycles? Most data has a longer lifespan than the applications that make use of it. Relationships and data integrity are much too important to leave to the hope that the next dev team gets it right in the app code. If you haven't worked on a db with dirty data that doesn't respect the natural relationships, you will. The importance of data integrity will then become very clear.
If you are absolutey sure, that the one underlying database system will not change in the future, I would use foreign keys to ensure data integrity.
But here is another very good real-life reason not to use foreign keys at all:
You are developing a product, which should support different database systems.
If you are working with the Entity Framework, which is able to connect to many different database systems, you may also want to support "open-source-free-of-charge" serverless databases. Not all of these databases may support your foreign key rules (updating, deleting rows...).
This can lead to different problems:
1.) You may run into errors, when the database structure is created or updated. Maybe there will only be silent errors, because your foreign keys are just ignored by the database system.
2.) If you rely on foreign keys, you will propably make less or even no data integrity checks in your business logic. Now, if the new database system does not support these foreign key rules or just behaves in a different way, you have to rewrite your business logic.
You may ask: Who needs different database systems? Well, not everybody can afford or wants a full blown SQL-Server on his machine. This is software, which needs to be maintained. Others already have invested time and money in some other DB system. Serverless database are great for small customers on only one machine.
Nobody knows, how all of these DB systems behave, but your business logic, with integrity checks, always stays the same.
They can make deleting records more cumbersome - you can't delete the "master" record where there are records in other tables where foreign keys would violate that constraint. You can use triggers to have cascading deletes.
If you chose your primary key unwisely, then changing that value becomes even more complex. For example, if I have the PK of my "customers" table as the person's name, and make that key a FK in the "orders" table", if the customer wants to change his name, then it is a royal pain... but that is just shoddy database design.
I believe the advantages in using fireign keys outweighs any supposed disadvantages.
The argument I have heard is that the front-end should have these business rules. Foreign keys "add unnecessary overhead" when you shouldn't be allowing any insertions that break your constraints in the first place. Do I agree with this? No, but that is what I have always heard.
EDIT: My guess is he was referring to foreign key constraints, not foreign keys as a concept.