I am quickly learning the ins and outs of database design (something that, as of a week ago, was new to me), but I am running across some questions that don't seem immediately obvious, so I was hoping to get some clarification.
The question I have right now is about foreign keys. As part of my design, I have a Company table. Originally, I had included address information directly within the table, but, as I was hoping to achieve 3NF, I broke out the address information into its own table, Address. In order to maintain data integrity, I created a column in Company called "addressId" as an INT and the Address table has a corresponding addressId as its primary key.
What I'm a little bit confused about (or what I want to make sure I'm doing correctly) is determining which table should be the master (referenced) table and which should be the child (referencing) table. When I originally set this up, I made the Address table the master and the Company the child. However, I now believe this is wrong due to the fact that there should be only one address per Company and, if a Company row is deleted, I would want the corresponding Address to be removed as well (CASCADE deletion).
I may be approaching this completely wrong, so I would appreciate any good rules of thumb on how to best think about the relationship between tables when using foreign keys. Thanks!
You are not doing this correctly. You should have a CompanyId column in the Address table, not an addressId column in the Company table. This is because the relationship is really one-to-many: one company, more than one possible address (companies often have multiple addresses). That makes Company the parent table.
If a company is to have one, and only one, address, I would either leave the address information in the Company table or have a CompanyId column in the Address table, although a separate table doesn't seem to offer much utility in that case. If the data is truly related to the Company and not used elsewhere, it is still 3NF to have the data there.
If you wanted to have, say, a "billing address" and a "shipping address", it would make a lot more sense to have a separate Address table with an AddressId identity column and a CompanyId column that references the Company table.
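To make that layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. The AddressType, Street, and City columns and the sample rows are my own assumptions for illustration; only Company, Address, AddressId, and CompanyId come from the discussion above. It also shows the cascade delete the question asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Company (
    CompanyId INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL
);
CREATE TABLE Address (
    AddressId   INTEGER PRIMARY KEY,  -- auto-assigned, plays the role of an identity column
    CompanyId   INTEGER NOT NULL
                REFERENCES Company(CompanyId) ON DELETE CASCADE,
    AddressType TEXT NOT NULL,        -- assumed column: 'billing' or 'shipping'
    Street      TEXT NOT NULL,        -- assumed column
    City        TEXT NOT NULL         -- assumed column
);
""")

conn.execute("INSERT INTO Company (CompanyId, Name) VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO Address (CompanyId, AddressType, Street, City) VALUES (?, ?, ?, ?)",
    [
        (1, "billing", "1 Main St", "Springfield"),
        (1, "shipping", "9 Dock Rd", "Springfield"),
    ],
)

before = conn.execute("SELECT COUNT(*) FROM Address").fetchone()[0]
# Deleting the parent row removes its child addresses through the cascade.
conn.execute("DELETE FROM Company WHERE CompanyId = 1")
after = conn.execute("SELECT COUNT(*) FROM Address").fetchone()[0]
print(before, after)
```

The foreign key lives on the child (Address), which is why the cascade runs from Company down to Address and not the other way.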
However, to give you a more general rule: the "master" table is the one that truly owns the data. In this case, the master record is a company, therefore its id should be the one that is referenced. You need to have a company before you can have an address.
If you want to delete the address each time you delete a company, this means that the address is directly dependent on the company, and keeping the address in the company table does not violate 3NF.
If the address attributes were dependent on something other than the company, you could put them into an address table to make the address management more logically consistent.
Say, you could split the address into country / region / town / street parts, and if a part of the company's country gained independence or something, you could change the address merely by changing the country field of the breakaway regions.
However, this means that you are interested in addresses as entities, not attributes, and you should not cascade delete them anymore.
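As a rough illustration of the "addresses as entities" case, the sketch below uses Python's sqlite3 module with hypothetical Country and Region tables; every table name, column name, and row here is an assumption made up for the example. When a region changes country, a single UPDATE fixes the address of every company in that region.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the references below

conn.executescript("""
CREATE TABLE Country (
    CountryId INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL
);
CREATE TABLE Region (
    RegionId  INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    CountryId INTEGER NOT NULL REFERENCES Country(CountryId)
);
-- No ON DELETE CASCADE here: regions and countries outlive any one company.
CREATE TABLE Company (
    CompanyId INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Street    TEXT NOT NULL,
    RegionId  INTEGER NOT NULL REFERENCES Region(RegionId)
);
INSERT INTO Country VALUES (1, 'Oldland'), (2, 'Newland');
INSERT INTO Region  VALUES (10, 'North', 1), (11, 'South', 1);
INSERT INTO Company VALUES
    (100, 'Acme',    '1 Main St', 10),
    (101, 'Globex',  '2 High St', 10),
    (102, 'Initech', '3 Low St',  11);
""")

# The North region breaks away: one row changes, no company rows are touched.
conn.execute("UPDATE Region SET CountryId = 2 WHERE RegionId = 10")

rows = conn.execute("""
    SELECT c.Name, k.Name
    FROM Company c
    JOIN Region  r ON r.RegionId  = c.RegionId
    JOIN Country k ON k.CountryId = r.CountryId
    ORDER BY c.CompanyId
""").fetchall()
print(rows)
```

Whether this extra structure is worth it depends on whether geographical changes like this are really part of your business rules.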
Update:
In the normal form definitions, the word "dependent" means "dependent in my model".
Say the company's address is Wall Street, New York, NY, USA. If in your model Wall Street depends on New York, which depends on NY, which depends on USA, then keeping it in a single table would violate 3NF.
However, suppose that both of the following hold in your model. First, Wall Street, New York, CA, USA is a valid address (which means you are not going to raise an error on this address). Second, it is never a valid situation that you update a company's address only because you are doing the same to some other companies (which means that handling street renamings, region mergers, or other geographical updates is not part of your normal business rules). Then the table with the addresses is in 3NF.
From your wish to delete an address each time you delete a company, I judge that you are not going to track the address dependencies and, hence, you can keep the address in the companies table.
Think of it as a "has one" or "has many" relationship. A company definitely has an address (in your example), so it should be the parent table and the address table should reference the company table. If, on the other hand, many different companies shared the same address, it could be the other way round. So it also depends on your needs (the logic you are trying to model).
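For the "other way round" case, here is a small sketch (Python's sqlite3 module; the Street and City columns and the sample data are assumptions for illustration) where several companies share one address, so Company holds the foreign key and Address is the referenced table. Deleting a company leaves the shared address alone, and the address cannot be deleted while a company still points at it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the reference below

conn.executescript("""
CREATE TABLE Address (
    AddressId INTEGER PRIMARY KEY,
    Street    TEXT NOT NULL,   -- assumed column
    City      TEXT NOT NULL    -- assumed column
);
CREATE TABLE Company (
    CompanyId INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    AddressId INTEGER NOT NULL REFERENCES Address(AddressId)
);
INSERT INTO Address VALUES (1, '1 Shared Plaza', 'Springfield');
INSERT INTO Company VALUES (1, 'Acme', 1), (2, 'Globex', 1);
""")

# Removing one company must not remove the address the other one still uses.
conn.execute("DELETE FROM Company WHERE CompanyId = 1")
addresses_left = conn.execute("SELECT COUNT(*) FROM Address").fetchone()[0]

# The address is still referenced by Globex, so deleting it is rejected.
try:
    conn.execute("DELETE FROM Address WHERE AddressId = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

print(addresses_left, delete_blocked)
```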