I have a list of items. Most of these items will not be in stock. The item table has id, name, description. The quantities of items are stored in another table named inventory. The inventory table has item_id and quantity of items that are in stock.
Do I need a primary key for the inventory table? If so, should I use a serial or composite key? When is it ok for a table to not have a primary key?
Edit: Thank you all for being very informative. I will now always have primary keys except in very rare exceptions. I also learned a bit more about serial vs composite keys.
Always aim to have a primary key.
If you are unsure, have a primary key.
Even if you are 99.99% sure you will not need it, have one. Requirements change as I have learned through experience over many years.
The only examples I can really think of are many-to-many tables with just two foreign_keys and mega-huge (hundreds of millions of rows) tables where every byte counts. But even then a separate, unique, no-business value id key is still strongly recommended.
There's some more great info on this here:
http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx
and here:
http://www.techrepublic.com/article/the-great-primary-key-debate/1045050
here:
http://databases.aspfaq.com/database/what-should-i-choose-for-my-primary-key.html
and here:
Should I use composite primary keys or not?
In your example, I would definitely have one.
The decision to 'not' have one should be based on a very clear need and understanding and actual or predicted (e.g. volume) issues with having one.
One great example of this need comes up when debugging and troubleshooting. Just like having create and update columns in each table (another favorite of mine), this info may not initially be used by/for the front end but boy can it be helpful in tracing and resolving issues. (btw update stamps are often now standard in frameworks like Ruby On Rails which also works well with the convention of every table having an id
field!)
In general: Every table should have a PK. At least every table should have some CLUSTER index.
The PK must not be one special column, but having rows without unique identification in system (RDBMS) is not good practise.
There can be several cases where PK is not required, but that are exceptions in rule.
If item_id is unique in the inventory table, I'd say you're fine with using that as an identifier. A primary key is usually used to uniquely identify a line, but there is no use for an inventory line identity in your case that I can see.
EDIT: As others have noted, generally if you don't have a good reason for a primary key, you'll be well off looking at your table structure to see if you can merge it with another table, in this case probably the items table. I can see cases where that's not an option (for example that you can't change the schema, just add new tables) but it's worth a look.
Do I need a primary key for the inventory table?
Can we assume the data is relational? A relation has no duplicate tuples by definition. SQL allows duplicate rows in a table. Therefore, to ensure no duplicate rows in practice, every table should have at least one unique constraint. To cut a long story short, you better have a good reason for not placing a unique constraint on every candidate key in the table. By definition, zero or one candidate key may be designated 'primary' and which one (if any) should receive this designation is arbitrary.
should I use a serial or composite key?
I think this is a typo. A single-column key is known as a "simple key" and not "serial key". From your description, your Inventory table has a sole candidate candidate key on item_ID which is a simple key
. The only possible composite key is a superkey and, unless it is to be referenced by a foreign key, should not be constrained using a unique constraint.
When is it ok for a table to not have a primary key?
When all candidate keys have been constrained using UNIQUE
constraints or when the table is not intended to hold relational data.
If you have only one inventory row per item - then it will be much cheaper (mean - CPU and IO) that it will be in the same Item table,
if not - it depends. And it will not be a normalized data at all
But as far as I understand the question and if you insist on two tables - yes, its better to have an index on item_id field