What's the best practice for primary keys in t

2018-12-31 18:36发布

When designing tables, I've developed a habit of having one column that is unique and that I make the primary key. This is achieved in three ways depending on requirements:

  1. Identity integer column that auto increments.
  2. Unique identifier (GUID)
  3. A short character(x) or integer (or other relatively small numeric type) column that can serve as a row identifier column

Number 3 would be used for fairly small lookup, mostly read tables that might have a unique static length string code, or a numeric value such as a year or other number.

For the most part, all other tables will either have an auto-incrementing integer or unique identifier primary key.

The Question :-)

I have recently started working with databases that have no consistent row identifier and primary keys are currently clustered across various columns. Some examples:

  • datetime/character
  • datetime/integer
  • datetime/varchar
  • char/nvarchar/nvarchar

Is there a valid case for this? I would have always defined an identity or unique identifier column for these cases.

In addition there are many tables without primary keys at all. What are the valid reasons, if any, for this?

I'm trying to understand why tables were designed as they were, and it appears to be a big mess to me, but maybe there were good reasons for it.

A third question to sort of help me decipher the answers: In cases where multiple columns are used to comprise the compound primary key, is there a specific advantage to this method vs. a surrogate/artificial key? I'm thinking mostly in regards to performance, maintenance, administration, etc.?

21条回答
泪湿衣
2楼-- · 2018-12-31 18:43

I always use an autonumber or identity field.

I worked for a client who had used SSN as a primary key and then because of HIPAA regulations was forced to change to a "MemberID" and it caused a ton of problems when updating the foreign keys in related tables. Sticking to a consistent standard of an identity column has helped me avoid a similar problem in all of my projects.

查看更多
浪荡孟婆
3楼-- · 2018-12-31 18:45

Tables should have a primary key all the time. When it doesn't it should have been an AutoIncrement fields.

Sometime people omit primary key because they transfer a lot of data and it might slow down (depend of the database) the process. BUT, it should be added after it.

Some one comment about link table, this is right, it's an exception BUT fields should be FK to keep the integrity, and is some case those fields can be primary keys too if duplicate in links is not authorized... but to keep in a simple form because exception is something often in programming, primary key should be present to keep the integrity of your data.

查看更多
余欢
4楼-- · 2018-12-31 18:46

Just an extra comment on something that is often overlooked. Sometimes not using a surrogate key has benefits in the child tables. Let's say we have a design that allows you to run multiple companies within the one database (maybe it's a hosted solution, or whatever).

Let's say we have these tables and columns:

Company:
  CompanyId   (primary key)

CostCenter:
  CompanyId   (primary key, foreign key to Company)
  CostCentre  (primary key)

CostElement
  CompanyId   (primary key, foreign key to Company)
  CostElement (primary key)

Invoice:
  InvoiceId    (primary key)
  CompanyId    (primary key, in foreign key to CostCentre, in foreign key to CostElement)
  CostCentre   (in foreign key to CostCentre)
  CostElement  (in foreign key to CostElement)

In case that last bit doesn't make sense, Invoice.CompanyId is part of two foreign keys, one to the CostCentre table and one to the CostElement table. The primary key is (InvoiceId, CompanyId).

In this model, it's not possible to screw-up and reference a CostElement from one company and a CostCentre from another company. If a surrogate key was used on the CostElement and CostCentre tables, it would be.

The fewer chances to screw up, the better.

查看更多
笑指拈花
5楼-- · 2018-12-31 18:46

There´s no problem in making your primary key from various fields, that's a Natural Key.

You can use a Identity column (associated with a unique index on the candidate fields) to make a Surrogate Key.

That´s an old discussion. I prefer surrogate keys in most situations.

But there´s no excuse for the lack of a key.

RE: EDIT

Yeah, there´s a lot of controversy about that :D

I don´t see any obvious advantage on natural keys, besides the fact that they are the natural choice. You will always think in Name, SocialNumber - or something like that - instead of idPerson.

Surrogate keys are the answer to some of the problems that natural keys have (propagating changes for example).

As you get used to surrogates, it seems more clean, and manageable.

But in the end, you´ll find out that it's just a matter of taste - or mindset -. People "think better" with natural keys, and others don´t.

查看更多
查无此人
6楼-- · 2018-12-31 18:46

You should use a 'composite' or 'compound' primary key that comprises of multiple fields.

This is a perfectly acceptable solution, go here for more info :)

查看更多
余生无你
7楼-- · 2018-12-31 18:47

Natural verses artifical keys is a kind of religious debate among the database community - see this article and others it links to. I'm neither in favour of always having artifical keys, nor of never having them. I would decide on a case-by-case basis, for example:

  • US States: I'd go for state_code ('TX' for Texas etc.), rather than state_id=1 for Texas
  • Employees: I'd usually create an artifical employee_id, because it's hard to find anything else that works. SSN or equivalent may work, but there could be issues like a new joiner who hasn't supplied his/her SSN yet.
  • Employee Salary History: (employee_id, start_date). I would not create an artifical employee_salary_history_id. What point would it serve (other than "foolish consistency")

Wherever artificial keys are used, you should always also declare unique constraints on the natural keys. For example, use state_id if you must, but then you'd better declare a unique constraint on state_code, otherwise you are sure to eventually end up with:

state_id    state_code   state_name
137         TX           Texas
...         ...          ...
249         TX           Texas
查看更多
登录 后发表回答