Surrogate vs. natural/business keys [closed]

2018-12-31 05:43发布

Here we go again, the old argument still arises...

Would we better have a business key as a primary key, or would we rather have a surrogate id (i.e. an SQL Server identity) with a unique constraint on the business key field?

Please, provide examples or proof to support your theory.

19条回答
千与千寻千般痛.
2楼-- · 2018-12-31 06:17

Surrogate keys can be useful when business information can change or be identical. Business names don't have to be unique across the country, after all. Suppose you deal with two businesses named Smith Electronics, one in Kansas and one in Michigan. You can distinguish them by address, but that'll change. Even the state can change; what if Smith Electronics of Kansas City, Kansas moves across the river to Kansas City, Missouri? There's no obvious way of keeping these businesses distinct with natural key information, so a surrogate key is very useful.

Think of the surrogate key like an ISBN number. Usually, you identify a book by title and author. However, I've got two books titled "Pearl Harbor" by H. P. Willmott, and they're definitely different books, not just different editions. In a case like that, I could refer to the looks of the books, or the earlier versus the later, but it's just as well I have the ISBN to fall back on.

查看更多
深知你不懂我心
3楼-- · 2018-12-31 06:20

I hate surrogate keys in general. They should only be used when there is no quality natural key available. It is rather absurd when you think about it, to think that adding meaningless data to your table could make things better.

Here are my reasons:

  1. When using natural keys, tables are clustered in the way that they are most often searched thus making queries faster.

  2. When using surrogate keys you must add unique indexes on logical key columns. You still need to prevent logical duplicate data. For example, you can’t allow two Organizations with the same name in your Organization table even though the pk is a surrogate id column.

  3. When surrogate keys are used as the primary key it is much less clear what the natural primary keys are. When developing you want to know what set of columns make the table unique.

  4. In one to many relationship chains, the logical key chains. So for example, Organizations have many Accounts and Accounts have many Invoices. So the logical-key of Organization is OrgName. The logical-key of Accounts is OrgName, AccountID. The logical-key of Invoice is OrgName, AccountID, InvoiceNumber.

    When surrogate keys are used, the key chains are truncated by only having a foreign key to the immediate parent. For example, the Invoice table does not have an OrgName column. It only has a column for the AccountID. If you want to search for invoices for a given organization, then you will need to join the Organization, Account, and Invoice tables. If you use logical keys, then you could Query the Organization table directly.

  5. Storing surrogate key values of lookup tables causes tables to be filled with meaningless integers. To view the data, complex views must be created that join to all of the lookup tables. A lookup table is meant to hold a set of acceptable values for a column. It should not be codified by storing an integer surrogate key instead. There is nothing in the normalization rules that suggest that you should store a surrogate integer instead of the value itself.

  6. I have three different database books. Not one of them shows using surrogate keys.

查看更多
谁念西风独自凉
4楼-- · 2018-12-31 06:23

Surrogate key will NEVER have a reason to change. I cannot say the same about the natural keys. Last names, emails, ISBN nubmers - they all can change one day.

查看更多
心情的温度
5楼-- · 2018-12-31 06:25

Always use a single column, surrogate key if at all possible. This makes joins as well as inserts/updates/deletes much cleaner because you're only responsible for tracking a single piece of information to maintain the record.

Then, as needed, stack your business keys as unique contraints or indexes. This will keep you data integrity intact.

Business logic/natural keys can change, but the phisical key of a table should NEVER change.

查看更多
低头抚发
6楼-- · 2018-12-31 06:26

Alway use a key that has no business meaning. It's just good practice.

EDIT: I was trying to find a link to it online, but I couldn't. However in 'Patterns of Enterprise Archtecture' [Fowler] it has a good explanation of why you shouldn't use anything other than a key with no meaning other than being a key. It boils down to the fact that it should have one job and one job only.

查看更多
临风纵饮
7楼-- · 2018-12-31 06:26

Case 1: Your table is a lookup table with less than 50 types (inserts)

Use business/natural keys. For Example:

Table: JOB with 50 inserts
CODE (primary key)       NAME               DESCRIPTION
PRG                      PROGRAMMER         A programmer is writing code
MNG                      MANAGER            A manager is doing whatever
CLN                      CLEANER            A cleaner cleans
...............
joined with
Table: PEOPLE with 100000 inserts

foreign key JOBCODE in table PEOPLE
looks at
primary key CODE in table JOB

Case 2: Your table is a table with thousands of inserts

Use surrogate/autoincrement keys. For Example:

Table: ASSIGNMENT with 1000000 inserts
joined with
Table: PEOPLE with 100000 inserts

foreign key PEOPLEID in table ASSIGNMENT
looks at
primary key ID in table PEOPLE (autoincrement)

In the first case:

  • You can select all programmers in table PEOPLE without use of join with table JOB, but just with: "SELECT * FROM PEOPLE WHERE JOBCODE = 'PRG'"

In the second case:

  • Your database queries are faster because your primary key is an integer
  • You don't need to bother yourself with finding the next unique key because the database itself gives you the next autoincrement.
查看更多
登录 后发表回答