SQL one to one relationship vs. single table

2019-02-09 20:21发布

问题:

Consider a data structure such as the below where the user has a small number of fixed settings.

User

[Id] INT IDENTITY NOT NULL,
[Name] NVARCHAR(MAX) NOT NULL,
[Email] VNARCHAR(2034) NOT NULL

UserSettings

[SettingA],
[SettingB],
[SettingC]

Is it considered correct to move the user's settings into a separate table, thereby creating a one-to-one relationship with the users table? Does this offer any real advantage over storing it in the same row as the user (the obvious disadvantage being performance).

回答1:

You would normally split tables into two or more 1:1 related tables when the table gets very wide (i.e. has many columns). It is hard for programmers to have to deal with tables with too many columns. For big companies such tables can easily have more than 100 columns.

So imagine a product table. There is a selling price and maybe another price which was used for calculation and estimation only. Wouldn't it be good to have two tables, one for the real values and one for the planning phase? So a programmer would never confuse the two prices. Or take logistic settings for the product. You want to insert into the products table, but with all these logistic attributes in it, do you need to set some of these? If it were two tables, you would insert into the product table, and another programmer responsible for logistics data would care about the logistic table. No more confusion.

Another thing with many-column tables is that a full table scan is of course slower for a table with 150 columns than for a table with just half of this or less.

A last point is access rights. With separate tables you can grant different rights on the product's main table and the product's logistic table.

So all in all, it is rather rare to see 1:1 relations, but they can give a clearer view on data and even help with performance issues and data access.

EDIT: I'm taking Mike Sherrill's advice and (hopefully) clarify the thing about normalization.

Normalization is mainly about avoiding redundancy and relateded lack of consistence. The decision whether to hold data in only one table or more 1:1 related tables has nothing to do with this. You can decide to split a user table in one table for personal information like first and last name and another for his school, graduation and job. Both tables would stay in the normal form as the original table, because there is no data more or less redundant than before. The only column used twice would be the user id, but this is not redundant, because it is needed in both tables to identify a record.

So asking "Is it considered correct to normalize the settings into a separate table?" is not a valid question, because you don't normalize anything by putting data into a 1:1 related separate table.



回答2:

Creating a new table with 1-1 relationships is not a reasonable solution. You might need to do it sometimes, but there would typically be no reason to have two tables where the user id is the primary key.

On the other hand, splitting the settings into a separate table with one row per user/setting combination might be a very good idea. This would be a three-table solution. One for users, one for all possible settings, and one for the junction table between them.

The junction table can be quite useful. For instance, it might contain the effective and end dates of the setting.

However, this assumes that the settings are "similar" to each other, in a SQL sense. If the settings are different such as:

  • Preferred location as latitude/longitude
  • Preferred time of day to receive an email
  • Flag to be excluded from certain contacts

Then you have a data-type problem when storing them in a table. So, the answer is "it depends". A lot of the answer depends on what the settings look like, how they will be used, and the type of constraints on them.



回答3:

Splitting tables into distinct tables with 1:1 relationships between them is usually not practiced, because :

If the relationship is really 1:1, then integrity enforcement boils down to "inserts being done in all concerned tables, or none at all". Achieving this on the server side requires systems that support deferred constraint checking, and AFAIK that's a feature of the rather high-end systems. So in many cases the 1:1 enforcement is pushed over to the application side, and that approach has its own obvious downsides.

A case when splitting tables is nonetheless advisable, is when there are security perspectives, i.e. when not all columns can be updated by one user. But note that by definition, in such cases the relationship between the tables can never be strictly 1:1 .

(I also suggest you read carefully the discussion between Thorsten/Mike. You used the word 'normalization' but normalization has very little to do with your scenario - except if you were considering 6NF, which I think is rather unlikely.)



回答4:

It makes more sense that your settings are not only in a separate table, but also use a on-to-many relationship between the ID and Settings. This way, you could potentially have a as many (or as few) setting as required.

UserSettings

[Settings_ID]
[User_ID]
[Settings]

In fact, one could make the same argument for the [Email] field.