What's the better database design: more tables

2019-01-10 01:52发布

A former coworker insisted that a database with more tables with fewer columns each is better then one with fewer tables with more columns each. For example rather than a customer table with name, address, city, state, zip, etc. columns, you would have a name table, an address table, a city table, etc.

He argued this design was more efficient and flexible. Perhaps it is more flexible, but I am not qualified to comment on its efficiency. Even if it is more efficient, I think those gains may be outweighed by the added complexity.

So, are there any significant benefits to more tables with fewer columns over fewer tables with more columns?

18条回答
forever°为你锁心
2楼-- · 2019-01-10 02:44

Like everything else: it depends.

There is no hard and fast rule regarding column count vs table count.

If your customers need to have multiple addresses, then a separate table for that makes sense. If you have a really good reason to normalize the City column into its own table, then that can go, too, but I haven't seen that before because it's a free form field (usually).

A table heavy, normalized design is efficient in terms of space and looks "textbook-good" but can get extremely complex. It looks nice until you have to do 12 joins to get a customer's name and address. These designs are not automatically fantastic in terms of performance that matters most: queries.

Avoid complexity if possible. For example, if a customer can have only two addresses (not arbitrarily many), then it might make sense to just keep them all in a single table (CustomerID, Name, ShipToAddress, BillingAddress, ShipToCity, BillingCity, etc.).

Here's Jeff's post on the topic.

查看更多
ゆ 、 Hurt°
3楼-- · 2019-01-10 02:45

I have a few fairly simple rules of thumb I follow when designing databases, which I think can be used to help make decisions like this....

  1. Favor normalization. Denormalization is a form of optimization, with all the requisite tradeoffs, and as such it should be approached with a YAGNI attitude.
  2. Make sure that client code referencing the database is decoupled enough from the schema that reworking it doesn't necessitate a major redesign of the client(s).
  3. Don't be afraid to denormalize when it provides a clear benefit to performance or query complexity.
  4. Use views or downstream tables to implement denormalization rather than denormalizing the core of the schema, when data volume and usage scenarios allow for it.

The usual result of these rules is that the initial design will favor tables over columns, with a focus on eliminating redundancy. As the project progresses and denormalization points are identified, the overall structure will evolve toward a balance that compromises with limited redundancy and column proliferation in exchange for other valuable benefits.

查看更多
男人必须洒脱
4楼-- · 2019-01-10 02:46

I would argue in favor of more tables, but only up to a certain point. Using your example, if you separated your user's information into two tables, say USERS and ADDRESS, this gives you the flexibility to have multiple addresses per user. One obvious application of this is a user who has separate billing and shipping addresses.

The argument in favor of having a separate CITY table would be that you only have to store each city's name once, then refer to it when you need it. That does reduce duplication, but in this example I think it's overkill. It may be more space efficient, but you'll pay the price in joins when you select data from your database.

查看更多
小情绪 Triste *
5楼-- · 2019-01-10 02:46

There are huge benefits to queries using as few columns as possible. But the table itself can have a large number. Jeff says something on this as well.

Basically, make sure that you don't ask for more than you need when doing a query - performance of queries is directly related to the number of columns you ask for.

查看更多
别忘想泡老子
6楼-- · 2019-01-10 02:48

Each table should only include columns that pertain to the entity that's uniquely identified by the primary key. If all the columns in the database are all attributes of the same entity, then you'd only need one table with all the columns.

If any of the columns may be null, though, you would need to put each nullable column into its own table with a foreign key to the main table in order to normalize it. This is a common scenario, so for a cleaner design, you're likley to be adding more tables than columns to existing tables. Also, by adding these optional attributes to their own table, they would no longer need to allow nulls and you avoid a slew of NULL-related issues.

查看更多
欢心
7楼-- · 2019-01-10 02:49

There are many sides to this, but from an application efficiency perspective mote tables can be more efficient at times. If you have a few tables with a bunch of columns every time the db as to do an operation it has a chance of making a lock, more data is made unavailable for the duration of the lock. If locks get escalated to page and tables (well hopefully not tables :) ) you can see how this can slow down the system.

查看更多
登录 后发表回答