Normalizing a variety of properties with similar d

2019-08-15 11:35发布

问题:

I'm working on setting up a database to house properties related to a game, for example whether or not it supports multiplayer, what genres it is, release dates, etc.

It seems like creating two additional tables (genres, genres_data for example) for each category type isn't nearly as dynamic as it could be. My initial thought was to set it up one of two ways...

Have a games table with skeletal information, then a properties table to list all the properties, and a third table to contain all the data related to the game, where essentially only the columns related to each property are used:

games
-----------
game_id
... relevant data

properties
-----------
property_id
title
type
category

properties_data
---------------
game_id
property_id
bool
min
max
date
text(max255)
longtext

Or, have the games table the same, and have properties include a column name then use columns in the third table:

properties
--------------
property_id
title
type
category
column_name

properties_data
----------------
game_id
title
description
release_date_au
release_date_jp
genre_rpg
genre_fps
platform_360
platform_ps3
platform_pc
has_leaderboards
has_downloadable_content
... etc

What's the practical approach in this sort of scenario where you have a half dozen or so types of data related to a row, but a large number of categories and supporting properties in each category? It doesn't seem efficient to make a table for each property type or category (to me).

回答1:

This appears to be a typical Supertype-subtype cluster of tables. Except you have not identified it as such. So, identify it formally, and normalise the data.

  • place common columns in the Supertype (parent)
  • place all columns specific to each Subtype in a separate child table
  • so that you end up with no optional columns
  • if you do, then you need another level of sub-table.

If you go for "dynamically" populated tables, you lose all the control of static tables in a db with the busienss rules defined in DDL (not code). In that case, be aware that these loosely defined tables are famous for ending up as a mess with no data integrity. Read this recent answer to get some context.

Link to Related Answer

The point is, if you do that "dynamic" stuff, do it properly.

But I do not believe you need to go that far. With an ordinary Rdb, you can maintain full relational power and flexibility. More tables are the nature of an Rdb, nothing to be scared of. Done properly, you will have full control, and speed. Which takes me back to the first para.



回答2:

from your attribute descriptions - it seems like a third option is appropriate.

games
--------
game_id
name
...

release
----------
release_id
game_id
release_date
release_country

genre
---------
genre_id
name

game_genre
-----------
game_id
genre_id