Database Design NBA

2019-08-12 20:06发布

问题:

I am new to database design so sorry if this is obvious beginner question. I use python and sqlalchemy though I don't think that is relevant (the sample code below is psuedo code), though may be wrong. I have looked through some previous questions and didn't see this addressed. Anyway, on to the question. The goal here is to develop a database of NBA information which will have info on all games played and also box scores for every each player, for each game. There are a couple ways this DB can be designed.

Game(game_id, date, home_name, away_name, score)
Box_Score(game_id, player_name, date, points, rebounds)

In this situation if I want to get all the games the Los Angeles Lakers played I can just do

query(Game).filter(home_name=="lakers" or away_name=="lakers").all()
query(Box_Score).filter(player_name="kobe bryant")

Here is the second option for how to design this database:

Game(game_id, date, home_name=(foreignkey=Team.team_name), away_name, score)
Box_Score(game_id, player_name=foreignkey=Player.player_name), date, points, rebounds)
Team(team_name, home_games=relationship("Game"))
Player(player_name, box_scores=relationship("Box_Score"))

Then I can do

query(Team).filter(name=="lakers").first().games
query(Player).filter(name=="kobe bryant").first().box_scores

On the one hand it seems like the whole point of using a relational database is to set it up like in situation #2. On the other hand, I am not sure what extra functionality it gives me. So I guess my question is, which design do you recommend? Are there some benefits or disadvantages to either design that will become apparent down the line which I cannot see yet? And if you recommend the simpler design #1 which does not use table relationships, why is it that I am storing a decent amount of related information but don't need to use relational database? Thanks!!

回答1:

The ideal data model for any database is highly subjective. If you are new to database design, you probably will not find the ideal schema until after you have created your application and tested it for an extended period of time. I would recommend reading up on some design basics, particularly Database Normalization, since you would probably benefit from a highly normalized schema, where data can be referenced in many different ways. Highly-normalized databases can suffer in the performance department if very large (which this does not seem like it would be), but you can always de-normalize data through the use of Materialized Views or other methods of caching.