I'm trying to get my head round this mind-boggling stuff they call Database Design without much success, so I'll try to illustrate my problem with an example.
I am using MySQL and here is my question:
Say I want to create a database to hold my DVD collection. I have the following information that I want to include:
- Film Title
- Actors
- Running Time
- Genre
- Description
- Year
- Director
I would like to create relationships between these to make it more efficient but don't know how.
Here is what I'm thinking for the database design:
Films Table => filmid, filmtitle, runningtime, description
Year Table => year
Genre Table => genre
Director Table => director
Actors Table => actor_name
But, how would I go about creating relationships between these tables?
Also, I have created a unique ID for the Films table, a primary key that automatically increments. Do I need to create a unique ID for each table?
And finally, if I were to add a new film to the database through a PHP form, how would I insert all of this data (with the relationships and all)?
Thanks for any help you can give, Keith
You don't really need a Year table; all you need are genre_id, director_id, and actor_id columns in your films table.
Also, your genre, director, and actor tables need their own unique IDs.
Edit: This is, of course, assuming that you're only going to have one genre, director, and actor for each movie, which probably isn't the case.
To have many actors belonging to many movies, you will need a separate relations table. You could call it "moviesActors" (or "actorsMovies"), and each row will have an actor_id and a movie_id to say this actor was in this movie.
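As a rough sketch of that relations table, here is one way it could look. This uses Python's built-in sqlite3 module rather than MySQL purely so it runs anywhere; the column names, the sample titles and names, and the two helper functions are made up for illustration, and in MySQL the ID columns would typically be declared AUTO_INCREMENT.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE movies (
    movie_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL
);
CREATE TABLE actors (
    actor_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
-- One row per (movie, actor) pairing: "this actor was in this movie".
CREATE TABLE moviesActors (
    movie_id INTEGER NOT NULL REFERENCES movies(movie_id),
    actor_id INTEGER NOT NULL REFERENCES actors(actor_id),
    PRIMARY KEY (movie_id, actor_id)
);
""")

# Placeholder data, not real films or people.
conn.executemany("INSERT INTO movies (movie_id, title) VALUES (?, ?)",
                 [(1, "Movie A"), (2, "Movie B")])
conn.executemany("INSERT INTO actors (actor_id, name) VALUES (?, ?)",
                 [(1, "Actor X"), (2, "Actor Y")])
conn.executemany("INSERT INTO moviesActors (movie_id, actor_id) VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])


def actors_in(movie_title):
    """Names of all actors linked to the given movie, alphabetically."""
    rows = conn.execute("""
        SELECT a.name
        FROM actors a
        JOIN moviesActors ma ON ma.actor_id = a.actor_id
        JOIN movies m ON m.movie_id = ma.movie_id
        WHERE m.title = ?
        ORDER BY a.name
    """, (movie_title,)).fetchall()
    return [name for (name,) in rows]


def movies_with(actor_name):
    """Titles of all movies linked to the given actor, alphabetically."""
    rows = conn.execute("""
        SELECT m.title
        FROM movies m
        JOIN moviesActors ma ON ma.movie_id = m.movie_id
        JOIN actors a ON a.actor_id = ma.actor_id
        WHERE a.name = ?
        ORDER BY m.title
    """, (actor_name,)).fetchall()
    return [title for (title,) in rows]


print(actors_in("Movie A"))
print(movies_with("Actor X"))
```

The same table answers the question in both directions (who was in a movie, and which movies a person was in), which is the point of keeping the pairing in its own table.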
These are the tables I'd use:
Instead of having a directors and actors table, just have one table of people. This can also include crew members (in case you want to track who the 2nd Junior Assistant Dolly Grip was). Each movie can be any number of genres (comedy and horror, for example). Plus, the people can take any number of roles on each film - there are quite a number of actor/directors out there.
The Roles table doesn't necessarily mean the character the actor is playing, but it could. It could be "Director", "Producer", "Actor"... or even "Luke Skywalker" if you wanted to get that finely-grained... I believe IMDB does that.
Hopefully the names of the fields above hint at the foreign keys, and I've put _underscores_ around the primary keys I'd use.
I realize your question has already been answered; however, I wanted to point you to:
http://www.imdb.com/interfaces
IMDB provides flat-text files of their database (minus primary keys). You might find this useful to populate your database once you get going, or you could use it in your program/website to allow you to simply search for a movie's title to add to your "DVD Collection", and have the rest of the information pulled from these.
Your Films table also needs links to the genre, director, and actors tables. Since the actors, at least, will be many-to-many (one film will list more than one actor, and one actor will be in more than one film), you'll need a table to link them.
Any relationship that might be many-to-many needs a linking table.
You have to make a distinction between attributes and entities. An entity is a thing - usually a noun. An attribute is more like a piece of describing information. In database jargon, entity = table, attribute = field/column.
Having a separate table for certain things (let's use director as an example) is called normalizing. While it can be good in some circumstances, it can be unnecessary in others, as it generally makes queries more complicated (you have to join everything) and can make them slower.
In this case, having a year table is unnecessary, since there are no other attributes about a year, besides the year itself, that you would store. It is better to skip the extra table and store the year as a column in the film table itself.
Director, on the other hand, is different. Perhaps you'll want to store the director's first name, last name, date of birth, date of death (if applicable), etc. You obviously don't want to enter the director's birth date every time you enter a film that this person directs, so it makes sense to have a separate entity for a director.
Even if you didn't want to store all this information about the director (you just want their name), having a separate table for it (and using a surrogate key - I'll get to that in a second) is useful because it prevents typographic errors and duplicates - if you have someone's name spelled wrong or entered differently (first,last vs last,first), then if you try to find other movies they've directed, you'll fail.
Using a surrogate key (an ID column that serves as the primary key) for tables is generally a good idea. Matching an integer is usually faster than matching a string. It also allows you to freely change the name without worrying about the foreign keys stored in other tables (the ID stays the same, so you don't have to do anything).
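To make the "freely change the name" point concrete, here is a small sketch. It uses Python's sqlite3 module in place of MySQL so it can be run as-is; the table layout, column names, and the deliberately misspelled sample name are all invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (
    director_id INTEGER PRIMARY KEY,  -- surrogate key
    name        TEXT NOT NULL
);
CREATE TABLE films (
    film_id     INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    director_id INTEGER REFERENCES directors(director_id)
);
""")

# The director's name is entered once, with a typo.
conn.execute("INSERT INTO directors (director_id, name) VALUES (1, 'Jon Smith')")
conn.executemany("INSERT INTO films (title, director_id) VALUES (?, ?)",
                 [("Film One", 1), ("Film Two", 1)])

# Fix the spelling in exactly one place; the films table is not touched,
# because it only stores the integer ID.
conn.execute("UPDATE directors SET name = 'John Smith' WHERE director_id = 1")


def films_by(director_name):
    """Titles of all films whose director has the given name."""
    rows = conn.execute("""
        SELECT f.title
        FROM films f
        JOIN directors d ON d.director_id = f.director_id
        WHERE d.name = ?
        ORDER BY f.title
    """, (director_name,)).fetchall()
    return [title for (title,) in rows]


print(films_by("John Smith"))
```

Had the films table stored the name as text instead, every film row would have needed the same correction.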
You can really take this design quite far, and it's all a matter of figuring out what you want to be able to store in it.
For example, rather than having a single director per film, note that some films have multiple directors. That makes films and directors a many-to-many relationship, so you'd need a table with e.g.:
Taking it a step further, sometimes directors are also actors, and vice versa. So rather than even having director and actor tables, you could have a single person table, and join that table in using a role table. The role table would hold various positions (e.g. director, producer, star, extra, grip, editor), and it would look more like:
You might also have a role_details field in the film_people table, which could contain extra information depending on the role (eg, the name of the part the actor is playing).
I'm also showing genre as a many-to-many relationship, because a film may be in multiple genres. If you didn't want this, then instead of the film_genre table, films would just contain a genreid.
Once this is set up, it is easy to query and find everything a given person has done, or everything a person has done as a director, or everyone who has ever directed a movie, or all the people involved with one specific movie. It can go on and on.
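A minimal sketch of the person/role design and the kinds of queries just described might look like the following. It runs on Python's sqlite3 module instead of MySQL for convenience. The film_people table and role_details field come from the description above; every other table name, column name, and sample row is an assumption made for the example. The composite primary key also assumes a person holds a given role at most once per film.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the schema details are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE films  (film_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE people (person_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE roles  (role_id   INTEGER PRIMARY KEY, name  TEXT NOT NULL);
-- Who did what on which film; role_details is optional free text.
CREATE TABLE film_people (
    film_id      INTEGER NOT NULL REFERENCES films(film_id),
    person_id    INTEGER NOT NULL REFERENCES people(person_id),
    role_id      INTEGER NOT NULL REFERENCES roles(role_id),
    role_details TEXT,
    PRIMARY KEY (film_id, person_id, role_id)
);
""")

# Placeholder data only.
conn.executemany("INSERT INTO films VALUES (?, ?)",
                 [(1, "Film One"), (2, "Film Two")])
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(1, "Person A"), (2, "Person B")])
conn.executemany("INSERT INTO roles VALUES (?, ?)",
                 [(1, "director"), (2, "star")])
conn.executemany("INSERT INTO film_people VALUES (?, ?, ?, ?)", [
    (1, 1, 1, None),               # Person A directed Film One...
    (1, 1, 2, "Lead part"),        # ...and also starred in it
    (1, 2, 2, "Supporting part"),  # Person B starred in Film One
    (2, 2, 1, None),               # Person B directed Film Two
])

# Shared join across all four tables; each helper adds its own filter.
BASE = """
    FROM film_people fp
    JOIN films  f ON f.film_id   = fp.film_id
    JOIN people p ON p.person_id = fp.person_id
    JOIN roles  r ON r.role_id   = fp.role_id
"""


def credits_for(person):
    """Everything a given person has done, as (film title, role) pairs."""
    return conn.execute(
        "SELECT f.title, r.name" + BASE +
        "WHERE p.name = ? ORDER BY f.title, r.name", (person,)).fetchall()


def everyone_with_role(role):
    """Everyone who has ever held the given role on any film."""
    rows = conn.execute(
        "SELECT DISTINCT p.name" + BASE +
        "WHERE r.name = ? ORDER BY p.name", (role,)).fetchall()
    return [name for (name,) in rows]


def people_in(film):
    """All the people involved with one film: (name, role, role_details)."""
    return conn.execute(
        "SELECT p.name, r.name, fp.role_details" + BASE +
        "WHERE f.title = ? ORDER BY p.name, r.name", (film,)).fetchall()


print(credits_for("Person A"))
print(everyone_with_role("director"))
print(people_in("Film One"))
```

All three questions are answered by the same four-table join with a different WHERE clause, which is why this layout scales to new kinds of questions without schema changes.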
What follows is not actual MySQL code. It seems like what you need is more of a conceptual start here. So here's a model of what your database should look like.
Actor table
Director table
Genre table
Film table
Actor-film index table
For each actor in the film, you would add a row to the Actor-Film Index. So, if actors 5 and 13 (the primary keys for those actors) starred in film 4 (again, the primary key for that film), you'd have two rows reflecting that fact in your index: One with film id = 4, and actor id = 5, and another with film id = 4, and actor id = 13.
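Here is a small sketch of those index rows, plus one way a new film and its actor links could be inserted together, as a form handler might do. It uses Python's sqlite3 module rather than PHP and MySQL so that it is runnable on its own; the table names, column names, sample titles, and the add_film helper are hypothetical. In PHP, the newly generated film ID would typically come from something like PDO::lastInsertId() instead of cursor.lastrowid.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE film  (film_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE actor_film (
    film_id  INTEGER NOT NULL REFERENCES film(film_id),
    actor_id INTEGER NOT NULL REFERENCES actor(actor_id),
    PRIMARY KEY (film_id, actor_id)
);
""")

# The situation described above: actors 5 and 13 starred in film 4.
conn.execute("INSERT INTO film (film_id, title) VALUES (4, 'Some Film')")
conn.executemany("INSERT INTO actor (actor_id, name) VALUES (?, ?)",
                 [(5, "Actor Five"), (13, "Actor Thirteen")])
conn.executemany("INSERT INTO actor_film (film_id, actor_id) VALUES (?, ?)",
                 [(4, 5), (4, 13)])


def add_film(title, actor_ids):
    """Insert a film and its index rows in one transaction; return the new ID."""
    with conn:  # commits on success, rolls back if any insert fails
        cur = conn.execute("INSERT INTO film (title) VALUES (?)", (title,))
        film_id = cur.lastrowid  # ID the database generated for the new film
        conn.executemany(
            "INSERT INTO actor_film (film_id, actor_id) VALUES (?, ?)",
            [(film_id, actor_id) for actor_id in actor_ids])
    return film_id


def links():
    """All (film_id, actor_id) rows currently in the index."""
    return conn.execute(
        "SELECT film_id, actor_id FROM actor_film "
        "ORDER BY film_id, actor_id").fetchall()


new_id = add_film("Another Film", [5])
print(links())
```

The order matters: the film row goes in first so that its generated ID is known, and only then are the index rows written with that ID.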
Hope that helps.
Also, this assumes that each film has exactly one director. If any film in your library has two directors (such as Slumdog Millionaire), you'd want to separate out the director id from the film table, and create a Director-Film index like the Actor-Film Index as above.