In my database, many tables have the 'State' field, representing the state that, that particular entity falls in. I have been told that we should use Lookup tables for this kind of thing, but I am unsure of the exact mechanism. Can someone clarify these points?
How is the integrity maintained? (i.e. how do I make sure that only the values from the state table go into the other tables?)
Does the state name go into the other tables, or does the state id from the state table go into the other tables?
1 - Integrity is maintained using what is called a FOREIGN KEY constraint. A reasonable scenario might have you do these two tables:
This answers your question #2: The state codes, not the full names, go in the CUSTOMER table in this example.
A typical script to impose this kind of structure on an existing layout would be like this:
And this answers your question #1: That "REFERENCES" command will create a Foreign Key constraint that will force all entries in CUSTOMER.STATE_CODE to have a corresponding entry in the STATE_CODE table. After setting this up, if someone were to try this:
Then they would get an error message, and that faulty data would never get entered (unless, of course, you really did have a state with a code of 74837483748).
Answers:
Integrity is maintained by foreign key constraints.
A foreign key constraint ensures that the only values the child table will allow in the specified column, come from the parent table's designated column.
For sake of join/various database operations, the smallest data type possible is recommended because the performance will be better.
For example, INT takes 4 Bytes while VARCHAR2(4+) takes more than that. From a performance perspective, it will be faster if you use INT than VARCHAR2(4+). But you do want two columns - one to serve as primary key while the other is the human readable description. This approach allows you to change the description without impacting existing records.
This leads to a discussion about artificial/surrogate and natural keys, for what is best to use as a primary key (and ultimately a foreign key).