This question is obviously a homework question. I can't understand my professor and have no idea what he said during the lecture. I need to write step-by-step instructions to normalize the following table first into 1NF, then 2NF, then 3NF.
I appreciate any help and instruction.
Looking at the first two rows in your table,
and looking at which columns are tagged "PK" in that table,
and assuming that "PK" stands for "Primary Key",
and looking at the values that appear for those two columns in those two rows,
I would recommend that your professor get the hell out of database teaching and not come back until he has educated himself properly on the subject.
This exercise cannot be taken seriously because the problem statement itself contains hopelessly contradictory information.
(Observe that, as a consequence, there simply is no such thing as a "good" or "right" answer to this question!)
Okay, I hope I remember all of them correctly, let's start...
Rules
To make them very short (and not very precise, just to give you a first idea of what it's all about):
Instructions
Examples
NF1
A column "state" has values like "WA, Washington". NF1 is violated, because that's two values, abbreviation and name.
Solution: To fulfill NF1, create two columns, STATE_ABBREVIATION and STATE_NAME.
NF2
Imagine you've got a table with these 4 columns, expressing international names of car models:
COUNTRY_ID (numeric, primary key)
CAR_MODEL_ID (numeric, primary key)
COUNTRY_NAME (varchar)
CAR_MODEL_NAME (varchar)
The table may have these two data rows:
That says, model "Fox" is called "Fox" in USA, but the same car model is called "Polo" in Germany (don't remember if that's actually true).
NF2 is violated, because the country name does not depend on both car model ID and country ID, but only on the country ID.
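A partial dependency like this one can be checked mechanically. The following is only a rough sketch: the helper `determines` and the ID values in the sample rows are made up for illustration, and a sample of rows can only refute a dependency, never prove that it holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, the columns in lhs fix the value of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        # A second, different rhs value for the same lhs value breaks the dependency.
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


# Hypothetical sample rows; the ID values are assumptions, not from the exercise.
rows = [
    {"COUNTRY_ID": 1, "CAR_MODEL_ID": 5, "COUNTRY_NAME": "USA", "CAR_MODEL_NAME": "Fox"},
    {"COUNTRY_ID": 2, "CAR_MODEL_ID": 5, "COUNTRY_NAME": "Germany", "CAR_MODEL_NAME": "Polo"},
]

# COUNTRY_NAME is already fixed by COUNTRY_ID alone: a partial dependency (NF2 violation).
partial = determines(rows, ["COUNTRY_ID"], "COUNTRY_NAME")

# CAR_MODEL_NAME is not fixed by CAR_MODEL_ID alone, so it needs the whole key.
needs_whole_key = not determines(rows, ["CAR_MODEL_ID"], "CAR_MODEL_NAME")
```

Any non-key column for which `determines` succeeds with only part of the key is a candidate for its own table.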
Solution: To fulfill NF2, move COUNTRY_NAME into a separate table "COUNTRY" with columns COUNTRY_ID (primary key) and COUNTRY_NAME. To get a result set including the country name, you'll need to connect the two tables using a JOIN.
NF3
Say you've got a table with these columns, expressing climatic conditions of states:
STATE_ID (varchar, primary key)
CLIME_ID (foreign key, ID of a climate zone like "desert", "rainforest", etc.)
IS_MOSTLY_DRY (bool)
NF3 is violated, because IS_MOSTLY_DRY only depends on the CLIME_ID (let's at least assume that), but not on the STATE_ID (primary key).
Solution: To fulfill NF3, put the column IS_MOSTLY_DRY into the climate zone table.
Here are some thoughts regarding the actual table given in the exercise:
I apply the above-mentioned NF rules without challenging the primary key columns, but they actually don't make sense, as we will see later.
So if you remove all columns which violate NF2 or NF3, only the primary key remains (EMP_ID and DEPT_CD). That remaining part violates the given business rules: this structure would allow an employee to work in multiple departments at the same time.
Let's review it from a distance. Your data model is about employees, departments, skills and the relationships between these entities. If you normalize that, you'll end up with one table for the employees (containing DEPT_CD as a foreign key), one for the departments, one for the skills, and another one for the relationship between employees and skills, holding the "skill years" for each tuple of EMP_ID and SKILL_CD (my teacher would have called the latter an "associative entity").
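To make that four-table layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. Only EMP_ID, DEPT_CD and SKILL_CD come from the exercise; the table names, the other column names (EMP_NAME, DEPT_NAME, SKILL_NAME, SKILL_YEARS) and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DEPT_CD   TEXT PRIMARY KEY,
    DEPT_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_ID   INTEGER PRIMARY KEY,          -- one row, hence one department, per employee
    EMP_NAME TEXT NOT NULL,
    DEPT_CD  TEXT NOT NULL REFERENCES DEPARTMENT (DEPT_CD)
);
CREATE TABLE SKILL (
    SKILL_CD   TEXT PRIMARY KEY,
    SKILL_NAME TEXT NOT NULL
);
CREATE TABLE EMPLOYEE_SKILL (              -- the "associative entity"
    EMP_ID      INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_ID),
    SKILL_CD    TEXT NOT NULL REFERENCES SKILL (SKILL_CD),
    SKILL_YEARS INTEGER NOT NULL,
    PRIMARY KEY (EMP_ID, SKILL_CD)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO DEPARTMENT VALUES ('ENG', 'Engineering')")
conn.execute("INSERT INTO DEPARTMENT VALUES ('OPS', 'Operations')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Alice', 'ENG')")
conn.execute("INSERT INTO SKILL VALUES ('SQL', 'SQL')")
conn.execute("INSERT INTO EMPLOYEE_SKILL VALUES (1, 'SQL', 4)")

# Putting the same employee into a second department is rejected by the primary key.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Alice', 'OPS')")
    second_department_allowed = True
except sqlite3.IntegrityError:
    second_department_allowed = False

# The original wide row is recovered with JOINs.
rows = conn.execute("""
    SELECT e.EMP_NAME, d.DEPT_NAME, s.SKILL_NAME, es.SKILL_YEARS
    FROM EMPLOYEE e
    JOIN DEPARTMENT d      ON d.DEPT_CD = e.DEPT_CD
    JOIN EMPLOYEE_SKILL es ON es.EMP_ID = e.EMP_ID
    JOIN SKILL s           ON s.SKILL_CD = es.SKILL_CD
""").fetchall()
```

Because EMP_ID alone is the key of EMPLOYEE, the "one department per employee" rule is enforced by the schema itself rather than by convention.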
A table is in 3NF only if it is in 2NF and has no transitive dependencies, i.e. every non-key attribute depends directly on the primary key.
Transitive dependency: for R = (A, B, C), if A -> B and B -> C, then A -> C.
Another oversimplified answer coming up.
In a 3NF relational table, every nonkey value is determined by the key, the whole key, and nothing but the key (so help me Codd ;)).
1NF: The key. This means that if you specify the key value and a named column, there will be at most one value at the intersection of the row and the column. A multivalue, like a series of values separated by commas, is disallowed, because you can't get directly to the value with just a key and a column name.
2NF: The whole key. If a column that is not part of the key is determined by a proper subset of the key columns, then 2NF is being violated.
3NF: And nothing but the key. If a column is determined by some set of non-key columns, then 3NF is being violated.
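As a small illustration of the 1NF point about comma-separated multivalues, the sketch below flattens such a column into one row per single value. The function name `to_1nf`, the column names and the data are hypothetical.

```python
def to_1nf(rows, key, multi_col, sep=","):
    """Return one row per (key, single value) instead of one separated list per key."""
    flat = []
    for row in rows:
        for value in row[multi_col].split(sep):
            flat.append({key: row[key], multi_col: value.strip()})
    return flat


# Hypothetical data: SKILLS holds a comma-separated multivalue.
rows = [
    {"EMP_ID": 1, "SKILLS": "SQL, Java"},
    {"EMP_ID": 2, "SKILLS": "Python"},
]

flat = to_1nf(rows, "EMP_ID", "SKILLS")
```

After flattening, EMP_ID alone no longer identifies a row; the key becomes the pair (EMP_ID, SKILLS), which is exactly where the 2NF question about "the whole key" starts to matter.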