I'm learning about databases and obviously I have to deal with normalforms. Now I came up with this very simple example; given a relation R with attributes {A, B, C} and functional dependencies {A,B -> C , A -> C}.
The candidate key K for this relation has to be {A, B} (not going into how to find candidate keys). The relation is not in 2NF since the non-key attribute C only depends on A which is a proper subset of K. (I assume 1NF is given even though I can not know the domains of the attributes).
Now to get to 3NF I'll have to use the synthesis-algorithm, so I first find the canonical set of the functional dependencies which would be {A -> C} (also not going into how to find the canonical set). Now to get relations in 3NF I form the new relation R1 which contains attributes {A,C}. Since K is not contained in R1 I have to make a new relation R2 which contains one of the candidate keys (here K).
This leads to the two relations R1 (A,C) and R2 (A,B) and I am done since both R1 and R2 are in 3NF.
Is my working correct? Is there anything else I have to be aware of? Thanks a lot for any suggestions!
EDIT: As a comment pointed out my example is faulty. It would be right with a slightly different relation though, namely R (A,B,C,D) with FD's {A,B -> C, B -> D} .. I'll not go through the rest again since I think I did the algorithm correctly even though the example was false.
Be explicit that you are addressing non-trivial FDs. R also has FD A,B,C -> C, but it's trivial. Always be explicit about just what FDs you are giving. Eg a canonical/minimal cover or all the non-trivial FDs or a cover or just some FDs that you know hold although other non-trivial ones might too, whichever is the case. You probably aren't ever doing the latter, because in general you won't be giving enough information to determine CKs and further normalize.
Drop the "only". It makes the statement unclear and if you mean C doesn't depend on A,B then you are mistaken.
The domains are irrelevant. I suppose you are worried that domains might involve "repeating groups" and/or "non-atomic values". This is based on received non-wisdom. Normalization to higher normal forms is independent of domains.
By definition, a relation's tuple's attribute has a value from a domain. Re: "repeating groups": It can't have any, that's something from pre-relational databases. Re "non-atomic": Codd defined relations as able to have relation-valued domains. He pointed out that the only way that a value could be considered (in the everyday sense) non-atomic in a relational context was to be relation-valued. Ie he defined "atomic" in a relational context to mean not a relation. He defined "normalized" to mean having no relation-valued (ie non-atomic) attributes. (All this in 1970.) Later he defined "1NF" as normalized. And developed "2NF" & "3NF". Then (after Kent & with Boyce) "BCNF". So his use of these terms assumed no relation-valued domains.
But normalization theory is presented independent of domains. Ie it's considered to just be decomposition per problematic JDs. So "1NF" also gets used for just being a relation. And the other "NFs" get used without regard to domains. (Although if there are relation-valued domains then there can be constraints different from but similar to FDs & JDs that cause different but similar anomalies, and that cause constraints and anomalies in components even after decomposition per all problematic JDs.) Regardless of whether a relation has relation-valued domains and regardless of what one means by "1NF" or "normalized" or "normalization", the decomposition procedure you're following to remove problematic JDs from problematic FDs to what you're calling 3NF is independent of domains.
That comment:
The comment is wrong. C depends on A and B and it depends on A only. Because A -> C implies A,B -> C.