Can any kind soul clarify my doubts with a simple example below and identify the superkey, candidate key and primary key?
I know there are a lot of posts and websites out there explaining the differences between them. But it looks like all are generic definitions.
Example:
Student (StudentNumber, FamilyName, Degree, Major, Grade, PhoneNumber)
So from the above example, I can tell that StudentNumber is a primary key.
But as for superkey, I'm a bit confused about which combinations of attributes could be grouped into a superkey.
As for candidate key, I'm confused by the definition, which says that any candidate key can qualify as a primary key.
Does it mean that an attribute such as PhoneNumber is a candidate key and can be a primary key? (Assuming that a PhoneNumber belongs to only one student.)
Thanks for any clarification!
A superkey is any set of attributes for which the values are guaranteed to be unique for all possible sets of tuples in a table at all times.
A candidate key is a "minimal" superkey - meaning a superkey with no redundant attributes: no proper subset of it is itself a superkey. Removing any attribute from a candidate key would therefore make it non-unique.
A primary key is just a candidate key. There is no difference between a primary key and any other candidate key.
It's not really useful to make assumptions about keys based only on a list of attribute names. You need to know what dependencies are supposed to hold between the attributes. Having said that, my guess is that you are right - StudentNumber is likely a candidate key in your example.
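As a rough illustration of the definitions in this answer, here is a minimal Python sketch that tests uniqueness and minimality against sample data. The rows, the names `ATTRS` and `ROWS`, and the helper functions are all made up for illustration. Note that a real key must be unique for every possible table state, so a check against one sample can only refute a key, never prove one.

```python
from itertools import combinations

# Hypothetical sample rows for the Student relation; the values are invented.
ATTRS = ("StudentNumber", "FamilyName", "Degree", "Major", "Grade", "PhoneNumber")
ROWS = [
    (1001, "Smith", "BSc", "Physics", "A", "555-0101"),
    (1002, "Smith", "BSc", "Physics", "B", "555-0102"),
    (1003, "Jones", "BA", "History", "A", "555-0103"),
]

def is_unique(rows, attrs, subset):
    """True if no two rows agree on every attribute in `subset`."""
    idx = [attrs.index(a) for a in subset]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

def is_minimal(rows, attrs, subset):
    """True if `subset` is unique in `rows` and no proper subset of it is."""
    if not is_unique(rows, attrs, subset):
        return False
    return not any(
        is_unique(rows, attrs, smaller)
        for size in range(len(subset))
        for smaller in combinations(subset, size)
    )
```

In this sample, `("StudentNumber", "FamilyName")` is unique but not minimal, which is the superkey versus candidate key distinction. The pair `("FamilyName", "Grade")` also happens to be unique and minimal here, purely by accident of the data, which is why attribute names or sample rows alone cannot tell you what the keys are.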
Since you don't want textbook definitions, loosely speaking, a super key is a set of columns that uniquely identifies a row.
This set can have one or more elements, and there can be more than one super key for a table. You usually work out which sets qualify from the functional dependencies between the columns.
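In your example, I'm assuming that StudentNumber is unique and that no other attribute is.
In this case, a superkey is any combination that contains the student number.
So, for example, the following are superkeys: {StudentNumber}, {StudentNumber, FamilyName}, {StudentNumber, Degree, Major}, and the set of all six attributes.
Now, if we assume PhoneNumber is also unique (who shares phones these days?), then any combination that contains PhoneNumber is a superkey as well, e.g. {PhoneNumber} and {PhoneNumber, FamilyName} (in addition to what I've listed above).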
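A candidate key is simply a "shortest" superkey, i.e. one you can't drop any column from without losing uniqueness. Going back to the 1st list of superkeys (i.e. phone number isn't unique), the shortest superkey is StudentNumber.
The primary key is usually just one of the candidate keys, picked by whoever designs the table.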
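To make the "work it out from functional dependencies" idea concrete, here is a small Python sketch that enumerates superkeys and candidate keys by brute force using attribute closure. The dependency sets `FDS_BASE` and `FDS_PHONE` are my assumptions about the Student example (StudentNumber determines everything; optionally PhoneNumber does too), and the function names are invented for illustration.

```python
from itertools import combinations

ATTRS = frozenset(
    ["StudentNumber", "FamilyName", "Degree", "Major", "Grade", "PhoneNumber"]
)

def closure(attrs, fds):
    """All attributes determined by `attrs` under the dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is covered and rhs adds something.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def superkeys(all_attrs, fds):
    """Every non-empty attribute set whose closure is the whole relation."""
    ordered = sorted(all_attrs)
    return [
        frozenset(combo)
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
        if closure(combo, fds) == all_attrs
    ]

def candidate_keys(all_attrs, fds):
    """Superkeys that contain no smaller superkey."""
    keys = superkeys(all_attrs, fds)
    return [k for k in keys if not any(other < k for other in keys)]

# Assumption: StudentNumber determines every other attribute.
FDS_BASE = [(frozenset(["StudentNumber"]), ATTRS)]
# Extra assumption: PhoneNumber is unique as well.
FDS_PHONE = FDS_BASE + [(frozenset(["PhoneNumber"]), ATTRS)]
```

Under `FDS_BASE` this yields 32 superkeys (every subset containing StudentNumber) and a single candidate key; under `FDS_PHONE` it yields 48 superkeys and two candidate keys, {StudentNumber} and {PhoneNumber}. Brute force is fine for six attributes but grows exponentially with the number of columns.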
Stretching Cambium's answer: if PhoneNumber is also unique, along with StudentNumber, then the candidate keys would be {StudentNumber} and {PhoneNumber}.
Here we can't treat {StudentNumber, PhoneNumber} as a single candidate key, because if we omit one attribute, say StudentNumber, we still get a unique attribute, {PhoneNumber}, thus violating the definition of a candidate key.
Primary key:
Choose one candidate key out of all the candidate keys. There are 2 candidate keys, so we can choose {StudentNumber} as the primary key.
Alternate keys:
The leftover candidate keys, after choosing the primary key from the candidate keys, are alternate keys, i.e. {PhoneNumber}.
Compound key:
A compound key is a key that consists of two or more attributes that uniquely identify an entity occurrence. A simple key is one that has only one attribute. Compound keys may be composed of other unique simple keys and non-key attributes, but may not include another compound key.
Composite key:
A composite key contains at least one compound key and one more attribute. Composite keys may also include simple keys and non-key attributes.
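To tie the primary key and alternate key choice above to an actual table, here is a small sketch using Python's built-in `sqlite3` module. It assumes, as this answer does, that PhoneNumber is unique; the sample values and the `try_insert` helper are invented for illustration. The chosen primary key is declared with PRIMARY KEY, and the alternate key with NOT NULL UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Student (
        StudentNumber INTEGER PRIMARY KEY,   -- the candidate key chosen as primary key
        FamilyName    TEXT,
        Degree        TEXT,
        Major         TEXT,
        Grade         TEXT,
        PhoneNumber   TEXT NOT NULL UNIQUE   -- alternate key (assumed unique)
    )
    """
)
conn.execute(
    "INSERT INTO Student VALUES (1001, 'Smith', 'BSc', 'Physics', 'A', '555-0101')"
)

def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False
```

A row that repeats StudentNumber 1001 or phone number '555-0101' is rejected, while a row with fresh values for both is accepted. Either column could have been declared the primary key; the database enforces uniqueness on both.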