What Are Super Keys, Candidate Keys, and Primary Keys?

1. Super Key

A super key is an attribute or a set of attributes within a table that uniquely identifies each row in that table. A super key can consist of a single column or multiple columns combined. The defining characteristic of a super key is its ability to ensure that no two rows have the same set of values for those columns.

It’s worth noting that a super key may contain additional attributes that are not necessary for uniqueness. In essence, any set of attributes that uniquely identifies a row is a super key. That includes the primary key, every candidate key, and any of those keys with extra columns added.
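The uniqueness test described above can be sketched in a few lines of Python. This is only an illustration: the `is_super_key` helper and the `employees` rows are invented for this example, and checking sample rows shows uniqueness for those rows only, not for every row the table could ever hold.

```python
from typing import Sequence


def is_super_key(rows: Sequence[dict], columns: Sequence[str]) -> bool:
    """Return True if no two rows share the same values for `columns`."""
    seen = set()
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values in seen:
            return False  # two rows collide on these columns
        seen.add(values)
    return True


# Hypothetical sample rows, invented for illustration.
employees = [
    {"emp_id": 1, "email": "ana@example.com", "dept": "Sales"},
    {"emp_id": 2, "email": "ben@example.com", "dept": "Sales"},
    {"emp_id": 3, "email": "cho@example.com", "dept": "IT"},
]

print(is_super_key(employees, ["emp_id"]))          # True
print(is_super_key(employees, ["emp_id", "dept"]))  # True: the extra column does no harm
print(is_super_key(employees, ["dept"]))            # False: "Sales" appears twice
```

The second call shows why a super key may carry unnecessary attributes: adding `dept` to `emp_id` keeps the combination unique even though `dept` contributes nothing.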

2. Candidate Key

A candidate key is a minimal super key: a set of attributes that uniquely identifies a tuple (row) in a table and contains no extraneous attributes. In other words, it is a super key from which no attribute can be removed without losing the property of uniqueness.

Every table can have one or more candidate keys, and each candidate key can potentially serve as the table’s primary key. The minimality aspect is crucial; every attribute in a candidate key must be necessary to maintain that key’s uniqueness.
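One way to picture minimality is to search column combinations from smallest to largest and keep only the unique ones that do not contain a smaller key. The sketch below does that for a handful of invented rows; the `candidate_keys` and `is_unique` helpers and the `employees` data are assumptions made for illustration, and in a real design candidate keys follow from the rules of the data rather than from whatever rows happen to be present.

```python
from itertools import combinations


def is_unique(rows, cols):
    """True if the rows have no duplicate value combination for `cols`."""
    seen = {tuple(row[c] for c in cols) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, columns):
    """Return the minimal column sets that are unique across `rows`."""
    found = []
    for size in range(1, len(columns) + 1):  # smallest sets first
        for combo in combinations(columns, size):
            # A superset of a key already found is a super key, but not minimal.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Hypothetical sample rows, invented for illustration.
employees = [
    {"emp_id": 1, "email": "ana@example.com", "dept": "Sales", "floor": 1},
    {"emp_id": 2, "email": "ben@example.com", "dept": "Sales", "floor": 1},
    {"emp_id": 3, "email": "cho@example.com", "dept": "IT", "floor": 2},
]

print(candidate_keys(employees, ["emp_id", "email", "dept", "floor"]))
# [('emp_id',), ('email',)]
```

Combinations such as `("emp_id", "dept")` are skipped because they already contain a smaller key, which is exactly the difference between a super key and a candidate key.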

3. Primary Key

Among the candidate keys, one is chosen to be the primary key of the table. The primary key serves as the main means of identifying each row uniquely in a table. Once set, the primary key constraint enforces the uniqueness of its values and ensures that no null value is inserted into the primary key column(s), thereby maintaining the integrity of the database.

In relational database design, the primary key is crucial for establishing relationships between tables, as it’s often used as a reference in other tables (foreign keys) to link related data across the database.
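To make the link between primary and foreign keys concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `departments` and `employees` tables are hypothetical, and the `PRAGMA foreign_keys = ON` line is specific to SQLite, which leaves foreign-key checks off unless asked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical tables: each employee row points at a department's primary key.
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES departments (dept_id)
    )
""")

conn.execute("INSERT INTO departments VALUES (10, 'Sales')")
conn.execute("INSERT INTO employees VALUES (1, 'Ana', 10)")  # department 10 exists

try:
    conn.execute("INSERT INTO employees VALUES (2, 'Ben', 99)")  # there is no department 99
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

print(orphan_allowed)  # False when foreign-key enforcement is active
```

Because `dept_id` in `employees` must match an existing primary key value in `departments`, the database refuses a row that would point at nothing.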

Examples

Let’s look at an example using a Students table in a school database:

Table: Students

StudentID   PassportNumber   LastName   FirstName   BirthDate
101         AA123456         Smith      John        2001-04-03
102         BB987654         Doe        Jane        2002-05-15
103         CC456789         Johnson    Emily       2003-06-18

Now, let’s define and give examples of a super key, a candidate key, and a primary key using this table.

1. Super Key

A super key is any combination of columns that can uniquely identify rows in a table. It could be a single column or a combination of multiple columns.

Examples:

  • StudentID alone can uniquely identify each student, so it’s a super key.
  • PassportNumber is also unique for each student, making it another super key.
  • A combination like LastName + FirstName + BirthDate is unique in these three rows, but two students could share a name and a birth date, so it counts as a super key only if the school can guarantee that never happens.
  • Essentially, any combination that includes StudentID or PassportNumber will be a super key because these columns provide uniqueness by themselves.
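The bullets above can be checked against the sample rows with a short sketch. The `is_unique` helper and the fourth "namesake" row are invented for illustration; the other three rows come from the Students table.

```python
from itertools import combinations

COLUMNS = ["StudentID", "PassportNumber", "LastName", "FirstName", "BirthDate"]
students = [
    (101, "AA123456", "Smith", "John", "2001-04-03"),
    (102, "BB987654", "Doe", "Jane", "2002-05-15"),
    (103, "CC456789", "Johnson", "Emily", "2003-06-18"),
]


def is_unique(rows, cols):
    """True if no two rows share the same values for the named columns."""
    idx = [COLUMNS.index(c) for c in cols]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)


# Every non-empty column combination that contains StudentID or PassportNumber.
guaranteed = [
    combo
    for size in range(1, len(COLUMNS) + 1)
    for combo in combinations(COLUMNS, size)
    if "StudentID" in combo or "PassportNumber" in combo
]
print(len(guaranteed))  # 24 of the 31 possible combinations
print(all(is_unique(students, combo) for combo in guaranteed))  # True

# Hypothetical fourth student who shares a name and birth date with student 101.
with_namesake = students + [(104, "DD000001", "Smith", "John", "2001-04-03")]
print(is_unique(with_namesake, ["LastName", "FirstName", "BirthDate"]))  # False
print(all(is_unique(with_namesake, combo) for combo in guaranteed))      # True
```

The name and birth date combination stops being unique as soon as a namesake enrolls, while every combination built around StudentID or PassportNumber is unaffected.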

2. Candidate Key

A candidate key is a minimal super key, which means it has no unnecessary attributes; every part of it is needed to ensure uniqueness.

Examples:

  • StudentID is a candidate key because it is minimal (it consists of a single column) and uniquely identifies each student.
  • PassportNumber is another example of a candidate key for the same reasons as StudentID.

3. Primary Key

A primary key is a candidate key that has been chosen to uniquely identify rows in a table. It cannot contain NULL values, and each value must be unique.

Example:

  • If we choose StudentID as the primary key, this means that StudentID will be used to uniquely identify each row in the Students table. It must be unique for each student and cannot be NULL.
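As a rough sketch of how this choice might look in practice, the snippet below declares the Students table in SQLite through Python's sqlite3 module, with StudentID as the primary key and PassportNumber kept unique as the candidate key that was not chosen. The `try_insert` helper and the extra rows are invented for illustration. The `WITHOUT ROWID` clause is a SQLite-specific detail: in an ordinary SQLite table, an INTEGER primary key column quietly replaces a NULL with an auto-generated number instead of rejecting it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (
        StudentID      INTEGER NOT NULL,
        PassportNumber TEXT    NOT NULL UNIQUE,  -- candidate key not chosen as primary
        LastName       TEXT,
        FirstName      TEXT,
        BirthDate      TEXT,
        PRIMARY KEY (StudentID)
    ) WITHOUT ROWID
""")

with conn:  # commit the three rows from the example table
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
        [
            (101, "AA123456", "Smith", "John", "2001-04-03"),
            (102, "BB987654", "Doe", "Jane", "2002-05-15"),
            (103, "CC456789", "Johnson", "Emily", "2003-06-18"),
        ],
    )


def try_insert(row):
    """Attempt an insert and report whether the constraints accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Hypothetical extra rows, invented to exercise each constraint.
print(try_insert((101, "ZZ000000", "Lee", "Ann", "2001-01-01")))   # duplicate StudentID
print(try_insert((None, "YY111111", "Kim", "Bo", "2002-02-02")))   # NULL StudentID
print(try_insert((104, "AA123456", "Ray", "Cy", "2003-03-03")))    # duplicate PassportNumber
print(try_insert((104, "DD222222", "Ray", "Cy", "2003-03-03")))    # new ID and passport
```

The first three inserts are rejected and the last is accepted, which mirrors the rule that primary key values must be unique and non-NULL while the remaining candidate key can still be protected with a UNIQUE constraint.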

In summary, StudentID and PassportNumber are both candidate keys because each one, on its own, ensures the uniqueness of every row in the Students table. Only one of them, StudentID in this example, is selected as the primary key. The primary key is simply the candidate key that the database designer chooses, based on factors such as simplicity, readability, or the nature of the data.