I haven’t told you everything about the relational model yet. In fact, I left out something important:
Generally, this constraint shouldn't pose a problem, even if we want to represent two identical objects. In our apple example, it’s quite possible for two different apples to have exactly the same characteristics:
Identifier | Weight | Diameter | Color |
1 | 151 g | 8.3 cm | red |
2 | 169 g | 9.1 cm | yellow |
3 | 134 g | 8.0 cm | yellow |
4 | 134 g | 8.0 cm | yellow |
Here, apples 3 and 4 share the same characteristics, but the corresponding rows are not identical, because the value of their identifier attribute is different: 3
and 4
.
This brings us to how to identify the tuples of a relation. And that’s where keys come in!
Understand the importance of keys
In your everyday life, you use keys to open the doors and gain access to your house. An essential characteristic of a key is its uniqueness. Indeed, if your front door key is identical to that of your neighbor’s, then you might have a problem...
In the relational model, the purpose of a key is also to provide access—to a tuple, and thereby to identify this tuple.
In our apple example, the identifier attribute is a key. If I ask you:
Can you please give me the apple whose identifier is "2"?
… you know exactly which row to give me, because each apple’s identifier is unique!
However, if I ask you:
Give me the apple that weighs 169 grams!
… you could give me apple number 2, but if I were to add another apple to my relation that also weighs 169 grams, you would no longer know which apple to give me. The weight attribute is therefore not a key.
Identify a key
Let’s hone our skills at spotting keys with a relation we’ll call vehicle, representing vehicles owned by people of French nationality (this will become important later) :
registration | country_registration | color | number_seats | owner | own_num_secu | own_num_vehicule |
2983-AA | France | blue | 5 | Luke | 1134212 | 1 |
1923-BD | France | green | 7 | Lucia | 2021726 | 1 |
PLT-28-190 | Portugal | blue | 7 | John | 1012612 | 1 |
2383-ZN | France | black | 9 | Leïla | 2125312 | 2 |
1209-NQ | France | black | 5 | Leïla | 2125312 | 1 |
2634-OS | France | blue | 5 | John | 1162732 | 1 |
Is the [ registration ] attribute group a key for our relation?
No! Even though in France vehicle registration numbers are unique, our relation is obviously meant to contain vehicles from more than one country. It is possible for another country to use (now or in the future) the same registration system as France. In that case, there could be two vehicles in the world that have the same registration number!
Is the [ registration, country_registration ] attribute set a key for our relation?
Yes! Provided that a registration number cannot be reassigned to another vehicle after it is destroyed.
What about the attribute set [ registration, country_registration, color ] ?
Yes and No. As we just saw, the [ registration , country_registration ] attribute set is already a key. This implies that two rows cannot have the same value for their registration and country_registration attributes.
As a result, two rows cannot have the same values for the above three attributes either!
However, it is important to remember the definition of a key, as defined above: a key has to be the minimum set of attributes. Here, our set of three attributes is not the minimum. If we remove the color attribute, then we are left with [ registration, country_registration ], which is already a key (as we saw in the previous example—I hope you were following along ).
Color is therefore not needed here.
What about the attribute set [ own_num_secu ]?
No … well, okay, if we had a relation that represented people of French nationality, the social security number would be a key. But that’s not the case: our relation represents vehicles, not people. Two vehicles can have the same owner, and can therefore have the same value for the own_num_secu attribute.
How about the set [ own_num_secu, own_num_vehicle ]?
Yes, this is a key. If two vehicles have the same owner identified by own_num_secu, they can’t have the same owner AND the same own_num_vehicle at the same time.
Summary
A key is a set of attributes (= a group of columns).
For a set of attributes to be a key, two tuples (= two rows) must never have identical values for their attributes.
The set of attributes must be the minimum—that is, if any one of the attributes is removed from the set, it is no longer a complete key.