Last updated on 6/7/23

Now we come to a fundamental concept of relational algebra that you use all the time: the Join.

Let’s go back to the concept of foreign keys, which are used to link relations. We know that the relations are linked, but we don’t yet know how to use these links. That’s what we’re going to see now!

Let’s return to the example of our two relations apple and variety.

 identifier weight diameter color name_variety 1 151 g 8.3 cm Red Red Delicious 2 169 g 9.1 cm Green McIntosh 3 134 g 8.0 cm Green McIntosh
 label price_per_kilo maturity taste Red Delicious 3.19 late Sept. Sweet Macintosh 3.49 mid. Sept. Tart

In the chapter Create Links between Your Relations using Foreign Keys we wanted to know the price per kilo of a given apple, and so we wrote the following sentence:

If I want to know the price per kilo of a given apple, I must first know its variety. Then I’ll need to locate the corresponding row in the variety table in order to know its price.

More formally, we say that we perform an inner join of the apple relation and the variety relation according to the condition  `apple.name_variety = variety.label`  .

And yes, we must always specify the join condition. Here, the condition states that the name_variety column of apple references the label column of variety

Here are the results:

 apple.identifier apple.weight apple.diameter apple.color apple.name_variety variety.label variety.price_per_kilo variety.maturity variety.taste 1 151 g 8.3 cm red Red Delicious Red Delicious 3.19 late Sept. Sweet 2 169 g 9.1 cm green McIntosh McIntosh 3.49 mid. Sept. Tart 3 134 g 8.0 cm green McIntosh McIntosh 3.49 mid. Sept. Tart

We have achieved our objective. Knowing my apple’s identifier gives me direct access to its price per kilo: it’s on the same row!

##### Joining multiple columns

You will recall that a primary key can consist of multiple columns. You should also know that we can join more than one column (let’s say two columns, for example).

This being the case, a foreign key that references this primary key will necessarily consist of two columns. The join condition will be as follows:

`relation1.attribute1 = relation2.attribute3 AND relation1.attribute2 = relation2.attribute4`

 id [PK] attribute1 attribute2 [...] 0 398 AX [...] [...] [...] [...] [...]
 attribute 3 [PK] attribute 4 [PK] [...] 398 AX [...] [...] [...] [...]

What happens if we don’t know the variety of apple 3? In this case, the foreign key will contain a null value, and will, therefore, have no corresponding value in the variety table.

 Identifier weight diameter color name_variety 1 151g 8.3 cm red Red Delicious 2 169g 9.1 cm green Braeburn 3 134g 8.0 cm green NULL

Our previous join condition  `apple.name_variety = variety.label`  will no longer be met, because the  `apple.name_variety`  cell is null.

##### Left outer joins

If we also want to retain apples whose varieties are not known to us, we have to apply an outer join. If the apple table is on the left, and the variety table is on the right, we perform a left outer join, because we are retaining all of the rows of the table on the left:

 apple.Identifier apple.Weight apple.diameter apple.Color apple.name_variety variety.label variety.price_per_kilo variety.Maturity variety.Taste 1 151g 8.3 cm red Red Delicious Red delicious 3.19 late Sept. Sweet 2 151g 9.1 cm green Braeburn Braeburn 3.49 mid. Sept. Tart 3 134g 8.0 cm green NULL NULL NULL NULL NULL
##### Right outer joins

In the same way, if there were no apples of the Gala variety, but we still wanted to retain the information about this variety after the join, we would need to apply a right outer join. This way, we would retain all of the rows of the table on the right (variety), even if some of the varieties had no corresponding value in the apple table:

 apple.identifier apple.Weight apple.Diameter apple.Color apple.name_variety variety.label variety.price_per_kilo variety.Maturity variety.Taste 1 151 g 8.3 cm Red Red Delicious Red Delicious 3.19 late Sept. Sweet 2 169 g 9.1 cm Green Braeburn Braeburn 3.49 mid. Oct. Sweet/Tart NULL NULL NULL NULL NULL Gala 3.19 mid. Sept. Sweet NULL NULL NULL NULL NULL McIntosh 2.99 mid. Sept. Tart

Notice that the results of the right outer join retain all the records from the right table (variety) and all matching records from the left table (apple).

##### Full joins

And if we want to retain all of the information in both apple and variety, I have a third type of outer join on hand: the full join! Here’s what it looks like:

 apple.identifier apple.weight apple.diameter apple.color apple.name_variety variety.label variety.price_per_kilo variety.maturity Variety.Taste 1 151 g 8.3 cm red Red Delicious Red Delicious 3.19 late September/early October Sweet 2 169 g 9.1 cm green Braeburn Braeburn 3.49 mid September Sweet/Tart 3 134 g 8.0 cm green NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL Gala 3.19 mid September Sweet NULL NULL NULL NULL NULL Macintosh 2.99 mid September Tart

Here is the notation I recommend for the inner join:

`Join (apple, variety, apple.name_variety = variety.label)`

For right, left, and full outer joins:

`JoinLeft (apple, variety, apple.name_variety = variety.label)`

`JoinRight (apple, variety, apple.name_variety = variety.label)`

`JoinFull (apple, variety, apple.name_variety = variety.label)`

And now, a little trick that’s actually very useful: the natural join.

A natural join is a regular join. If the two relations being joined have columns that appear in both relations (exactly the same name in both tables), the join can be implicit.

Thus, if I change the name of the “label” column in the variety table to name_variety, both tables will have this column in common. If we want to join all of the columns that have the same name in both relations (here there is only one such column), we can use the natural join.

With the natural join, we no longer need to specify a condition, because it is implicit—the columns have the same name!

So...

`Join (apple, variety, apple.name_variety = variety.name_variety)`

is equivalent to

`JoinNatural (apple, variety)`.

• A join "combines" two tables based on a specified condition

• An inner join doesn’t necessarily retain all of the information of the two tables being joined. It only includes matching rows between tables.

• In order to retain all of the information of one (or both) of the tables, you need to apply an outer join.

Ever considered an OpenClassrooms diploma?
• Up to 100% of your training program funded
• Flexible start date
• Career-focused projects
• Individual mentoring
Find the training program and funding option that suits you best