• 20 hours
  • Medium

Free online content available in this course.

course.header.alt.is_video

course.header.alt.is_certifying

Got it!

Last updated on 8/29/24

Link relations using joins

Now we come to a fundamental concept of relational algebra that you use all the time: the Join.

Let’s go back to the concept of foreign keys, which are used to link relations. We know that the relations are linked, but we don’t yet know how to use these links. That’s what we’re going to see now!   :D

Let’s return to the example of our two relations apple and variety.

apple

identifier

weight

diameter

color

name_variety

1

151 g

8.3 cm

Red

Red Delicious

2

169 g

9.1 cm

Green

McIntosh

3

134 g

8.0 cm

Green

McIntosh

variety

label

price_per_kilo

maturity

taste

Red Delicious

3.19

 late Sept.

 Sweet

Macintosh

3.49

 mid. Sept.

 Tart

In the chapter Create Links between Your Relations using Foreign Keys we wanted to know the price per kilo of a given apple, and so we wrote the following sentence:

If I want to know the price per kilo of a given apple, I must first know its variety. Then I’ll need to locate the corresponding row in the variety table in order to know its price.

Link relations using Inner Joins

More formally, we say that we perform an inner join of the apple relation and the variety relation according to the condition  apple.name_variety = variety.label  .

And yes, we must always specify the join condition. Here, the condition states that the name_variety column of apple references the label column of variety

Here are the results:

Result of the join

apple.
identifier

apple.
weight

apple.
diameter

apple.
color

apple.
name_variety

variety.
label

variety.
price_per_kilo

variety.
maturity

variety.
taste 

1

151 g

8.3 cm

red

Red Delicious

Red Delicious

3.19

late Sept.

Sweet

2

169 g

9.1 cm

green

McIntosh

McIntosh

3.49

mid. Sept.

Tart

3

134 g

8.0 cm

green

McIntosh

McIntosh

3.49

mid. Sept.

Tart

We have achieved our objective. Knowing my apple’s identifier gives me direct access to its price per kilo: it’s on the same row!

Joining multiple columns

You will recall that a primary key can consist of multiple columns. You should also know that we can join more than one column (let’s say two columns, for example).

This being the case, a foreign key that references this primary key will necessarily consist of two columns. The join condition will be as follows:

relation1.attribute1 = relation2.attribute3 AND relation1.attribute2 = relation2.attribute4

Relation 1

id [PK]

attribute1

attribute2

[...]

0

398

AX

[...]

[...]

[...]

[...]

[...]

Relation 2

attribute 3 [PK]

attribute 4 [PK]

[...]

398

AX

[...]

[...]

[...]

[...]

Link relations using outer joins

What happens if we don’t know the variety of apple 3? In this case, the foreign key will contain a null value, and will, therefore, have no corresponding value in the variety table.

Apple

Identifier

weight

diameter

color

name_variety

1

151g

8.3 cm

red

Red Delicious

2

169g

9.1 cm

green

Braeburn

3

134g

8.0 cm

green

NULL

Our previous join condition  apple.name_variety = variety.label  will no longer be met, because the  apple.name_variety  cell is null.

Left outer joins

If we also want to retain apples whose varieties are not known to us, we have to apply an outer join. If the apple table is on the left, and the variety table is on the right, we perform a left outer join, because we are retaining all of the rows of the table on the left: 

Result of a left outer join

apple.
Identifier

apple.
Weight

apple.
diameter

apple.
Color

apple.
name_variety

variety.
label

variety.
price_per_kilo

variety.
Maturity

variety.
Taste

1

151g

8.3 cm

red

Red Delicious

Red delicious

3.19

late Sept.

Sweet

2

151g

9.1 cm

green

Braeburn

Braeburn

3.49

mid. Sept.

Tart

3

134g

8.0 cm

green

NULL

NULL

NULL

NULL

NULL

Right outer joins

In the same way, if there were no apples of the Gala variety, but we still wanted to retain the information about this variety after the join, we would need to apply a right outer join. This way, we would retain all of the rows of the table on the right (variety), even if some of the varieties had no corresponding value in the apple table:

Result of a right outer join

apple.
identifier

apple.
Weight

apple.
Diameter

apple.
Color

apple.
name_variety

variety.
label

variety.
price_per_kilo

variety.
Maturity

variety.
Taste

1

151 g

8.3 cm

Red 

Red Delicious

Red Delicious

3.19

late Sept.

Sweet

2

169 g

9.1 cm

Green

Braeburn

Braeburn

3.49

mid. Oct.

Sweet/Tart

NULL

NULL

NULL

NULL

NULL

Gala

3.19

mid. Sept.

Sweet

NULL

NULL

NULL

NULL

NULL

McIntosh

2.99

mid. Sept.

Tart

Notice that the results of the right outer join retain all the records from the right table (variety) and all matching records from the left table (apple).

Full joins

And if we want to retain all of the information in both apple and variety, I have a third type of outer join on hand: the full join! Here’s what it looks like:

Results of a full outer join

apple.
identifier

apple.
weight

apple.
diameter

apple.
color

apple.
name_variety

variety.
label

variety.
price_per_kilo

variety.
maturity 

Variety.
Taste

1

151 g

8.3 cm

red

Red Delicious

Red Delicious

3.19

late September/early October

Sweet

2

169 g

9.1 cm

green

Braeburn

Braeburn

3.49

mid September

Sweet/Tart

3

134 g

8.0 cm

green

NULL

NULL

NULL

NULL

NULL

NULL

NULL

NULL

NULL

NULL

Gala

3.19

mid September

Sweet

NULL

NULL

NULL

NULL

NULL

Macintosh

2.99

mid September

Tart

Notations

Here is the notation I recommend for the inner join:

Join (apple, variety, apple.name_variety = variety.label)

For right, left, and full outer joins:

JoinLeft (apple, variety, apple.name_variety = variety.label)

JoinRight (apple, variety, apple.name_variety = variety.label)

JoinFull (apple, variety, apple.name_variety = variety.label)

Go Further: Natural Joins

And now, a little trick that’s actually very useful: the natural join.

A natural join is a regular join. If the two relations being joined have columns that appear in both relations (exactly the same name in both tables), the join can be implicit.

Thus, if I change the name of the “label” column in the variety table to name_variety, both tables will have this column in common. If we want to join all of the columns that have the same name in both relations (here there is only one such column), we can use the natural join.

With the natural join, we no longer need to specify a condition, because it is implicit—the columns have the same name!

So...

Join (apple, variety, apple.name_variety = variety.name_variety)

is equivalent to

JoinNatural (apple, variety).

Summary

  • A join "combines" two tables based on a specified condition

  • An inner join doesn’t necessarily retain all of the information of the two tables being joined. It only includes matching rows between tables.

  • In order to retain all of the information of one (or both) of the tables, you need to apply an outer join.

Example of certificate of achievement
Example of certificate of achievement