The Status of Our Investigation
We have just found our mysterious company Big Data Crunchers Ltd. in the entity table. Next step: locate its address!
Inner Joins
Our objective is to "combine" the entity and intermediary tables to obtain a table that includes each company and its address. In SQL, there are two main ways of writing joins:
The first method: with FROM and WHERE
Remember the chapter Link Relations using Joins? In the section Another Way of Looking at Joins I explained that a join is equivalent to a Cartesian product followed by a restriction.
So I begin by obtaining the Cartesian product of the two tables...
SELECT * FROM entity, address ;
... then I add the restriction condition:
SELECT * FROM entity, address WHERE entity.id_address = address.id_address ;
And there you have it! Our join is made!
The second method: with JOIN and ON
The second method introduces the keywords JOIN
and ON
:
SELECT * FROM entity JOIN address ON entity.id_address = address.id_address ;
Join Multiple Columns
If your foreign key contains two attributes (or more), you must use AND
. Take the example of a table t1
that has a foreign key with two attributes, fk1
and fk2
. This foreign key references table t2
, whose primary key is also made up of two 2 attributes, pk1
and pk2
:
-- First method:
SELECT * FROM t1, t2 WHERE (t1.fk1 = t2.pk1 AND t1.fk2 = t2.pk2);
-- Second method:
SELECT * FROM t1 JOIN t2 ON (t1.fk1 = t2.pk1 AND t1.fk2 = t2.pk2);
Well played!
We have achieved our goal of combining our two tables. We must now find Big Data CrunchersLtd. in the result. To do this, we need a little restriction. We add it in the WHERE
clause. Since there is already a condition, we use AND
:
SELECT * FROM entity, address WHERE entity.id_address = address.id_address AND entity.name = 'Big Data Crunchers Ltd.';
Join an Association Table
Now let’s find the intermediaries who helped create the company Big Data Crunchers Ltd.!
Previously, we saw that association tables are used to model many-to-many cardinalities between two objects. Entity and intermediary have a many-to-many relationship, because many intermediaries can create one company, and one intermediary can create many companies. Therefore, we have a table with the name assoc_inter_entity, containing, among other things:
an entity column, a foreign key referencing the entity table,
an intermediary column, a foreign key referencing the intermediary table.
So now we need to join three tables, as follows:
SELECT
i.id as intermediary_id,
i.name as intermediary_name,
e.id as entity_id,
e.name as entity_name,
e.status as entity_status
FROM
intermediary i,
assoc_inter_entity a,
entity e
WHERE
a.entity = e.id
AND a.inter = i.id
AND e.name = 'Big Data Crunchers Ltd.' ;
We specify the three tables in the FROM
clause. In the WHERE
clause, we place the two join conditions, along with the name of our company.
Go Further: Outer Joins
Remember outer joins? They are used to retain rows that would have been left out if you were using an inner join. If we want to obtain a table containing all of the companies of entity and their addresses without losing the companies whose addresses are not known, we need a left outer join.
SELECT *
FROM entity
LEFT OUTER JOIN address ON entity.id_address = address.id_address;
If, on the other hand, we want to retain all of the addresses (whether they are associated with companies or not), we need a right outer join:
SELECT *
FROM entity
RIGHT OUTER JOIN address ON entity.id_address = address.id_address;
You see where I’m going with this? Yes! All that remains now is to give you the syntax for the full join! Which will use to retain all of the addresses and all of the companies:
SELECT *
FROM entity
FULL OUTER JOIN address ON entity.id_address = address.id_address;
Go Further: Natural Joins
In the chapter Link Relations using Joins, I explained the concept of the natural join. Here’s how to implement this concept in SQL:
SELECT * FROM entity NATURAL JOIN address ;
Summary
There are two methods for joining:
with
FROM
andWHERE
from
JOIN
andON
When we’re dealing with an association table, we have to join three tables!
To simplify queries, we can use aliases.