• 20 hours
  • Medium

Free online content available in this course.

course.header.alt.is_video

course.header.alt.is_certifying

Got it!

Last updated on 8/29/24

Nest queries using IN, ALL, ANY, and EXISTS

Do you know that it is possible to nest queries inside other queries? I assume you do—because we mentioned it fleetingly earlier in this course!

Let’s look at an example. In the chapter on HAVING clauses, I explained that the HAVING keyword is used to apply a restriction to a table after aggregation. But if HAVING seems too complicated, you can use a nested query (aka subquery) instead:

-- QUERY n°1
SELECT *
FROM (SELECT id_address, count(*) AS cnt FROM entity GROUP BY id_address) a
WHERE a.cnt > 500 ;

-- QUERY n°2, same as query n°1:
SELECT id_address, count(*) AS cnt 
FROM entity 
GROUP BY id_address 
HAVING count(*) > 500 ;

What do these queries return? The addresses of over 500 companies.

The IN Keyword

These two queries only return the address identifier, which is a whole number. How can we obtain the address in a more readable form?

We could create a join with the address table, but we could also use the  IN  operator, whereby we would request the following:

Give me the rows of the address table in which the identifier is found IN the id_address column of the table obtained by query no. 2.

Once again, we find ourselves with a nested query, or sub-query, using the results of query no. 2 (above on this page):

SELECT *
FROM address a
WHERE a.id_address IN (
    SELECT id_address
    FROM entity
    GROUP BY id_address
    HAVING count(*) > 500
    );

The ANY and ALL Keywords

Are there intermediaries that are more active than the two intermediaries we found earlier (Bargain Banking S.A. and Plouf Financial Services Corp.)?

First, we have created a nb_entities table, specifying for each intermediary the number of companies it assisted in creating:

nb_entities

id_intermediary

intermediary_name

cnt_entities

 5000

Pacher Banking S.A.

 2184

 5001

Plouf Financial Services Corp.

 720

[...]

[...]

[...]

To answer our question, we need to use an operator followed by  ALL  or  ANY:

SELECT * FROM nb_entities WHERE cnt_entities > ALL(SELECT cnt_entities FROM nb_entities WHERE intermediary_id IN (5000,5001,5002));

The EXISTS Keyword

Let’s go back to the query we saw above, in the section on the  IN  operator, and put ourselves in the shoes of the RDBMS that is being asked to execute this query:

CREATE TABLE nb_entities AS
SELECT i.id AS intermediary_id, 
    i.name AS intermediary_name, 
    e.jurisdiction,
    e.jurisdiction_description,
    count(*) AS cnt_entities
FROM intermediary i, assoc_inter_entity a, entity e
WHERE a.entity = e.id AND a.inter = i.id
GROUP BY i.id, i.name, e.jurisdiction, e.jurisdiction_description;

Normally, you are supposed to check whether the  WHERE  clause is true or false for each row of the address table, right? The condition will be tested as many times as there are rows.

So far so good. But here, in the  WHERE  clause, we find a nested subquery (from row 4 to row 7). Are you going to recalculate this subquery each time you check the  WHERE  condition, for every single address row??? That would be silly, because the result of this subquery is always the same. A better idea would be to store this result, rather than recalculate it every time!

Except... if this subquery depends on the row for which we are testing the  WHERE  clause, it is no longer possible to cache the result. :( And if you have to recalculate the subquery every time, it’s going to take forever!

If you find yourself in this situation, it is preferable (from a performance standpoint) to use the  EXISTS  operator, which tests whether a subquery contains at least one row. If there is at least one row, it stops executing the subquery, because it is satisfied! 

Go Further: Using IN on Multiple Columns

In this chapter, we checked whether a value was “in” a specified list of values.

It is also possible to check whether a  couple  of valuesvalue1, value2   is found in a list of couples  c1, c2, using the following approach:

SELECT * FROM t1 WHERE t1.valeur1, t1.valeur2 IN (SELECT c1, c2 FROM t2) ;

Go Further: Temporary Tables

Theoretically, there is no real limit to how many queries you can nest. It’s even possible to nest nested queries within other queries… and so on! :soleil:

In order to avoid long, drawn-out queries, know that it is sometimes possible (depending on your RDBMS) to create temporary tables. This involves creating new tables for storing the result of a query so you can use that result in another query.

For example, query no. 1 at the top of this page is equivalent to the following code:

CREATE TABLE nb_entities AS
SELECT i.id AS intermediary_id, 
    i.name AS intermediary_name, 
    e.jurisdiction,
    e.jurisdiction_description,
    count(*) AS cnt_entities
FROM intermediary i, assoc_inter_entity a, entity e
WHERE a.entity = e.id AND a.inter = i.id
GROUP BY i.id, i.name, e.jurisdiction, e.jurisdiction_description;

These temporary tables will be deleted when you close your RDBMS session.

Summary

  • To test whether a specified value is found in the result of a query, use  IN  .

  • Suppose B is a list of values returned by a query. To test whether a condition (for example  A < B  ) is true for...

    • ... all values of  B  , use  ALL  ;

    • ... at least one value of  B  , use  ANY  .

  • The  EXISTS  operator tests whether the subquery contains at least one row.

     

Ever considered an OpenClassrooms diploma?
  • Up to 100% of your training program funded
  • Flexible start date
  • Career-focused projects
  • Individual mentoring
Find the training program and funding option that suits you best
Example of certificate of achievement
Example of certificate of achievement