Now let’s move on to SQL, the Structured Query Language created to communicate with relational databases.
A Brief Introduction to SQL
SQL (Structured Query Language) offers three command categories:
DDL (Data Definition Language), used to create or delete objects in a database (tables, constraints, etc.).
DCL (Data Control Language), used to manage users and control their access rights (view, modify, etc.).
DML (Data Manipulation Language), used to manipulate the data contained in the table—that is, to manipulate the rows of the tables. The four available operations are:
Insert rows
Read rows
Update rows
Delete rows
A brief introduction to RDBMSes
A Database Management System (DBMS) is a software program.
This software program is used to manipulate databases in which information is stored.
When these databases adhere to the rules of the relational model, they are called relational databases, and the DBMS that manipulates them becomes a Relational Database Management System, or RDBMS. Which is only logical.
Where does SQL fit in to all this?
If your DBMS is an RDBMS, it is highly likely you will be using SQL in order to communicate with it!
The best known RDBMSes are MySQL (and its more recent counterpart, MariaDB), PostgreSQL, Microsoft Access, Oracle Database, and SQLite.
Download the database
The database we will be using for the rest of this course is online. You can query it using the interactive console provided at the end of each of the following chapters.
However, if you want to download it, you can find it here, in sqlite3 format. You can open it with any software that works with this format, for example SQLiteStudio. You’ll find a short user’s guide in the video. ;)
Understand the database
In the chapters that follow, we’ll be conducting an investigation using the Panama Papers Offshore Leaks Database!
In April of 2016, German newspaper Süddeutsche Zeitung and the International Consortium of Investigative Journalists (ICIJ) published confidential documents leaked from the database of a Panama-based law firm.
The publication of these documents made headlines worldwide, because the documents contained information concerning over 214,000 offshore companies including the names of its shareholders. This was known as the Panama Papers.
The Panama Papers consist of nearly 11.5 million documents (emails, correspondence, contracts, etc.). The ICIJ indexed this massive cache of information, then extracted essential information using algorithms. The result was placed in a database that has been made public.
The database contains four main tables.
The entity table. This is the table containing offshore companies.
The intermediary table, containing intermediaries.
The address table, containing the addresses of some of the companies and intermediaries.
The officer table, which contains, among other things, the beneficial owners of companies.
Our Investigation
And now, I invite you to put on the hat of an investigator looking into the finances of a criminal network. :pirate:
Imagine that, in the course of your investigation, you intercept an invoice issued by a mysterious company called Big Data Crunchers Ltd..
On this invoice, the address of the company is missing. You don’t know who is behind this company, but you think it might be a shell corporation, or in other words, a corporation that exists only on paper and has no office or employees. A shell corporation is not easy to create. Generally, you need the help of specialized service providers, which we will refer to as intermediaries.
So now you are going to investigate this mysterious company and the intermediaries who assisted in its creation, because you believe that it might be possible to charge them with involvement in a crime.
Ready, let's go!
Summary
In this course, we will be looking solely at the DML (Data Manipulation Language).
SQL is used to communicate with relational databases.
An RDBMS is a software program used to manipulate relational databases.
For the following chapters, we will be working with The Panama Papers database, which contains the names of offshore companies.
There are four main tables in the database: entity, intermediary, address, and officer.
You are going to investigate the mysterious company Big Data Crunchers Ltd., and the intermediaries who assisted in creating this company.