Let's summarize what you’ve learned up to this point. A database is like a warehouse. Its purpose is to feed information to a computer program like a warehouse containing raw materials feeds a clothing manufacturing factory. Between the warehouse and the factory, a DBMS manages the warehouse.
As you’ve seen, data can sometimes be very structured — or not. This course shows how to model a database for structured data (data about filming locations), which means we will use a relational database.
But what does modeling mean?
When organizing a warehouse, the first thing you do is install the racks, like enormous shelving units that will store the raw materials. But you can’t just place these racks anywhere. Before you can store the materials, you need to understand the attributes of the items you are putting on the racks:
Will the fabric be stored in rolls or boxes?
How much does a roll of fabric weigh?
What are its dimensions?
All this information is crucial when choosing an appropriate shelving size capable of holding the weight.
Designing the shelving layout is like modeling your warehouse, and this process starts with a pen and paper (well, nowadays, you can draw on your computer, of course! ).
There would be several steps to modeling a large warehouse:
Sketch out your design and show it to your team members for their opinion. At this point, you need to keep it high-level: your design must be understood by everyone, so nothing too technical.
Add more detail to your model.
Identify the technical attributes for each shelving unit, i.e., exact dimensions, location, material (wood or metal?), etc.
The same principle applies to modeling a database, and it typically involves three steps:
An easily understood general diagram of your data structure.
A set of tables showing the data.
The final step will be more technical and specific to your chosen RDBMS.
1. Create Your Conceptual Data Model (CDM)
The first step is the conceptual data model (CDM), which has three sub-steps:
1. Identify the entities present within the data.
For example, filming location data have the following entities:
Films
Films Directors
Filming locations
Production companies
2. Identify the relationships between these entities.
A production company PRODUCES a film.
A director DIRECTS a film.
A scene is FILMED in a filming location.
3. Assign a multiplicity to these relationships, using 0
, 1
, 1..*
, 0..*,
, etc.
There are several options when designing the conceptual model. The most common are:
The entity-relationship (ER) model (based on the MERISE method).
The class diagram (part of the UML modeling technique).
These two modeling methods are similar in principle, although they use quite different diagramming conventions. For example, an ER model looks like this:
Whereas a UML class diagram looks like this:
Do I need to use both?
No, you can just choose one.
2. Design Your Logical Data Model (LDM)
The conceptual model is a set of entities. But now, these entities need some structure.
Remember that structured data can be represented in the form of tables. Well, that’s the next step: formatting all of the entities into tables.
You are going to build the logical data model (LDM). This time, you don't have a choice between different modeling techniques like for the CDM. So you’re going to use the relational model.
The main process you use to get from UML to a relational model is transforming multiplicities into foreign keys. Does the idea of a foreign key seem a bit “foreign” to you? Don’t worry — you’ll get it. 🙂
Now you’re getting closer to the technical solution because you’re starting to shape the data so that an RDBMS can use it.
Here’s an example of a relational model:
Can’t we do the conceptual model? The LDM will be a lot of work!
Don't worry! The LDM flows quite naturally from the CDM. So you don’t need to think too much; you need to apply the basic transformation rules to get from the CDM to the LDM.
3. Determine Your Physical Data Model (PDM)
You're going to need to talk to the machine that will hold your database. As you know, if you want to talk to a computer, you need to use code. Yay!
So, the final step is to translate the relational model into code your chosen RDBMS can understand (e.g., PostgreSQL, MySQL, Oracle, SQLite, etc.).
Even if the RDBMS understands SQL, there are sometimes differences as each RDBMS has its own variant of the SQL language.
SQL can be broken down into four languages:
The language used to create data definitions (DDL), i.e., to create the structure of your database.
The language used to manipulate the data (DML), i.e., to add, update, delete, or retrieve data.
The language used to control the data (DCL), i.e., to manage access rights to the data.
The language used to control transactions (TCL), i.e., to confirm or cancel data updates.
Here’s an example of DDL (creating a table in SQL):
CREATE TABLE film
(
film_id INT PRIMARY KEY,
title VARCHAR(100),
genre VARCHAR(100),
release_year DATE
)
We're only going to look at the first two levels: the CDM and the LDM. To create the PDM, you need to use SQL, select an RDBMS, and install it. We are not going this far, so we can keep the course short. But you can use SQL in our other course, Implement a Relational Database With SQL.
It’s going to be frustrating to design the CDM and LDM schemas without manipulating the filming location data!
Yes, okay. We will have a chapter that will allow you to structure your data and use it to populate the SQLite RDBMS (as this is the simplest one to use). In addition, we’ll supply you with SQL code that you can copy and paste.
Now it's Your Turn!
Right, let's do a little exercise. Use the conceptual model for a database in the form of a UML class diagram provided below.
Work out what type of software could use this database.
2. Now go a bit further and imagine what features this software might have and how it can be used.
Answer:
It’s a database used by an order and invoice management system (e.g., a shop).
The software stores details of all items for sale, each belonging to a category. It also allows orders to be generated, which can relate to one or more customers and a delivery address.
When the order ships, it is recorded automatically in the software, and there is an invoice attached.
Let’s Recap!
Reference table for structured databases:
Modeling level | Abbreviation | Description | Illustration method |
Conceptual level | CDM | Diagram that can be understood by a human | ER Model (MERISE) or Class diagram (UML) |
Logical level | LDM | Translation of the CDM into a relational structure common to all RDBMS | Relational model |
Physical level | PDM | Very specific to the selected RDBMS, often written in SQL | SQL – DDL (data definition language) |
Now you know what database modeling means. But you need to understand that all models are not equal. In the next chapter, you’ll discover the necessary fundamental criteria for a high-quality model.