In this second part of the course, you’ll see the first of the three steps involved in modeling a relational database. The first step is the conceptual data model (CDM).
As the name suggests, you are going to identify the concepts (or entities) that exist in your data and identify the relationships between them.
Now it's Your Turn!
Using the data about filming locations (posted below), pull out the main entities and their attributes.
How do we show this on paper?
Create little rectangular boxes and divide them into three sections. Write the entity’s name in the top section, then the list of attributes in the center. You’ll learn about the bottom section later in the chapter.
Entity |
+attribute1 +attribute2 ... |
|
All done?
Well done! There’s no right or wrong answer in this exercise. So it’s not a problem if your answer is different. Of course, one way of modeling might be better than another, but with experience, it will start to come naturally.
Here’s a possible answer:
I've deliberately ignored the “location ID” column, as I’ll introduce the idea of an identifier later.
The
ProductionCompany
column uses the attributename
within theProduction Company
entity.The
Director
column uses the attributename
within theDirector
entity.
Formalize Your Entities Into Classes
You have just drawn a UML class diagram formalizing all entities into classes (i.e., Film
, Director
, etc.).
A box represents each class in this diagram, with its name in the top section.
A class is a bit like a mold, giving a general shape to each instance based on the class. For example, if you have a cookie-cutter, you can make as many cookies as you want. The cookie-cutter is the class, and the cookies are instances.
In the same way, the Film
class defines the general shape for all instances of this class. So, for example, Shang-Chi and the Legend of the Ten Rings is one instance of the Film
class, as is Mrs. Doubtfire or Tales of the City.
Define Your Classes With Attributes and Methods
Attributes
Define the attributes (a characteristic of a class) in the center section of the box.
An attribute must have a particular type so that you know in advance what type of values it can contain:
Numeric values (whole numbers, decimals, etc.)
Text
Dates
Etc.
To define a domain, you can either create a list of all possible values (enumerated types) or define the characteristics of the values, such as “it must be a whole number (primitive types).”
For example:
Enumerated type:
Attribute domain for
filmingType
:[FeatureFilm, TVSeries, etc.]
Primitive type:
Domain for
releaseYear
: whole number between 0 and 9999.
You can define composite attributes, including a predefined number of values that don’t all have to be of the same type. For example, an address
attribute usually consists of:
Property number (number greater than zero)
Street name (text)
Town/city name (text)
Zip Code (5-digit number)
You can also define multivalued attributes, which means that the attribute can have zero, one, or more values of the same type (or the same domain) without knowing in advance how many there will be. An example of this could be the telephone number(s)
attribute belonging to the Person
class because one person could have zero, one, or many telephone numbers.
Finally, there are derived attributes, where you need to use other attributes to determine the value. In your case, filmDecade
can be derived from releaseYear
.
This is why you don’t want to store the actual value of a derived attribute but instead store its calculation method.
But wait, a calculation method is coded in a programming language, isn't it?
Exactly! The third section of the boxes is for holding methods.
Methods
Methods exist either within the application’s code that is using the DBMS or in the DBMS itself (where possible). We won’t go over this in this course as coding methods within a DBMS depend on the DBMS you're using and is too specific to cover.
A method is a series of operations that can:
Update attributes.
Calculate a result using attributes.
Do both of the above simultaneously.
In fact, the methods define the behavior of your class (i.e., the actions that the class can carry out).
Therefore, you can create a method called determineFilmDecade()
for the exercise, which will use the releaseYear
to derive the filming year.
Film |
+releaseYear: Numeric |
+determineFilmDecade(): Numeric |
Communicate With Software Developers Using UML
Communicate Using Class Diagrams
Have you ever heard of object-oriented programming?
In an object-oriented programming language, the developer defines the objects. An object is an entity, idea, or item in real life. Every object has attributes and a set of possible behaviors.
Yes, but we’re not writing software; we’re modeling a database!
True, but the description is very close to what you just saw (i.e., classes described using attributes with methods defining their behavior).
Developers who code using object-oriented languages, such as Python, PHP, C++, Java, etc., also use the UML class diagram. And it's much easier that way.
Using a UML class diagram for modeling a database has become much more common with the growth in object-oriented programming.
Given that a database is almost always used in conjunction with a computer program, data is exchanged between the two. Therefore, if these two system parts use the same entities, it is easier for the database administrator and the programmer to communicate. This is why both sides of the equation use the UML class diagram (database and application).
Some object-oriented languages even provide features (known as libraries) that allow the relational database structure to automatically generate based on the program code written in that language!
For example, imagine that NeoNomad is coded in Python. Then, somewhere within the Python code, you’ll find the class definition for Film
written in the following way:
class Film(DataBase):
[...]
releaseYear = Column(Numeric)
[...]
def determineFilmDecade(self):
return decade = self.releaseYear
Distinguish the Class Diagram From Other UML Diagrams
The class diagram you created is ideal for your problem because it describes your future database’s structure. In addition, the structure is static and doesn't change over time. A database structure only changes when the software upgrades to a new version.
Therefore, you only need a UML class diagram to create a database. It’s also helpful for the developers who will refer to the classes in their program, even though static information doesn't cover everything they need.
The UML language also has other diagrams that allow you to display a program’s behavior, such as the activity diagram (which illustrates a chain of events relating to the different program components) or the timing diagram (which shows how data evolves), etc. There are 14 in total!
Let’s Recap!
Vocabulary | Meaning | Example(s) |
Class | An entity, an object | A film |
Instance | One specific example of an entity | The film Mrs. Doubtfire |
Attribute | A class characteristic | The film’s title |
Type (attribute domain) | Type of value (or set of values) that an attribute can take | “Number,” “text,” “number between 0 and 100,” “One of three possible values: Male, Female, Non-binary,” etc. |
Composite attribute | An attribute consisting of a predefined number of values from different domains | Address (number, street, town, etc.) |
Multivalued attribute | An attribute consisting of a variable number of values from the same domain | Telephone number(s) |
Method | A sequence of operations to calculate a result that uses or modifies attributes | Method |
Representing a class in UML:
ClassName |
attributes |
methods() |
Now that you’ve identified all of the classes in your model, you need to create relationships between them.