• 6 hours
  • Medium

Last updated on 4/15/20

Explore Additional Possibilities for Database Operations

We have had a fast spin through a range of persistence mechanisms, covering each one only briefly, so that you can see how the repository pattern can be useful to manage that range of options.

Each mechanism comes with its own much deeper skill set, and you, of course, must choose how much attention to pay to learning which skills as you develop as a software engineer. This chapter introduces some key topics to explore further.

Working with Multiple Users

Identifying the Challenge

Let's say we are both editing a customer's details: you are changing the email, and I am changing the telephone number. We both read the customer's details and bring them up onto our application screen. You change the email and save it. However, I still have the old email on my screen - and when I change the telephone and save it, the old email on my system overwrites your update on the database.

This is a simple example of the problem of having many different people viewing, editing, and updating a shared dataset.

Exploring Solutions

The relevant terms to explore include multi-user databases or, more generally, multi-user concurrency problems. Record locking (often called pessimistic locking) is probably the most common way of managing concurrency: the data is locked while one user is editing it so that other users cannot change it. Another is read-before-write, where the database checks, just before saving, whether anyone else has changed the data while you have been editing it. A similar option is to timestamp or version all changes so you can see if anything was updated since you retrieved your copy; these last two approaches are commonly known as optimistic locking.
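As a rough illustration of the versioning idea, here is a minimal in-memory sketch of the email/telephone scenario described earlier. The names `Customer`, `VersionedCustomerStore`, and `update` are hypothetical and invented for this example; a real database or ORM would provide its own mechanism.

```java
import java.util.HashMap;
import java.util.Map;

class OptimisticLockingDemo {
    public static void main(String[] args) {
        VersionedCustomerStore store = new VersionedCustomerStore();
        store.insert(1, "old@example.com", "555-0100");

        // Both users read the same copy (version 0).
        Customer yours = store.read(1);
        Customer mine = store.read(1);

        // You change the email and save first.
        boolean yourSave = store.update(
                new Customer(1, "new@example.com", yours.phone(), yours.version()));
        // I change the telephone, still holding the old email and the old version.
        boolean mySave = store.update(
                new Customer(1, mine.email(), "555-0199", mine.version()));

        System.out.println(yourSave + " " + mySave);   // prints: true false
        System.out.println(store.read(1).email());     // prints: new@example.com
    }
}

/** Hypothetical customer snapshot; version counts how often the record was saved. */
record Customer(int id, String email, String phone, long version) {}

/** Hypothetical in-memory store that rejects updates made from a stale copy. */
class VersionedCustomerStore {
    private final Map<Integer, Customer> rows = new HashMap<>();

    synchronized void insert(int id, String email, String phone) {
        rows.put(id, new Customer(id, email, phone, 0));
    }

    synchronized Customer read(int id) {
        return rows.get(id);
    }

    /** Saves the edit only if nobody else has saved since the caller read its copy. */
    synchronized boolean update(Customer edited) {
        Customer current = rows.get(edited.id());
        if (current == null || current.version() != edited.version()) {
            return false; // stale copy: the caller should re-read and retry
        }
        rows.put(edited.id(), new Customer(
                edited.id(), edited.email(), edited.phone(), edited.version() + 1));
        return true;
    }
}
```

The second save is refused rather than silently overwriting the new email, so the application can reload the record and ask the user to reapply the change.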

For more general database principles, you can look at the acronym ACID (atomic, consistent, isolated, durable) - the isolated aspect is particularly relevant to handling multiple users.

Using Complex Objects

Identifying the Challenge

In the course examples, we only stored a single object as a single operation. If there is a problem, the object is either stored or not stored. An object would only be partly stored under very unusual circumstances.

Often you want to store compound or aggregate objects. For example, you might have an address object that has fields such as street and postcode. A person might have several address objects - for work, residence, or previous residence, among others. An address might be associated with several persons (for example, customer contacts at a particular company).

When it comes to storing these in a relational database, the person is stored in a different table to the address, but must still be associated with a relationship link. You have to think about which one is stored first (in case there are problems with storing). You also have to decide what happens if one is stored or updated, but the other fails, and when other people are reading data while your updates are taking place. 😱

You can see how complex this can get just by considering what an order object that records customer orders might look like. It would need to refer to a person object as the customer, the list of robot parts ordered and their quantities, a delivery address, and so on.
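To make the shape of such an aggregate concrete, here is a small sketch using Java records. All the type and field names (`Person`, `Address`, `OrderLine`, `Order`) are assumptions made for illustration, not the classes used elsewhere in the course.

```java
import java.util.List;

class OrderModelDemo {
    public static void main(String[] args) {
        Person customer = new Person("Ada Lovelace", "ada@example.com");
        Address delivery = new Address("1 Engine Street", "AB1 2CD");
        Order order = new Order(
                customer,
                List.of(new OrderLine("servo-motor", 4), new OrderLine("gripper", 1)),
                delivery);
        System.out.println(order.totalItems()); // prints: 5
    }
}

// Hypothetical value types; each would typically map to its own table.
record Person(String name, String email) {}

record Address(String street, String postcode) {}

record OrderLine(String partName, int quantity) {}

/** An order aggregates a customer, the ordered parts with quantities, and a delivery address. */
record Order(Person customer, List<OrderLine> lines, Address deliveryAddress) {
    int totalItems() {
        return lines.stream().mapToInt(OrderLine::quantity).sum();
    }
}
```

Storing one `Order` means writing to several tables, which is why the order of the writes and the handling of partial failures matter.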

Exploring Solutions

You might consider exploring atomic operations and the unit of work pattern. These are about how you bundle data persistence actions to ensure that your data is consistent, even if the operations fail.

An atomic operation is one that cannot be subdivided. The operation either succeeds or fails; it does not partially succeed. For example, you may want to ensure that storing a person and their address is an atomic operation - that if something goes wrong, you don't end up with an address stored but no corresponding person, for example.

A unit of work is a way of describing what database actions should be bundled up to be atomic. For instance, you saw with the Hibernate library that you have to call the transaction's commit method when you have finished specifying the actions (Line 4):

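Session session = sessionFactory.openSession();  // open a session on the database
session.beginTransaction();                      // start the unit of work
session.save(aCustomer);                         // register the customer to be stored
session.getTransaction().commit();               // apply the bundled actions
session.close();                                 // release the session's resources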

The actions between beginTransaction and commit are bundled into a unit of work, and the database applies them all as an atomic operation.
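The all-or-nothing behavior can be sketched without a database. The following is a simplified in-memory illustration, not how Hibernate works internally; the `UnitOfWork` class, its `registerNew` and `commit` methods, and the use of a `null` row to simulate a failure are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class UnitOfWorkDemo {
    public static void main(String[] args) {
        List<String> table = new ArrayList<>();

        UnitOfWork failing = new UnitOfWork(table);
        failing.registerNew("person: Ada");
        failing.registerNew(null);             // simulates a failure storing the address
        System.out.println(failing.commit());  // prints: false
        System.out.println(table);             // prints: []

        UnitOfWork succeeding = new UnitOfWork(table);
        succeeding.registerNew("person: Ada");
        succeeding.registerNew("address: 1 Engine Street");
        System.out.println(succeeding.commit()); // prints: true
        System.out.println(table.size());        // prints: 2
    }
}

/** Hypothetical unit of work over an in-memory "table" of strings. */
class UnitOfWork {
    private final List<String> table;
    private final List<String> pending = new ArrayList<>();

    UnitOfWork(List<String> table) {
        this.table = table;
    }

    /** Records an action; nothing is written until commit() is called. */
    void registerNew(String row) {
        pending.add(row);
    }

    /** Applies every pending row, or none of them if any row is invalid. */
    boolean commit() {
        List<String> snapshot = new ArrayList<>(table); // kept for rollback
        try {
            for (String row : pending) {
                if (row == null) {
                    throw new IllegalArgumentException("invalid row");
                }
                table.add(row);
            }
            return true;
        } catch (IllegalArgumentException e) {
            table.clear();
            table.addAll(snapshot); // roll back to the state before commit
            return false;
        } finally {
            pending.clear();
        }
    }
}
```

In the failing case the person row is written and then removed again, so the table never ends up with a person but no address.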

Integrating Security

Identifying the Challenge

Security is commonly broken down into three areas, given by the acronym CIA:

  • Confidentiality: For example, only people who need to know about customer details or activity should have access to them. Most countries have laws about this that you must comply with.

  • Integrity: For instance, the robot parts catalog should only be editable by specific staff so that it is not corrupted either deliberately or by accident.

  • Availability: The catalog should be available online - for viewing and ordering - and protected from attacks that prevent it from running.

These are highly relevant to the persistence layer, because long-lived data is vulnerable to theft (loss of confidentiality), corruption (loss of integrity), and deletion (loss of availability).

Exploring Solutions

You already know how to deal with this: require staff to log in to their individual user account, and then give that account specific access rights. Some accounts (users) can view some data but not others, and some can edit some data, but not others.

You would usually organize users into groups, for example, the group that can view customers, the group that can edit their details, and the group that can edit the robot parts catalog.
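As a minimal sketch of group-based access rights, the following checks whether a user belongs to any group that grants a permission. The `Permission` values, group names, user names, and the `AccessControl` class are all hypothetical; real systems usually delegate this to the database or a security framework.

```java
import java.util.Map;
import java.util.Set;

class AccessControlDemo {
    public static void main(String[] args) {
        AccessControl access = new AccessControl(
                Map.of("sales", Set.of(Permission.VIEW_CUSTOMERS, Permission.EDIT_CUSTOMERS),
                       "catalog-editors", Set.of(Permission.EDIT_CATALOG)),
                Map.of("alice", Set.of("sales"),
                       "bob", Set.of("catalog-editors")));

        System.out.println(access.isAllowed("alice", Permission.EDIT_CUSTOMERS)); // prints: true
        System.out.println(access.isAllowed("alice", Permission.EDIT_CATALOG));   // prints: false
        System.out.println(access.isAllowed("mallory", Permission.VIEW_CUSTOMERS)); // prints: false
    }
}

/** Hypothetical access rights for the robot parts shop. */
enum Permission { VIEW_CUSTOMERS, EDIT_CUSTOMERS, EDIT_CATALOG }

/** Grants a permission if any of the user's groups carries it. */
class AccessControl {
    private final Map<String, Set<Permission>> groupPermissions;
    private final Map<String, Set<String>> userGroups;

    AccessControl(Map<String, Set<Permission>> groupPermissions,
                  Map<String, Set<String>> userGroups) {
        this.groupPermissions = groupPermissions;
        this.userGroups = userGroups;
    }

    boolean isAllowed(String user, Permission permission) {
        for (String group : userGroups.getOrDefault(user, Set.of())) {
            if (groupPermissions.getOrDefault(group, Set.of()).contains(permission)) {
                return true;
            }
        }
        return false; // unknown users and missing rights are denied by default
    }
}
```

Assigning rights to groups rather than to individuals means that a new member of staff only needs to be added to the right groups.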

The data is typically separate from the application (in a shared relational database), and maintaining user accounts for the overall system (logging into Windows), the application (logging into the application), and the data store (logging into the relational database) can be time-consuming. You could explore the topic of single sign-on to see how to do this from one login.

Another vital component of availability is to back up the persisted data, so it is still possible to retrieve it and make it available if the storage parts of the system fail or are compromised. You should back up regularly and keep at least some backups in a separate location so that physical failures (a fire in the office, for example) don't cause a loss of all the data. Keep several backup sets too, so that if a backup goes wrong, it doesn't overwrite a good one.
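One simple way to avoid overwriting a good backup is to give every backup its own name and refuse to replace an existing file. The sketch below assumes the data lives in a single file; the `Backups` class, the `backUp` method, and the date labels are hypothetical. For brevity it writes the backups next to the data in a temporary directory, whereas real backups should go to a separate location.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class BackupDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shop");
        Path data = Files.writeString(dir.resolve("customers.db"), "version 1");
        Path backups = dir.resolve("backups");

        Path first = Backups.backUp(data, backups, "2020-04-14");
        Files.writeString(data, "version 2");
        Path second = Backups.backUp(data, backups, "2020-04-15");

        System.out.println(Files.readString(first));  // prints: version 1
        System.out.println(Files.readString(second)); // prints: version 2
    }
}

/** Hypothetical helper that keeps one labelled copy per backup run. */
class Backups {
    static Path backUp(Path dataFile, Path backupDir, String label) throws IOException {
        Files.createDirectories(backupDir);
        Path target = backupDir.resolve(dataFile.getFileName() + "." + label + ".bak");
        // No REPLACE_EXISTING option: Files.copy throws FileAlreadyExistsException
        // instead of overwriting an earlier backup with the same label.
        return Files.copy(dataFile, target);
    }
}
```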

Maximizing Performance

Identifying the Challenge

For modest applications such as small office automation, most local data persistence technologies will be more than fast enough for everyday tasks such as editing, updating, and finding.

Exploring Solutions

When performance is essential, there are two features to explore: latency and bandwidth.

Latency is the time it takes from request to response. For many local data operations, this is fast enough that humans don't notice - if someone presses save and the system is on to the next task within a few hundred milliseconds, they feel that is responsive. Cloud-based persistence, however, can have high, noticeable latencies because of the various network links, intermediate servers, and sometimes heavy-weight browser interfaces. Test these solutions, and if after every task you spend a few seconds staring at an updating icon, then you should probably check with the users if it matters to them.
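When testing a persistence option, it can help to measure latency rather than guess. Here is a minimal sketch of a timing helper; the `Latency` class and `measureMillis` method are hypothetical, and the 50 ms sleep merely stands in for a real save operation.

```java
import java.util.function.Supplier;

class LatencyDemo {
    public static void main(String[] args) {
        long millis = Latency.measureMillis(() -> {
            try {
                Thread.sleep(50); // stand-in for a call to a remote data store
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "saved";
        });
        System.out.println("save took about " + millis + " ms");
    }
}

/** Hypothetical helper that times a single request from start to response. */
class Latency {
    static <T> long measureMillis(Supplier<T> operation) {
        long start = System.nanoTime();
        operation.get();
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

A single measurement is noisy, so in practice you would repeat the operation and look at typical and worst-case values.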

Poorly thought-out queries of large datasets can also cause latency. Most everyday datasets - a few hundred thousand entries or less - are small enough to be easily queried in modern databases on modern hardware without noticeable latency. For large databases that you find are slow to query, you can explore topics such as:

  • Indexing: Index the fields that you query so that the database can filter on them more quickly.

  • Joining: Associate data items together efficiently (you will need to know about indexes).

  • Distributed databases: Use systems such as Hadoop so that the effort of querying can be split up and shared. Cloud systems can be helpful here. Bear in mind that not all queries can be split up cleanly - distributed systems will not automatically solve your data performance problem.
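The idea behind indexing can be sketched in plain Java: a scan looks at every entry, while an index goes straight to the matching one. This is only an analogy for what a database does internally; the `RobotPart` record and `PartCatalog` class are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class IndexDemo {
    public static void main(String[] args) {
        PartCatalog catalog = new PartCatalog();
        for (int i = 0; i < 100_000; i++) {
            catalog.add(new RobotPart("SKU-" + i, "part " + i));
        }
        // Both lookups return the same part; only the amount of work differs.
        System.out.println(catalog.findByScan("SKU-99999").get().name());  // prints: part 99999
        System.out.println(catalog.findByIndex("SKU-99999").get().name()); // prints: part 99999
    }
}

/** Hypothetical catalog entry. */
record RobotPart(String sku, String name) {}

/** Hypothetical catalog that maintains an index on the sku field. */
class PartCatalog {
    private final List<RobotPart> parts = new ArrayList<>();
    private final Map<String, RobotPart> bySku = new HashMap<>(); // the "index"

    void add(RobotPart part) {
        parts.add(part);
        bySku.put(part.sku(), part); // the index must be updated on every write
    }

    /** Without an index: checks entries one by one until a match is found. */
    Optional<RobotPart> findByScan(String sku) {
        return parts.stream().filter(p -> p.sku().equals(sku)).findFirst();
    }

    /** With an index: a single hash lookup, regardless of catalog size. */
    Optional<RobotPart> findByIndex(String sku) {
        return Optional.ofNullable(bySku.get(sku));
    }
}
```

The trade-off is the same as in a database: lookups get faster, but every write has to maintain the index as well as the data.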

Bandwidth is the amount of data that can be fed through a network connection over time. Again, most modern networks on modern hardware are more than sufficient for everyday use. If you have large datasets, you may find that you need to limit your queries so that if someone accidentally asks for all ten billion items in a catalog, they only get a limited amount, and neither they nor the database is overwhelmed.
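One common way to limit queries is to return results a page at a time and cap the page size. The sketch below applies the cap to an in-memory list; in a real database the limit would be part of the query itself. The `LimitedQuery` class, the `page` method, and the cap of 100 are assumptions for illustration.

```java
import java.util.List;
import java.util.stream.IntStream;

class LimitedQueryDemo {
    public static void main(String[] args) {
        List<Integer> catalog = IntStream.range(0, 1000).boxed().toList();

        // Asking for "everything" still returns at most MAX_PAGE_SIZE items.
        System.out.println(LimitedQuery.page(catalog, 0, Integer.MAX_VALUE).size()); // prints: 100
        // The last page is simply shorter.
        System.out.println(LimitedQuery.page(catalog, 950, 100).size());             // prints: 50
    }
}

/** Hypothetical helper that caps how many results one request can return. */
class LimitedQuery {
    static final int MAX_PAGE_SIZE = 100;

    static <T> List<T> page(List<T> all, int offset, int requestedSize) {
        int size = Math.min(Math.max(requestedSize, 0), MAX_PAGE_SIZE);
        int from = Math.min(Math.max(offset, 0), all.size());
        int to = (int) Math.min((long) from + size, all.size());
        return List.copyOf(all.subList(from, to));
    }
}
```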

Turning to Systems Thinking

Identifying the Challenge

We have considered the reasonably simple case of storing a list of objects of a particular class: in one example, the shop's customers as person objects, and in the other, the shop's goods catalog as robot part objects.

In practice, even our shop's software system becomes far more complex. You want to store orders by customers of robot parts, track their progress to packing and delivery, order more supplies, and so on.

Exploring Solutions

To understand this, break the system down into parts so that you can deal with the internal complexity of each part separately from the complexity of the full assembly. It can be challenging to understand where and how to break such systems down, especially at the early stage when the problems are not well understood, let alone the full design.

For example, do you separate your system into functional layers: the user interface layer, the business processing layer, and the database layer? This approach means you can replace the user interface layer largely without disturbing the rest. It might be useful if you move from a desktop to a tablet or mobile user interface.
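A layered split can be sketched with one small type per layer, where each layer only talks to the one below it. All the names here (`CustomerRepository`, `InMemoryCustomerRepository`, `CustomerService`, `ConsoleUi`) are hypothetical and chosen for this example.

```java
import java.util.Map;
import java.util.Optional;

class LayersDemo {
    public static void main(String[] args) {
        CustomerRepository repository =
                new InMemoryCustomerRepository(Map.of(1, "ada@example.com"));
        CustomerService service = new CustomerService(repository);
        ConsoleUi ui = new ConsoleUi(service);

        System.out.println(ui.render(1)); // prints: [console] Contact: ada@example.com
        System.out.println(ui.render(2)); // prints: [console] Unknown customer
    }
}

/** Database layer: the only layer that knows how data is stored. */
interface CustomerRepository {
    Optional<String> findEmail(int customerId);
}

class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Integer, String> emails;

    InMemoryCustomerRepository(Map<Integer, String> emails) {
        this.emails = emails;
    }

    @Override
    public Optional<String> findEmail(int customerId) {
        return Optional.ofNullable(emails.get(customerId));
    }
}

/** Business processing layer: rules and wording, no storage or display details. */
class CustomerService {
    private final CustomerRepository repository;

    CustomerService(CustomerRepository repository) {
        this.repository = repository;
    }

    String contactLine(int customerId) {
        return repository.findEmail(customerId)
                .map(email -> "Contact: " + email)
                .orElse("Unknown customer");
    }
}

/** User interface layer: could be swapped for a mobile UI without touching the rest. */
class ConsoleUi {
    private final CustomerService service;

    ConsoleUi(CustomerService service) {
        this.service = service;
    }

    String render(int customerId) {
        return "[console] " + service.contactLine(customerId);
    }
}
```

Replacing `ConsoleUi` with another user interface class, or `InMemoryCustomerRepository` with a database-backed one, leaves `CustomerService` unchanged.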

Another example is to separate your system into business aspects: maintaining the customer lists (sales & marketing), tracking orders (warehouse), maintaining the parts catalog (robot overlord consultants), and more. This process can be helpful if you want to develop the different components independently due to the different needs of the departments.

The topic to explore here is systems thinking. It is a massive subject, and worth skimming a bit before deciding which areas you may want to specialize in. We have also covered design patterns, which can provide tried and tested concepts that can be reused.

Selecting Frameworks

Identifying the Challenge

As software engineers, we can be over-enthusiastic about turning to the latest technology or practice as a silver bullet that solves all our problems. Often, the more enthusiastic evangelists claim that all that has gone before was rubbish, incompetent, and now obsolete. If you don't buy into it, then you are obsolete. And if it goes wrong, it was not being used correctly.

The result, for technologies, can be layers and layers of frameworks and technologies added on top of each other until you end up with a cobbled-together aggregate mess that is fragile to change.

For instance, in this course, you saw that you could install an ORM, a JDBC driver, and a relational database and then configure all of them to persist even simple objects.

You should understand what the trade-offs are when you build persistence systems. Be pragmatic rather than idealistic. Assemble a toolbox of technologies and working practices that you can use for a particular task. And you should focus on your customers and what they need.

Exploring Solutions

When evaluating frameworks to see if you should use them for your task, here are some trade-offs to consider:

  • Learning Effort: How much time will it take to learn, install, configure, and maintain the framework? How does this compare with the savings that it promises, bearing in mind that the promises may be inflated?

  • Maintenance: Related to the above, if it takes you a while to learn it, will the people who come after you also have to spend a lot of effort learning it just to make small changes?

  • Fragility: How well does the framework handle mistakes? Some - and I'm looking at you, Spring - tend to just break with error messages that provide little clue as to what went wrong. Frameworks that have hidden or magic behaviors that cannot be traced through the code are particularly fragile.

  • Lost Capability: As you saw in the previous chapters of this part, using standard interfaces can decouple components, but also reduce the available specialist capabilities. The same applies to frameworks. How much do you need that decoupling? How much do you need that specialist capability?

  • Reuse: You can get frustrated having to write code to solve similar problems over and over, and it can be tempting to write one code set to be reusable. This is not as simple as it seems - the reused library has to remain compatible with all the situations in which it's used. Sometimes it is better to reuse designs, skills, and concepts rather than trying to maintain common, reusable code sets.  

  • Beware Silver Bullets: Silver bullets are those that magically solve a particular problem. While some frameworks claim to do this, they nearly always bring other problems. The sales team of the new technology won't help you find them. 

In general, try to be practical rather than follow some abstract rules. Going for purist answers and following some dogmatic principles may lead to aesthetically pleasing structures that look beautiful to the creator, but are impossible to use. Think about what's feasible, and map out the best solution to get there.

Let's Recap!

In this chapter, I introduced the exciting range of topics you can explore to become more proficient at persistence - and so more attractive to potential customers! We covered:

  • Multi-user databases and the concurrency topics to explore.

  • Complex objects and atomic operations to ensure data is consistent.

  • Security and the CIA acronym for the key topics of confidentiality, integrity, and availability.

  • Performance and latency, bandwidth, and data structure topics that will help you speed your data access.

  • Systems thinking as a general approach to help you understand and improve complex systems.

  • Selecting frameworks and a checklist of aspects to consider when doing so.

 Now, let's recap all that you've learned in the course!  
