Where Is Data at Rest?
In the last chapter, we discussed securing data in transmission through encryption via HTTPS and TLS.
So where does this encrypted data go?
Consider a login page. Once a username and password are entered, the information will be encrypted and passed in a query to the database that will match it to its contents.
The data at rest is the content in the database.
Typically, a database will hold sensitive data on a site that requires authentication.
We have discussed ways to encrypt data as it's being transmitted via HTTPS and TLS. How can you ensure that you're keeping the database safe?
First of all, some of you may be web developers that do not work directly on database configurations; however, it is important to learn all of these strategies because someday, you might. Also, if there are issues that can be resolved on the database side and you can resolve them, you have saved the day!
Teach the database admin a thing or two! 😁
Okay, back to saving the day.
We have looked at how to formulate queries to the database to prevent unauthorized changes and exposure of the database. But if exposure somehow happens, how can you soften the blow?
Encryption, encryption, encryption!
That is the key.
That’s a terrible pun. 😄
Let’s start with a password and talk about how you can safely store it in this day and age. It used to be acceptable to run passwords through a one-way hash algorithm like SHA1 or MD5; however, these algorithms are no longer trusted for this purpose.
Let’s Look at Hash Algorithms!
The security of these one-way hashing algorithms rests on the fact that they are not easy to invert. A rainbow table gets around this: instead of inverting a hash, an attacker looks it up in a table of precomputed hashes.
Let me show you what ‘password123’ looks like as a one-way hash:
sha1: bab1298d948ebb34bd0f3faf5e596ebc0b27c615
md5: 7576f3a00f6de47b0c72c5baf2d505b0
Brute force attacks use common password lists and keep trying until one hits. What’s really interesting about a rainbow table is that it holds the precomputed hashes of those passwords, so a stolen hash can simply be matched against the table! Clever, huh?
So now you know that a plain one-way hash on its own is not secure. A common password can always be matched to its hash.
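To make the lookup idea concrete, here is a minimal Python sketch. It is a plain precomputed lookup table, not a true rainbow table (real ones use hash chains to save space), and the password list and function names are made up for illustration.

```python
import hashlib

# Hypothetical list of common passwords; real attack lists hold millions of entries.
COMMON_PASSWORDS = ["123456", "password", "password123", "qwerty"]


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 hash of a password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Precompute hash -> password once; the table can be reused for every leaked hash.
LOOKUP_TABLE = {md5_hex(p): p for p in COMMON_PASSWORDS}


def crack(leaked_hash: str):
    """Return the matching password, or None if the hash is not in the table."""
    return LOOKUP_TABLE.get(leaked_hash)


leaked = md5_hex("password123")  # stands in for a hash read from a stolen database
print(crack(leaked))
```

The attacker never inverts the hash here. All the work was done ahead of time, which is why an unsalted hash of a common password offers so little protection.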
What is our next solution?
Salting! Nope, not an accessory on our picnic table!
Hashing Your Password
Salting is a technique used to supplement hashing to make the result less predictable. To hash a password is to run it through a one-way algorithm like SHA1 or MD5. Unlike encryption, a hash is not meant to be reversed.
Since lists of common hashed passwords such as rainbow tables exist to speed up attacks, how can you add some randomization to your hashes?
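The answer is: salt it. A salt is a random sequence that is combined with the password before hashing, which makes the resulting hash unpredictable. Two users with the same password end up with different hashes, so a precomputed table no longer matches.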
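The effect of a salt can be sketched in a few lines of Python. This only illustrates why salting defeats precomputed tables; the function name is hypothetical, and a fast hash like SHA-256 is used purely to keep the demo short, not as a recommendation for storing passwords.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> str:
    """Hash the salt followed by the password and return a hex digest."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users pick the same password, but each gets a fresh random salt.
salt_a = os.urandom(16)
salt_b = os.urandom(16)
hash_a = salted_hash("password123", salt_a)
hash_b = salted_hash("password123", salt_b)

print(hash_a)
print(hash_b)
```

The salt is not a secret. It is stored next to the hash so that the same computation can be repeated when the user logs in.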
Bcrypt and Scrypt are hash algorithms on steroids. As opposed to the SHA and MD5 algorithms, which are built to be fast, Argon2, PBKDF2, Bcrypt, and Scrypt are more focused on security than efficiency. Computing a single hash with them is deliberately resource intensive, so trying millions of guesses takes far greater effort.
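As a rough sketch of what a slow, salted hash looks like in practice, the example below uses PBKDF2 from Python's standard library. The iteration count and the function names are assumptions for illustration; a real project would pick a work factor based on its own hardware and current guidance.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this demo; tune it for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return a (salt, digest) pair to store for a new password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("password123")
print(verify_password("password123", salt, digest))
print(verify_password("letmein", salt, digest))
```

Each call repeats the underlying hash many thousands of times, which is barely noticeable for one login but very expensive for an attacker working through a password list.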
When do you hash your passwords? Is it when someone enters their password in the login page, or in the database?
What Does a Secure Database Look Like?
Authentication and Encryption
At the basic level, you’d probably want to make sure that some warlord can’t just go into the network and log into the database administrator account, right? Can you think of some best practices for authenticating to the database?
Authenticate through your operating system where possible. For example, Microsoft SQL Server is best logged into using Windows authentication.
Remove default accounts.
Use least privilege to ensure only one or two accounts have delete privileges.
Least privilege is another fundamental concept in information security. With least privilege, you ensure that everyone in the organization only has the rights on the network that they need to perform their specific job function; no more and no less.
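Remember the earlier question about when to hash passwords? If you guessed on the way into the database, you are correct! The password is transported securely using TLS, and it goes through the hashing algorithm as it is saved, so the database only ever stores the hash.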
Techniques to Hide Sensitive Data on a Database
Anonymization is a technique recommended by OWASP for hiding private data by encrypting, scrambling, or removing parts of it. For example, if a request is made for someone’s date of birth as an identifier, only the year will be provided by the database.
Pseudonymisation is a technique described in the GDPR that replaces PII with artificial identifiers and pseudonyms to hide sensitive data.
Data minimization helps a business process abide by the GDPR ruleset. The business should only request, store, and process the PII that it actually requires. In other words, any PII requested must have a solid business reason.
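The three techniques above can be sketched together in Python. Everything here is illustrative: the field names, the sample record, and the secret key are assumptions, and using a keyed HMAC as the pseudonym is just one possible approach.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical key; in practice it would be stored outside the database.
SECRET_KEY = b"replace-with-a-key-stored-outside-the-database"


def anonymize_birthdate(birthdate: date) -> int:
    """Anonymization: keep only the year of a date of birth."""
    return birthdate.year


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Pseudonymisation: replace a value with a stable artificial identifier."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def minimize(record: dict, needed_fields: set) -> dict:
    """Data minimization: drop every field without a business reason."""
    return {k: v for k, v in record.items() if k in needed_fields}


record = {"email": "jane@example.com", "birthdate": date(1990, 4, 12), "shoe_size": 38}
kept = minimize(record, {"email", "birthdate"})
safe = {
    "user_id": pseudonymize(kept["email"]),
    "birth_year": anonymize_birthdate(kept["birthdate"]),
}
print(safe)
```

The same email always maps to the same pseudonym, so records can still be linked together without exposing the address itself.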
There Are Two Ways to Apply These Techniques to Your Database
Dynamic data masking is a way to apply anonymization rules on the columns of data that are sensitive. When a request is made to retrieve the sensitive data from the masked columns, it will not appear in full form. This data can also be hidden from database administrators in roles that block access to sensitive data.
On SQL Server, the MASKED WITH option applies different masking functions. The default one takes the sensitive value and replaces it with another value when it's displayed. The example below shows its application on a Birthdate column.
Birthdate DATE
MASKED WITH (FUNCTION = 'default()') NOT NULL,
MongoDB has an option to use a module called Mongo Mask on the web application front end. It would be imported in Express like this: var mongoMask = require('mongo-mask').
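The idea behind dynamic data masking can also be sketched at the application level in Python. This is not how SQL Server or Mongo Mask implement it; the column names, the replacement values, and the can_unmask flag are assumptions, loosely modeled on a default-style mask that swaps each value for a fixed placeholder of the same type.

```python
from datetime import date

MASKED_COLUMNS = {"birthdate", "email"}  # hypothetical sensitive columns


def mask_value(value):
    """Replace a sensitive value with a fixed placeholder of the same type."""
    if isinstance(value, date):
        return date(1900, 1, 1)
    if isinstance(value, str):
        return "xxxx"
    return 0


def read_row(row: dict, can_unmask: bool) -> dict:
    """Return the row as stored, or with sensitive columns masked on the way out."""
    if can_unmask:
        return dict(row)
    return {k: mask_value(v) if k in MASKED_COLUMNS else v for k, v in row.items()}


row = {"id": 7, "email": "jane@example.com", "birthdate": date(1990, 4, 12)}
print(read_row(row, can_unmask=False))
print(read_row(row, can_unmask=True))
```

The stored data never changes. Only what a given reader gets back is altered, which is what makes the masking dynamic.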
Clone and generate data masking uses more than one database to serve data. A second database with the same schema is used for retrieval, and its sensitive columns already contain masked data.
On SQL Server, SQL Clone and SQL Data Generator (both Redgate tools) can be used to create another database with generated data to mask the sensitive data when retrieved. As opposed to dynamic data masking, this is not active masking. It is query redirection to another database in which generated data replaces the sensitive data.
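To illustrate the clone-and-generate idea without those tools, here is a small Python sketch using in-memory SQLite databases. The users table, its columns, and the generated values are all assumptions made up for the example.

```python
import random
import sqlite3
from datetime import date, timedelta

SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, birthdate TEXT)"


def random_birthdate(rng: random.Random) -> str:
    """Generate a plausible but fake date of birth as an ISO string."""
    return (date(1950, 1, 1) + timedelta(days=rng.randrange(20000))).isoformat()


def build_masked_clone(source: sqlite3.Connection, rng: random.Random) -> sqlite3.Connection:
    """Copy the schema and row ids, replacing sensitive columns with generated data."""
    clone = sqlite3.connect(":memory:")
    clone.execute(SCHEMA)
    for (user_id,) in source.execute("SELECT id FROM users").fetchall():
        clone.execute(
            "INSERT INTO users VALUES (?, ?, ?)",
            (user_id, f"User {user_id}", random_birthdate(rng)),
        )
    clone.commit()
    return clone


source = sqlite3.connect(":memory:")
source.execute(SCHEMA)
source.execute("INSERT INTO users VALUES (1, 'Jane Doe', '1990-04-12')")
source.commit()

clone = build_masked_clone(source, random.Random(0))
print(clone.execute("SELECT * FROM users").fetchall())
```

Queries from people who should not see real data would be pointed at the clone, so the real values never leave the source database.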
Let’s Recap!
Secure your database with encryption.
DO: Use hash algorithms that are secure such as Argon2, Scrypt, Bcrypt, and PBKDF2.
DON’T: Use just SHA and MD5 hash algorithms without a random salt.
Rainbow tables let attackers match stolen hashes against precomputed hashes of common passwords!
Data masking can be used to secure sensitive data on the database.