Identify Anomalies
Welcome to part three. We’re fast approaching the finish line!
The final stage for the dashboard is to calculate the applicant’s borrower score based on Global Bank’s lending criteria. For this, you’ll turn to the family-circumstance data table: this will tell you about the applicant’s ability to repay the mortgage.
However, Jane has mentioned that this table contains anomalies as a result of erroneous manual data entry. Using the data tab on your dashboard, you can see, for example, that Tigrane Lefebvre and Alexis Chevalier have 30 and 20 children, respectively, which is highly unlikely! 🤣
Furthermore, there are redundancies in the marital status categories, as there are two possible options for the same status: “Civil Union” and “Civil Partnership”.
In the next video, you’ll see how to quickly spot anomalies using a Power BI tool called Power Query Editor.
Correct Anomalies Before Importing
Once anomalies have been identified in the family-circumstance table, you need to correct them.
These are named quite intuitively and often quite familiar if you’re used to using other Microsoft Office products. Let’s take a look at these features together:
Transform Table
Group by | Reduces the table by grouping dimensions according to a rule (e.g., counting rows, only taking the maximum, etc.). |
First row as header | Makes the first line the column header. |
Transpose | Transposes rows into columns and vice versa. |
Reverse rows | Sorts rows in reverse order. |
Count rows | Turns your table into a simple tally of the number of rows. |
Transform Column
Data type/Detect types | Change your data type (text, date, number, etc.). |
Rename | Rename your column header. |
Replace values | Replace one value for another in certain columns. |
Fill | Automatically fill empty cells with the value from the non-empty cell before it. |
Pivot/Unpivot columns | Unpivot your table (useful for pivot tables), turning your columns into attribute-value pairs, where columns become rows. (https://docs.microsoft.com/en-us/power-query/unpivot-column) |
Move columns | Move your columns left or right. |
Convert to List | Convert a column from your table into a list. |
Text Transformation
Split column | Split one column into several by a delimiter. |
Change format | Add a prefix or suffix to your data, or apply text transformations (upper case/lower case, etc.). |
Merge columns | Merge several columns into one with a separator. |
Extract characters | Extract part of the text from each row in a column (set number of characters, etc.). |
Parse | Filters text for XML or JSON coded text. |
Number Functions
Statistics | Applies a statistical formula to your columns (calculating the average, etc.). |
Standard | Applies a mathematical calculation to rows in your column (adding/subtracting a number, etc.). |
Scientific | Applies a mathematical transformation to rows in your column such as absolute value or powers, etc. |
Trigonometry | Same as above, but for trigonometry (sin, cos, etc.) |
Rounding | Rounds your data up or down |
Information | Turns the rows in your column into a Boolean to answer a question such as “Is this value even or odd?” |
Date and Time Functions
Date | Applies date transformations (displaying the year/quarter, etc.). |
Time | Applies operations to the times on your dates. |
Length | Same as above. |
In the next video, you’ll see how to use the replace-values function to correct your data.
Save Time With M
As you just saw in the video, when you apply error-correction operations, a new step appears in Power Query Editor, to the right: Replace Values.
When you select this, you’ll see some programming language in the formula bar:
=Table.ReplaceValue(#"Replaced Value1”,"Civil Partnership","Civil Union",Replacer.ReplaceText,{"Family Circumstance"})
Table.ReplaceValue
is the replacement function, to which the following parameters are passed:
§s table:
#”Replaced Value1”
, is the name of the final step applied to the family circumstance table. This step was created automatically by Power Query to assign the right data type to each of the columns.The old value
“Unmarried”
and the replacement value“Single”
.The replacement operation to be carried out:
Replacer.ReplaceText
.The column to which the function should be applied:
“Family circumstances”
.
You can use Power Query Editor’s formula bar yourself to save time when making your queries.
Knowing certain formulas can also help you deal with errors:
Date formulas | Text formulas | Number formulas | Table formulas |
|
|
|
|
I’ve listed some of the most common functions in the table above. Their names are fairly self-explanatory, but you’ll find more information in this comprehensive list of available functions from Microsoft’s documentation that I’d encourage you to take a look at.
Just to check you’re following okay, I’ve got a quick question for you:
What result do we get from the formula Text.Clean("Power#(lf)BI"
?
Power#BI
PowerlfBI
PowerBI
The correct answer is number 3.
Let’s Recap!
In Power Query Editor, you can identify errors in your data using profiling.
To correct these errors, the Transform menu contains various functions such as
ReplaceValue
. With these, you can automatically apply transformation operations to your data.M language is the formula language used by Power BI to perform transformations on queries.
Knowing a few M language formulas may help you to debug your data-preparation operations in case of any errors, and save time.
In the next chapter, you’ll see how to connect your data tables to carry out cross-tab analysis!