- 10 hours
- Medium
Last updated on 4/24/20
Reduce Dimensions in your Data Using Principal Component Analysis
Evaluated skills
- Carry out a principal component analysis
Description
For questions 1 to 6, you will be carrying out a principal component analysis on the wine quality dataset.
Before we dive in, let's import the libraries we need using the following code:
import pandas as pd
import numpy as np
from functions import *
Now, let's load the data into a pandas DataFrame called original_data:
original_data = pd.read_csv('winequality-red.csv')
original_data.head()
Question 1
Which of the variables should you keep for your analysis?
Careful, there are several correct answers.
Only fixed acidity, volatile acidity and citric acid because they are the most relevant to our analysis
All of them, because they are all quantitative variables
All of them, because they all look relevant to our analysis of wine
Only fixed acidity, volatile acidity and citric acid because they are the only qualitative variables
Question 2
How many nulls does our data contain?
1
3
0
2
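One way to count nulls in a data frame is sketched below. It runs on a small made-up frame named sample (a hypothetical stand-in, not the wine data), so its output says nothing about the answer to this question; the same calls could be applied to original_data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for original_data; the values are invented for illustration.
sample = pd.DataFrame({
    "fixed acidity": [7.4, 7.8, np.nan, 11.2],
    "volatile acidity": [0.70, 0.88, 0.76, np.nan],
    "citric acid": [0.00, 0.00, 0.04, 0.56],
})

# isnull() marks each missing cell as True; sum() then counts them per column.
nulls_per_column = sample.isnull().sum()

# Summing the per-column counts gives the total for the whole frame.
total_nulls = int(nulls_per_column.sum())

print(nulls_per_column)
print("Total nulls:", total_nulls)
```

The per-column counts are often more useful than the total, because they show which variables would need cleaning.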
Question 3
After cleaning and preparing the data, you feel ready to carry out a PCA, but a colleague recommends you use the describe() method first. Why would you do this?
Careful, there are several correct answers.
To check the range of values for the variables.
To decide which variables need to be normalized.
To evaluate the confidence interval of the variables.
To select the variables for the PCA.
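For reference, here is a minimal sketch of what describe() returns, again using a small invented frame named sample rather than the wine data. The column names are borrowed from the dataset only for readability, and the numbers are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for original_data; the values are invented for illustration.
sample = pd.DataFrame({
    "fixed acidity": [7.4, 7.8, 7.8, 11.2],
    "total sulfur dioxide": [34.0, 67.0, 54.0, 60.0],
})

# describe() returns count, mean, std, min, quartiles and max for each numeric column.
summary = sample.describe()

# The spread between min and max shows how different the scales of the columns are.
value_range = summary.loc["max"] - summary.loc["min"]

print(summary)
print(value_range)
```

Comparing the rows of the summary across columns gives a quick view of how each variable is scaled before any further processing.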