Data Preprocessing and Feature Engineering
How is the Cabin attribute handled during preprocessing in this experiment?
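The sheet does not state the experiment's choice, but because Cabin in the Titanic data is mostly missing, a common handling is to keep only a binary "cabin recorded" flag and drop the raw column. A minimal sketch on hypothetical rows:

```python
import pandas as pd

# Tiny stand-in for the Titanic data (hypothetical values).
df = pd.DataFrame({"Cabin": ["C85", None, None, "E46", None]})

# Cabin is sparse, so keep a presence flag and drop the raw column.
df["HasCabin"] = df["Cabin"].notna().astype(int)
df = df.drop(columns=["Cabin"])
```

The flag preserves the (often predictive) fact that a cabin was recorded at all, without inventing cabin values.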
Suppose a numerical attribute contains several extreme outliers. Which imputation method would generally be more appropriate than mean imputation?
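Median imputation is the usual answer here, because the median is robust to extreme values while the mean is not. A sketch with hypothetical fares containing one outlier and one missing entry:

```python
from statistics import mean, median

# Hypothetical fare column with one extreme outlier and a gap.
fares = [7.25, None, 8.05, 7.90, 512.33, 8.46]
observed = [f for f in fares if f is not None]

# The outlier inflates the mean far above typical fares,
# so the median is the safer fill value.
mean_fill = mean(observed)      # ~108.8, not representative
median_fill = median(observed)  # 8.05, close to typical fares

imputed = [median_fill if f is None else f for f in fares]
```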
Why was the mode (most frequent value) used to fill missing values in the Embarked attribute?
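Because Embarked is categorical, a mean or median is undefined; the mode is the natural fill. A minimal sketch with hypothetical port codes:

```python
from collections import Counter

# Hypothetical Embarked values with missing entries.
embarked = ["S", "C", None, "S", "Q", "S", None, "C"]
observed = [e for e in embarked if e is not None]

# Mode imputation: fill gaps with the most frequent observed category.
mode_value = Counter(observed).most_common(1)[0][0]
filled = [mode_value if e is None else e for e in embarked]
```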
Why must categorical variables such as Sex be encoded before applying machine learning algorithms?
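Most learning algorithms operate on numeric matrices, so string categories must first be mapped to numbers. A minimal sketch (the specific 0/1 mapping is an illustrative assumption):

```python
# Hypothetical Sex column; algorithms need numeric input,
# so map each category to an integer (or one-hot encode it).
sex = ["male", "female", "female", "male"]
mapping = {"male": 0, "female": 1}
sex_encoded = [mapping[s] for s in sex]
```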
Which potential issue may arise when using One-Hot Encoding for categorical variables with many unique categories?
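The usual issue is dimensionality explosion: one new column per unique category. A sketch on a hypothetical high-cardinality column (ticket-like codes):

```python
import pandas as pd

# Hypothetical high-cardinality column: almost every code is unique.
df = pd.DataFrame({"Ticket": ["A1", "B2", "C3", "D4", "A1", "E5"]})

# One-hot encoding creates one column per unique category:
# 5 columns for just 6 rows here, so a column with thousands of
# categories would blow up the feature space into a sparse matrix.
dummies = pd.get_dummies(df["Ticket"], prefix="Ticket")
```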
Why is normalization applied to numerical features during preprocessing?
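Normalization puts features on a comparable scale so that no feature dominates purely because of its units. A min-max sketch on hypothetical ages:

```python
# Min-max normalization rescales values to [0, 1].
ages = [22.0, 38.0, 26.0, 35.0, 80.0]
lo, hi = min(ages), max(ages)
ages_scaled = [(a - lo) / (hi - lo) for a in ages]
```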
Which type of machine learning algorithm is most sensitive to feature scaling?
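Distance-based algorithms such as k-nearest neighbors (and margin-based ones such as SVMs) are the classic answer. A sketch showing why, with two hypothetical passengers described by (age, fare):

```python
from math import dist

# Fare spans a far wider range than age.
a = (22.0, 7.25)
b = (24.0, 512.33)

# Without scaling, the Euclidean distance is dominated entirely by
# the fare difference; the age difference contributes almost nothing,
# which is why distance-based methods like k-NN need scaled features.
d = dist(a, b)
```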
Which visualization technique is most suitable for identifying outliers in numerical data?
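The box plot is the standard answer; its whiskers are the 1.5×IQR fences, and points beyond them are drawn as outliers. The same rule can be computed directly, as a sketch on hypothetical fares:

```python
from statistics import quantiles

# The IQR fences below are exactly what a box plot's whiskers show;
# values beyond them appear as outlier dots.
fares = [7.25, 7.90, 8.05, 8.46, 9.35, 10.50, 512.33]
q1, _, q3 = quantiles(fares, n=4)
iqr = q3 - q1
outliers = [f for f in fares if f < q1 - 1.5 * iqr or f > q3 + 1.5 * iqr]
```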
Why is data visualization performed after preprocessing in this experiment?
After completing preprocessing and feature engineering, what is the next logical step in a machine learning workflow?
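The typical next step is to split the data into training and test sets, then train and evaluate a model on them. A dependency-free sketch of the hold-out split on a hypothetical preprocessed dataset:

```python
import random

# Hypothetical preprocessed dataset: (feature_vector, label) pairs.
data = [([i / 100.0, i % 2], i % 2) for i in range(100)]

# Hold out 20% as a test set before any model training.
random.seed(42)
random.shuffle(data)
cut = int(0.8 * len(data))
train, test = data[:cut], data[cut:]
```

In practice this split (and the subsequent model fitting) is usually done with a library helper such as scikit-learn's `train_test_split`.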