Mastering Missing Data: Choosing Between Mean, Median, Mode, or Drop Quiz

Missing Data Basics
When you encounter missing values in a dataset, which strategy involves removing entire rows that contain missing values?
1. A. Dropping
2. B. Averaging
3. C. Interpolating
4. D. Filling with minimum
5. E. Standardizing
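As a minimal sketch of what removing incomplete rows looks like in practice, the snippet below filters a small list of records. The records, the field names, and the helper `drop_incomplete` are hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical records; None is assumed to mark a missing value.
rows = [
    {"name": "Ana", "age": 34, "city": "Lima"},
    {"name": "Ben", "age": None, "city": "Oslo"},
    {"name": "Cy", "age": 29, "city": None},
    {"name": "Dee", "age": 41, "city": "Pune"},
]


def drop_incomplete(records):
    """Keep only the records in which no field is None."""
    return [r for r in records if all(v is not None for v in r.values())]


complete = drop_incomplete(rows)
print([r["name"] for r in complete])
```

Only the records with every field present survive; a record is discarded even when just one of its fields is missing.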
Mean Imputation
If a column of numbers has missing values, which method replaces the missing values with the arithmetic average of the existing data?
1. A. Mode imputation
2. B. Mean imputation
3. C. Median replacement
4. D. Maximum imputation
5. E. Random sampling
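To make the arithmetic-average replacement concrete, here is a small sketch using only the Python standard library. The column values and the helper `impute_mean` are made up for illustration, and `None` is assumed to mark a missing entry.

```python
from statistics import mean

# Hypothetical numeric column; None is assumed to mark a missing entry.
values = [10.0, None, 14.0, 12.0, None]


def impute_mean(column):
    """Replace each None with the average of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


filled = impute_mean(values)
print(filled)
```

The average is computed from the observed values only, so the missing entries do not influence the fill value.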
Median Replacement
For a dataset containing outliers, which method is most robust: replacing missing values with the mean, median, or mode?
1. A. Mean
2. B. Minimum
3. C. Median
4. D. Mode
5. E. All give the same result
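One way to compare candidate fill values is to compute them on data that contains a single extreme value. The sketch below does this for a hypothetical income column (the numbers are invented, and `None` is assumed to mark the missing entry).

```python
from statistics import mean, median

# Hypothetical incomes in thousands; 1000 is a deliberate outlier.
incomes = [30, 32, 35, None, 38, 40, 1000]

observed = [v for v in incomes if v is not None]

mean_fill = mean(observed)      # pulled toward the outlier
median_fill = median(observed)  # middle of the sorted observed values

print(f"mean fill:   {mean_fill:.1f}")
print(f"median fill: {median_fill:.1f}")
```

Comparing each fill value with the typical observations (30 to 40 here) shows how strongly a single extreme value affects it.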
Mode for Categorical Data
When handling missing values in a categorical column (e.g., color: red, blue, green), which imputation method is most appropriate?
1. A. Median
2. B. Mean
3. C. Mode
4. D. Drop the column
5. E. Use next value
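Categories such as colors have no numeric order, so a fill value has to be one of the labels themselves. The following sketch fills gaps with the most frequent label; the color list is hypothetical and `None` is assumed to mark a missing entry.

```python
from statistics import mode

# Hypothetical categorical column; None is assumed to mark a missing entry.
colors = ["red", "blue", None, "red", "green", None, "red"]

observed = [c for c in colors if c is not None]
most_common = mode(observed)  # most frequent label among observed values

filled = [most_common if c is None else c for c in colors]
print(filled)
```

Note that `statistics.mode` returns the first-encountered label when several labels tie for most frequent (Python 3.8 and later), so ties may deserve a closer look.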
Imputation vs. Deletion
If a dataset has only a few missing values, which action generally preserves more of the data: imputing or dropping?
1. A. Dropping
2. B. Imputing
3. C. Replacing all
4. D. Ignoring missing
5. E. Normalizing
Mean Weakness
Why might replacing missing values with the mean not be the best choice in a skewed dataset?
1. A. Mean always equals median
2. B. Mean is sensitive to outliers
3. C. Mean is always higher
4. D. Mean ignores missing values
5. E. Mean is for categorical data
Unique Situations
If every value in a column is missing, what is the most logical action?
1. A. Fill with mode
2. B. Replace with zeros
3. C. Drop the column
4. D. Forward fill
5. E. Fill with random values
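A column with no observed values gives mean, median, and mode nothing to compute from. As a sketch of how such columns could be detected and removed, the snippet below uses a hypothetical column-oriented table and a made-up helper `drop_empty_columns`; `None` is assumed to mark a missing value.

```python
# Hypothetical table stored as column name -> list of values.
table = {
    "id": [1, 2, 3],
    "score": [88, None, 92],
    "notes": [None, None, None],  # no observed values at all
}


def drop_empty_columns(columns):
    """Keep only the columns that contain at least one observed value."""
    return {
        name: values
        for name, values in columns.items()
        if any(v is not None for v in values)
    }


cleaned = drop_empty_columns(table)
print(list(cleaned))
```

A partially missing column such as `score` is kept, since its observed values can still inform an imputation.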
Continuous vs. Categorical
Which method is least appropriate for dealing with missing data in a continuous numerical variable?
1. A. Mean imputation
2. B. Median infill
3. C. Zero replacement
4. D. Mode imputation
5. E. Interpolation
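Interpolation is the only option above that uses the position of a gap rather than a single summary value. A minimal sketch of linear interpolation follows, assuming evenly spaced, ordered readings; the temperature series and the helper `interpolate_linear` are hypothetical, and `None` is assumed to mark a missing reading.

```python
def interpolate_linear(series):
    """Fill interior None gaps on a straight line between the nearest
    observed neighbours. Leading and trailing gaps are left as None."""
    result = list(series)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


# Hypothetical, evenly spaced temperature readings.
temps = [20.0, None, None, 26.0, None, 30.0]
print(interpolate_linear(temps))
```

This only makes sense when the rows have a meaningful order, such as time; for unordered data a summary statistic is the usual fallback.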
Consequence of Dropping
What is a potential downside of dropping all rows with missing data from your dataset?
1. A. Increased accuracy
2. B. Reduced sample size
3. C. Less missing data
4. D. More outliers
5. E. Extra variables
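The effect of dropping rows can be larger than the share of missing cells suggests, because one missing cell removes the whole row. The sketch below counts both on a small hypothetical table (rows as tuples, `None` assumed to mark a missing cell).

```python
# Hypothetical table: each tuple is one row with three measurements.
rows = [
    (1.0, 2.0, 3.0),
    (None, 2.0, 3.0),
    (1.0, None, 3.0),
    (1.0, 2.0, None),
    (1.0, 2.0, 3.0),
]

complete = [r for r in rows if None not in r]

missing_cells = sum(v is None for r in rows for v in r)
total_cells = sum(len(r) for r in rows)
rows_lost = len(rows) - len(complete)

print(f"missing cells: {missing_cells} of {total_cells}")
print(f"rows lost:     {rows_lost} of {len(rows)}")
```

In this made-up table a fifth of the cells are missing, yet dropping incomplete rows discards three of the five rows.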
Real-life Example
Suppose a dataset records student scores, and some scores are missing. Which fill method would be distorted the most if one student scored much higher than the rest?
1. A. Drop missing scores
2. B. Fill with mode
3. C. Fill with mean
4. D. Fill with median
5. E. Fill with minimum
