Imputation Basics
Which method replaces missing numeric data with the most common value in a column?
- Mode imputation
- Median imputation
- Mean imputation
- Forward fill
- Drop row
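For reference, a minimal sketch of filling gaps with a column's most common value. The `score` column and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical numeric column with two missing entries
scores = pd.Series([3.0, 5.0, 3.0, None, 8.0, None], name="score")

# mode() returns a Series because ties are possible; take the first value
most_common = scores.mode().iloc[0]

# Replace every missing entry with that value
filled = scores.fillna(most_common)
```

When several values tie for most frequent, `mode()` returns all of them in sorted order, so taking the first one is a simplifying choice rather than a rule.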
Handling Categorical Missing Values
What is a commonly used placeholder for a missing value in a categorical feature?
- 'Unknown'
- 'Mean'
- '333'
- 'Nullify'
- 'Random'
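A short sketch of placeholder filling for a categorical column, assuming a hypothetical `color` column. The placeholder label itself is a convention, not something pandas requires.

```python
import pandas as pd

# Hypothetical categorical column with missing entries
colors = pd.Series(["red", None, "blue", None], name="color")

# Use an explicit label so "missing" becomes its own category
colors_filled = colors.fillna("Unknown")
```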
Recognizing Inconsistent Data
A country column contains the values 'USA', 'U.S.A.', 'United States', and 'usa'. What best describes this scenario?
- Inconsistent data formatting
- Complete missing data
- Properly encoded dataset
- Noisy numeric data
- Type conversion error
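One possible way to reconcile such variants is to normalize the text first and then map it to a single label. The mapping below is a hypothetical example, not an exhaustive list.

```python
import pandas as pd

countries = pd.Series(["USA", "U.S.A.", "United States", "usa"])

# Normalize: lower-case, drop periods, trim surrounding whitespace
normalized = (
    countries.str.lower()
    .str.replace(".", "", regex=False)
    .str.strip()
)

# Hypothetical mapping from normalized spellings to one canonical label
canonical = {"usa": "USA", "united states": "USA"}
cleaned = normalized.map(canonical)
```

Note that `map` yields NaN for any spelling not present in the dictionary, which makes leftover variants easy to spot.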
Detecting Missing Data with Pandas
Which Pandas function would you use to detect missing values in a DataFrame?
- isnull()
- missing()
- fillna()
- imputate()
- dropdf()
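A brief sketch of missing-value detection on a made-up DataFrame. The column names are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column
df = pd.DataFrame({
    "age": [25, np.nan, 40],
    "city": ["Oslo", "Lima", None],
})

# Boolean mask: True wherever a value is missing
mask = df.isnull()

# Summing the mask counts missing values per column
missing_per_column = mask.sum()
```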
Dropping Rows or Columns
In Pandas, which function is used to remove all rows with at least one missing value?
- dropna()
- removerows()
- trmna()
- isnotna()
- clear()
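The sketch below shows row-wise and column-wise dropping on a small hypothetical DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "id" is complete, the other columns have gaps
df = pd.DataFrame({
    "id": [1, 2, 3],
    "age": [25, np.nan, 40],
    "city": ["Oslo", "Lima", None],
})

# Default behavior: drop every row that has at least one missing value
rows_kept = df.dropna()

# Alternatively, drop every column that has at least one missing value
cols_kept = df.dropna(axis="columns")
```

Both calls return new objects and leave `df` unchanged.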
Forward Fill Usage
What does the 'ffill' method do when handling missing data in a time series?
- It fills missing values with the previous non-null value.
- It replaces missing values with zeros.
- It drops all remaining nulls at the end of the data.
- It duplicates the next valid entry.
- It fills missing values with random values.
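A small sketch of forward fill on a hypothetical daily series. The dates and readings are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with gaps, including one at the very start
readings = pd.Series(
    [np.nan, 10.0, np.nan, np.nan, 12.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

# Carry the last valid observation forward into each gap
filled = readings.ffill()
```

A gap at the start of the series stays missing, because there is no earlier value to carry forward.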
Numeric Data Imputation
If you want to minimize the effect of outliers when filling missing numeric values, which method should you use?
- Median imputation
- Mean imputation
- Mode imputation
- Random sample imputation
- Zero imputation
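To see why outliers matter here, the following sketch compares two fill values on a made-up column containing one extreme entry.

```python
import pandas as pd

# Hypothetical incomes (in thousands) with one extreme value and one gap
incomes = pd.Series([30.0, 32.0, 35.0, None, 1000.0])

mean_value = incomes.mean()      # pulled upward by the 1000.0 entry
median_value = incomes.median()  # middle of the observed values

filled_with_median = incomes.fillna(median_value)
```

In this example the mean is 274.25 while the median is 33.5, so the median fill stays close to the typical observations.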
Data Consistency
Given a dataset in which the gender column contains the entries 'M', 'F', and 'Femail', what is the correct way to make the data consistent?
- Standardize entries like 'Femail' to 'F'
- Leave all as is
- Replace all 'M' and 'F' with 'Unknown'
- Remove all rows with 'Femail'
- Ignore inconsistencies
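A minimal sketch of correcting a known misspelling, assuming a hypothetical gender column.

```python
import pandas as pd

# Hypothetical column containing one misspelled entry
gender = pd.Series(["M", "F", "Femail", "F"])

# Map the known misspelling to the standard code; other values are untouched
gender_clean = gender.replace({"Femail": "F"})
```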
Advanced Imputation
You decide to use KNN imputation on missing values. What does KNN imputation primarily rely on?
- Similarity to nearby data points
- Random guessing
- Filling with overall mean
- Dropping rows with nulls
- Reversing the data columns
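As a rough illustration, the sketch below uses scikit-learn's `KNNImputer`, assuming scikit-learn is installed. The data and the choice of `n_neighbors=2` are arbitrary.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical data: the second row is missing its second feature
X = np.array([
    [1.0, 2.0],
    [1.1, np.nan],
    [8.0, 9.0],
    [1.2, 2.2],
])

# Fill each gap with the average of that feature over the 2 nearest rows
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)
```

Here the incomplete row sits closest to the first and last rows, so its missing value is filled with roughly the average of 2.0 and 2.2 rather than being influenced by the distant third row.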
Identifying Incomplete Rows
Which code snippet identifies rows in a Pandas DataFrame 'df' that contain any missing values?
- df[df.isnull().any(axis=1)]
- df[df.notnull().all(axis=0)]
- df.fillna(df.mean())
- df[df.empty()]
- df.remove(nan=True)
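A short sketch of selecting incomplete rows with a boolean mask, using a hypothetical DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 1 and 2 each have one missing value
df = pd.DataFrame({
    "id": [1, 2, 3],
    "age": [25, np.nan, 40],
    "city": ["Oslo", "Lima", None],
})

# any(axis=1) checks across the columns of each row
incomplete_rows = df[df.isnull().any(axis=1)]
```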