Quiz: Mastering Handling Missing or Inconsistent Data in Datasets

  1. Imputation Basics

    Which method replaces missing numeric data with the most common value in a column?

    1. Mode imputation
    2. Median imputation
    3. Mean imputation
    4. Forward fill
    5. Drop row
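For reference, a minimal sketch of mode imputation in pandas. The Series `s` and its values are made up for illustration; the point is only that the most frequent observed value is used as the fill.

```python
import pandas as pd

# Hypothetical numeric column with two missing entries.
s = pd.Series([3.0, 5.0, 5.0, None, 7.0, None])

# mode() ignores missing values and returns a Series (ties are possible),
# so take the first entry as the fill value.
most_common = s.mode().iloc[0]
filled = s.fillna(most_common)
```

When several values tie for most frequent, `mode()` returns all of them, so the choice of `.iloc[0]` is a simplifying assumption.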
  2. Handling Categorical Missing Values

    What is a commonly used placeholder for a missing value in a categorical feature?

    1. 'Unknown'
    2. 'Mean'
    3. '333'
    4. 'Nullify'
    5. 'Random'
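A small sketch of filling a categorical column with an explicit placeholder label. The DataFrame and the `color` column are hypothetical; the placeholder string itself is a convention, not a pandas requirement.

```python
import pandas as pd

# Hypothetical categorical column with missing entries.
df = pd.DataFrame({"color": ["red", None, "blue", None]})

# Replace missing categories with an explicit label so that
# "missing" becomes its own visible category.
df["color"] = df["color"].fillna("Unknown")
```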
  3. Recognizing Inconsistent Data

    A country column contains the values 'USA', 'U.S.A.', 'United States', and 'usa'. What best describes this scenario?

    1. Inconsistent data formatting
    2. Complete missing data
    3. Properly encoded dataset
    4. Noisy numeric data
    5. Type conversion error
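One possible way to reconcile spelling variants like these, sketched with pandas string methods. The mapping of variants to a single label is an assumption made for this example; a real dataset would need its own mapping.

```python
import pandas as pd

countries = pd.Series(["USA", "U.S.A.", "United States", "usa"])

# Normalize case, punctuation, and surrounding whitespace first...
cleaned = countries.str.lower().str.replace(".", "", regex=False).str.strip()

# ...then map the remaining known variants to one canonical label.
canonical = cleaned.replace({"usa": "USA", "united states": "USA"})
```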
  4. Detecting Missing Data with Pandas

    Which Pandas function would you use to detect missing values in a DataFrame?

    1. isnull()
    2. missing()
    3. fillna()
    4. imputate()
    5. dropdf()
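A brief sketch of missing-value detection on a made-up DataFrame. `isnull()` returns a boolean mask of the same shape, and summing that mask gives a per-column count; `isna()` is an alias for the same method.

```python
import pandas as pd
import numpy as np

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"age": [25, np.nan, 31], "city": ["Oslo", "Lima", None]})

mask = df.isnull()               # True where a value is missing
missing_per_column = mask.sum()  # count of missing values per column
```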
  5. Dropping Rows or Columns

    In Pandas, which function is used to remove all rows with at least one missing value?

    1. dropna()
    2. removerows()
    3. trmna()
    4. isnotna()
    5. clear()
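A short sketch of dropping incomplete rows or columns, using an invented three-column DataFrame. Note that `dropna()` returns a new object and leaves `df` unchanged unless the result is assigned back.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, np.nan],
    "c": [7.0, 8.0, 9.0],
})

rows_dropped = df.dropna()        # drop rows containing any missing value
cols_dropped = df.dropna(axis=1)  # drop columns containing any missing value
```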
  6. Forward Fill Usage

    What does the 'ffill' method do when handling missing data in a time series?

    1. It fills missing values with the previous non-null value.
    2. It replaces missing values with zeros.
    3. It drops all remaining nulls at the end of the data.
    4. It duplicates the next valid entry.
    5. It fills missing values with random values.
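A minimal forward-fill sketch on a hypothetical daily series. One detail worth noticing is that a gap at the very start has no earlier value to carry forward, so it stays missing.

```python
import pandas as pd
import numpy as np

# Hypothetical daily readings with gaps.
idx = pd.date_range("2024-01-01", periods=5, freq="D")
ts = pd.Series([np.nan, 10.0, np.nan, np.nan, 13.0], index=idx)

# Carry the last observed value forward into each gap.
filled = ts.ffill()
```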
  7. Numeric Data Imputation

    If you want to minimize the effect of outliers when filling missing numeric values, which method should you use?

    1. Median imputation
    2. Mean imputation
    3. Mode imputation
    4. Random sample imputation
    5. Zero imputation
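To see how an outlier affects the fill value, here is a small comparison on invented income figures. The single extreme value pulls the mean far from the typical range, while the median barely moves.

```python
import pandas as pd
import numpy as np

# Hypothetical incomes with one extreme value and one missing entry.
income = pd.Series([30.0, 32.0, 35.0, 1000.0, np.nan])

mean_value = income.mean()      # 274.25, dominated by the outlier
median_value = income.median()  # 33.5, close to the typical values

filled = income.fillna(median_value)
```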
  8. Data Consistency

    Given a dataset with 'M', 'F', and 'Femail' as entries for gender, what is the best way to make the data consistent?

    1. Standardize entries like 'Femail' to 'F'
    2. Leave all as is
    3. Replace all 'M' and 'F' with 'Unknown'
    4. Remove all rows with 'Femail'
    5. Ignore inconsistencies
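A sketch of correcting a known misspelling with an explicit mapping, followed by a check for any labels that are still unexpected. The Series and the allowed set `{"M", "F"}` are assumptions for this example.

```python
import pandas as pd

gender = pd.Series(["M", "F", "Femail", "F", "M"])

# Map each known bad entry to its intended label.
standardized = gender.replace({"Femail": "F"})

# Anything outside the allowed labels still needs attention.
unexpected = set(standardized.unique()) - {"M", "F"}
```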
  9. Advanced Imputation

    You decide to use KNN imputation on missing values. What does KNN imputation primarily rely on?

    1. Similarity to nearby data points
    2. Random guessing
    3. Filling with overall mean
    4. Dropping rows with nulls
    5. Reversing the data columns
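The idea behind KNN imputation can be sketched in plain NumPy. The helper `knn_impute` below is a hypothetical, simplified illustration written for this quiz, not a library function: it borrows values only from fully observed rows and assumes at least `k` such rows exist. For real work, scikit-learn provides `KNNImputer`.

```python
import numpy as np

def knn_impute(X, k=2):
    """Fill NaNs in each row with the mean of the k most similar complete rows."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    # Donor rows: those with no missing values.
    complete = X[~np.isnan(X).any(axis=1)]
    for i, row in enumerate(X):
        missing = np.isnan(row)
        if not missing.any():
            continue
        # Euclidean distance, measured only on the features this row has.
        diffs = complete[:, ~missing] - row[~missing]
        dists = np.sqrt((diffs ** 2).sum(axis=1))
        nearest = complete[np.argsort(dists)[:k]]
        # Average the neighbours' values for the missing features.
        filled[i, missing] = nearest[:, missing].mean(axis=0)
    return filled

# The last row is close to the first two rows and far from the third.
X = [[1.0, 2.0], [1.1, 2.2], [8.0, 9.0], [1.05, np.nan]]
result = knn_impute(X, k=2)
```

Here the missing entry is filled from the two rows nearest on the first feature, so the distant third row does not influence it.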
  10. Identifying Incomplete Rows

    Which code snippet identifies rows in a Pandas DataFrame 'df' that contain any missing values?

    1. df[df.isnull().any(axis=1)]
    2. df[df.notnull().all(axis=0)]
    3. df.fillna(df.mean())
    4. df[df.empty()]
    5. df.remove(nan=True)
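A short sketch of selecting incomplete rows with a row-wise boolean mask. The DataFrame is made up; `any(axis=1)` reduces across columns, giving one True/False per row.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", None]})

# True for each row that has at least one missing value.
row_has_missing = df.isnull().any(axis=1)
incomplete = df[row_has_missing]
```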