Model Metrics Mastery Quiz: Evaluating Machine Learning Performance — Questions & Answers

This quiz contains 10 questions. Below is a complete reference of all questions, answer choices, and correct answers. You can use this section to review after taking the interactive quiz above.

  1. Question 1: Understanding Accuracy

    In a binary classification problem, what does the accuracy metric measure?

    • The proportion of correct predictions to total predictions
    • The percentage of false positives among all positive predictions
    • The ratio of true negatives to false negatives
    • The number of true positives divided by total samples
    • The sum of recall and precision

    Correct answer: The proportion of correct predictions to total predictions
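
    To see this concretely, here is a minimal sketch using scikit-learn's accuracy_score; the labels are made up for illustration.

      from sklearn.metrics import accuracy_score

      # Illustrative labels: 1 = positive class, 0 = negative class
      y_true = [1, 0, 1, 1, 0, 0, 1, 0]
      y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

      # accuracy = correct predictions / total predictions
      print(accuracy_score(y_true, y_pred))  # 6 of 8 correct -> 0.75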

  2. Question 2: Precision vs. Recall

    If a medical test for a rare disease has high recall but low precision, what does this indicate about the test?

    • It detects most actual disease cases, but also gives many false alarms
    • It rarely detects the disease but is always correct when it does
    • It never produces false positives
    • It has more true negatives than true positives
    • It always has a higher F1-score than accuracy

    Correct answer: It detects most actual disease cases, but also gives many false alarms
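
    A minimal sketch of this trade-off, assuming an invented rare-disease screen that flags nearly everyone:

      from sklearn.metrics import precision_score, recall_score

      # Hypothetical labels: 1 = diseased, 0 = healthy.
      y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
      # The test flags six patients: both real cases plus four false alarms.
      y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

      print(recall_score(y_true, y_pred))     # 2/2 = 1.0  (all cases caught)
      print(precision_score(y_true, y_pred))  # 2/6 ~ 0.33 (many false alarms)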

  3. Question 3: True Negatives in Confusion Matrix

    In scikit-learn's confusion matrix, where rows are true labels and columns are predicted labels ordered from the negative class (0) to the positive class (1), which cell corresponds to true negatives?

    • The lower-right cell
    • The bottom-left cell
    • The top-right cell
    • The lower-middle cell
    • The upper-left cell

    Correct answer: The upper-left cell
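
    You can confirm the layout with scikit-learn's confusion_matrix on toy labels:

      from sklearn.metrics import confusion_matrix

      # Toy labels: 0 = negative, 1 = positive
      y_true = [0, 0, 0, 1, 1, 0]
      y_pred = [0, 0, 1, 1, 0, 0]

      # scikit-learn sorts classes ascending, so the 2x2 matrix reads:
      # [[TN, FP],
      #  [FN, TP]]
      print(confusion_matrix(y_true, y_pred))
      # [[3 1]
      #  [1 1]]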

  4. Question 4: Evaluating F1-Score

    Why might you prefer the F1-score over accuracy when evaluating a model on an imbalanced dataset?

    • F1-score balances both precision and recall, whereas accuracy can be misleading if classes are imbalanced
    • F1-score is only useful when there are no true negatives
    • Accuracy penalizes false positives more than false negatives
    • F1-score is easier to interpret than recall
    • F1-score is always higher than accuracy

    Correct answer: F1-score balances both precision and recall, whereas accuracy can be misleading if classes are imbalanced
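
    A minimal sketch of why accuracy misleads on imbalanced data, using an invented 95/5 class split:

      from sklearn.metrics import accuracy_score, f1_score

      y_true = [0] * 95 + [1] * 5
      # A degenerate model that always predicts the majority class:
      y_pred = [0] * 100

      print(accuracy_score(y_true, y_pred))             # 0.95 -- looks strong
      print(f1_score(y_true, y_pred, zero_division=0))  # 0.0  -- exposes the failure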

  5. Question 5: Interpreting ROC-AUC

    What does a ROC-AUC score of 0.5 indicate about a classifier’s performance?

    • The classifier performs no better than random chance
    • The classifier is perfect
    • The classifier has high precision but low recall
    • The classifier predicts all classes correctly
    • The classifier has maximized the F1-score

    Correct answer: The classifier performs no better than random chance
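
    You can verify this empirically by scoring random labels with uninformative scores; the exact value drifts slightly around 0.5:

      import numpy as np
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(0)
      y_true = rng.integers(0, 2, size=10_000)  # random binary labels
      y_score = rng.random(10_000)              # scores unrelated to the labels

      print(roc_auc_score(y_true, y_score))     # ~0.5: no better than chance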

  6. Question 6: Precision Calculation

    Given 80 true positives, 20 false positives, and 100 false negatives, what is the precision?

    • 0.80
    • 0.44
    • 0.20
    • 0.67
    • 0.50

    Correct answer: 0.80
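
    Worked out in code: the 100 false negatives play no role in precision, but they are exactly what produces the 0.44 distractor, which is the recall.

      tp, fp, fn = 80, 20, 100

      precision = tp / (tp + fp)  # 80 / 100 = 0.80
      recall = tp / (tp + fn)     # 80 / 180 ~ 0.44 (the distractor option)

      print(precision, recall)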

  7. Question 7: Model Selection with Metrics

    You have two models: Model A with higher accuracy but lower recall, and Model B with slightly lower accuracy but much higher recall. When would Model B be preferred?

    • When missing positive cases is more costly than having false alarms
    • If true negatives are the most important
    • When the classes are perfectly balanced
    • If precision is the only metric to optimize
    • When Model A has a better ROC-AUC

    Correct answer: When missing positive cases is more costly than having false alarms
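
    One way to encode "missed positives are costlier" as a single metric is scikit-learn's fbeta_score with beta > 1, which weights recall more heavily than precision; the labels and model outputs below are invented to mirror the scenario.

      from sklearn.metrics import accuracy_score, fbeta_score, recall_score

      y_true  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
      model_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # higher accuracy, low recall
      model_b = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]  # lower accuracy, full recall

      for name, y_pred in [("A", model_a), ("B", model_b)]:
          # beta=2 counts recall twice as heavily as precision.
          print(name,
                accuracy_score(y_true, y_pred),       # A: 0.7,  B: 0.6
                recall_score(y_true, y_pred),         # A: 0.25, B: 1.0
                fbeta_score(y_true, y_pred, beta=2))  # A: ~0.29, B: ~0.83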

  8. Question 8: Averaging in Multi-class Problems

    Which method of averaging precision, recall, and F1-score treats all samples equally regardless of class size?

    • Macro-average
    • Micro-average
    • Weighted-average
    • Harmonic mean
    • Geometric mean

    Correct answer: Micro-average
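
    The difference is easiest to see on an imbalanced multi-class toy example: micro-averaging pools all samples, so the large class dominates, while macro-averaging gives each class equal weight.

      from sklearn.metrics import f1_score

      # Invented three-class labels with a heavily dominant class 0.
      y_true = [0] * 8 + [1] * 1 + [2] * 1
      y_pred = [0] * 10  # always predicts the majority class

      # Micro pools all samples, so per-sample correctness drives the score.
      print(f1_score(y_true, y_pred, average="micro"))                   # 0.8
      # Macro averages per-class F1, so the two missed classes drag it down.
      print(f1_score(y_true, y_pred, average="macro", zero_division=0))  # ~0.3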

  9. Question 9: Metric Implementation in Code

    Which of the following scikit-learn functions can you use to compute the F1-score for a binary classification problem in Python?

    • f1_score(y_true, y_pred)
    • accuracy(y_true, y_pred)
    • roc_auc_score(y_pred, y_true)
    • calculate_f1(y_pred, y_true)
    • recall_score(y_true, y_pred)

    Correct answer: f1_score(y_true, y_pred)
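
    A short usage note: scikit-learn metric functions take (y_true, y_pred) in that order, which matters for asymmetric metrics such as precision and recall. The labels here are illustrative.

      from sklearn.metrics import f1_score

      y_true = [0, 1, 1, 0, 1]
      y_pred = [0, 1, 0, 0, 1]

      # precision = 2/2 = 1.0, recall = 2/3, so F1 = 2PR / (P + R) = 0.8
      print(f1_score(y_true, y_pred))  # 0.8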

  10. Question 10: Ready for Production?

    A model deployed to production must meet certain metric thresholds on unseen data. Why is relying solely on training metrics a bad idea?

    • Training metrics can be overly optimistic and may not reflect true performance on unseen data
    • Training metrics are always lower than test metrics
    • Test metrics are ignored by most organizations
    • Unseen data has no impact on production decisions
    • Training metrics measure only recall

    Correct answer: Training metrics can be overly optimistic and may not reflect true performance on unseen data
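
    A minimal sketch of the train/test gap, using an unconstrained decision tree (which memorizes its training set) on synthetic data:

      from sklearn.datasets import make_classification
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier
      from sklearn.metrics import accuracy_score

      X, y = make_classification(n_samples=500, random_state=0)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                random_state=0)

      # A fully grown tree fits the training data perfectly.
      model = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

      print(accuracy_score(y_tr, model.predict(X_tr)))  # 1.0 on training data
      print(accuracy_score(y_te, model.predict(X_te)))  # noticeably lower held out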