Explore key concepts of linear and logistic regression with practical questions designed to strengthen your understanding of model assumptions, variable interpretation, and real-world applications. This quiz helps learners distinguish between regression types, interpret predictors and statistical outputs, and avoid common misconceptions.
When predicting whether a student will pass or fail an exam based on study hours, which type of regression should be used?
Explanation: Logistic regression is appropriate when the outcome is binary, such as pass or fail. Linear regression is suited for predicting continuous numerical values, not classifications. Polynomial regression models nonlinear relationships but is not used for classification problems. Stepwise regression is a variable selection method, not a classification technique.
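For concreteness, here is a minimal sketch of fitting a logistic regression to a binary pass/fail outcome using scikit-learn; the study-hour data are hypothetical and purely illustrative.

```python
# Minimal sketch: logistic regression on a binary pass/fail outcome.
# The study-hour values below are hypothetical, for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])   # hours studied
passed = np.array([0, 0, 0, 1, 0, 1, 1, 1])                  # 1 = pass, 0 = fail

model = LogisticRegression().fit(hours, passed)

# Predicted probability of passing for a student who studies 4.5 hours.
print(model.predict_proba([[4.5]])[0, 1])
```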
Which of the following is an assumption of linear regression when modeling house prices based on square footage?
Explanation: Homoscedasticity, the assumption that the errors have constant variance across all values of the predictor, is a key assumption of linear regression. Multicollinearity refers to highly correlated predictors and is a condition to be avoided, not assumed. Multinomial outcomes require other types of models, not linear regression. A non-binary response is a property of the outcome, not an assumption.
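As an informal illustration, the sketch below fits a linear regression on simulated square-footage and price data (the numbers are made up) and compares residual spread across the range of the predictor; roughly constant spread is consistent with homoscedasticity.

```python
# Minimal sketch: informal homoscedasticity check on simulated house-price data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, size=200).reshape(-1, 1)
price = 50_000 + 120 * sqft.ravel() + rng.normal(0, 10_000, size=200)

model = LinearRegression().fit(sqft, price)
residuals = price - model.predict(sqft)

# Under homoscedasticity, residual variance should be roughly the same
# for small and large houses; here we compare the two halves of the data.
order = np.argsort(sqft.ravel())
print(residuals[order[:100]].var(), residuals[order[100:]].var())
```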
In logistic regression, what does the coefficient for an independent variable represent?
Explanation: Coefficients in logistic regression describe how the log odds of the outcome change with each one-unit increase in the independent variable. They do not represent direct changes in probability, which require a further transformation to obtain. The mean-difference option pertains to t-tests rather than logistic regression. A direct effect expressed in percent is likewise not the interpretation of the raw coefficient.
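To see what this means numerically, the short sketch below assumes a hypothetical coefficient of 0.8 for study hours and converts it to an odds ratio.

```python
# Minimal sketch: interpreting a (hypothetical) logistic regression coefficient.
import numpy as np

coef = 0.8                 # hypothetical coefficient for study hours
odds_ratio = np.exp(coef)  # each extra hour multiplies the odds by exp(coef)

# The coefficient is the change in log odds per additional hour;
# exp(0.8) is about 2.23, so the odds of passing roughly double per extra hour.
print(odds_ratio)
```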
Which metric is commonly used to assess the goodness-of-fit in a linear regression model predicting car prices?
Explanation: R-squared quantifies the proportion of variance in the dependent variable that the model explains. The Gini coefficient is used for inequality measurement and classification performance, not regression fit. Confusion matrices and the F-measure apply to classification tasks, not regression models.
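The sketch below shows one common way to compute R-squared with scikit-learn; the simulated car-price data and coefficients are illustrative only.

```python
# Minimal sketch: R-squared for a linear regression on simulated car-price data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
mileage = rng.uniform(10_000, 150_000, size=80).reshape(-1, 1)
price = 30_000 - 0.12 * mileage.ravel() + rng.normal(0, 2_000, size=80)

model = LinearRegression().fit(mileage, price)

# Proportion of variance in price explained by the model.
print(r2_score(price, model.predict(mileage)))
```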
If a categorical variable with three categories is to be used in a linear regression model, what must be done before including it?
Explanation: Linear regression requires numeric predictors, so categorical variables must be converted to dummy (indicator) variables before modeling. Ignoring the variable leads to information loss. Summing the categories is inappropriate because it imposes an order that does not exist. Entering the variable as raw text will cause errors, since the model cannot process text data.
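As a quick illustration, the sketch below applies pandas.get_dummies to a hypothetical three-category neighborhood variable before the data would be passed to a linear regression.

```python
# Minimal sketch: dummy (indicator) encoding of a hypothetical three-category variable.
import pandas as pd

df = pd.DataFrame({
    "sqft": [1200, 1500, 900, 2000],
    "neighborhood": ["north", "south", "east", "north"],  # three categories
})

# drop_first=True keeps two dummy columns and avoids perfect multicollinearity.
X = pd.get_dummies(df, columns=["neighborhood"], drop_first=True)
print(X)
```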