Evaluating Model Performance 1
Important
A model that is overfit or underfit is a bad predictor of outcomes outside of the data set and should not be used. In the field of data science, models tend to be overfit, so model selection techniques focus on choosing the least complex model that captures the general trend.
|   | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.10 | 18.70 | 181.00 | 3,750.00 | male | 2007 |
| 1 | Adelie | Torgersen | 39.50 | 17.40 | 186.00 | 3,800.00 | female | 2007 |
| 2 | Adelie | Torgersen | 40.30 | 18.00 | 195.00 | 3,250.00 | female | 2007 |
| 4 | Adelie | Torgersen | 36.70 | 19.30 | 193.00 | 3,450.00 | female | 2007 |
| 5 | Adelie | Torgersen | 39.30 | 20.60 | 190.00 | 3,650.00 | male | 2007 |
R-squared, \(R^2\): Percentage of variability in the outcome explained by the regression model (in simple linear regression, SLR, by the single predictor)
\[ R^2 = \frac{\text{variation explained by regression}}{\text{total variation in the data}} = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} \\ R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2} \]
Root mean square error, RMSE: A measure of the average error (average difference between observed and predicted values of the outcome)
\[ RMSE = \sqrt{\frac{\sum_{i = 1}^n (y_i - \hat{y}_i)^2}{n}} \]
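As a rough illustration of both formulas, the sketch below fits a simple linear regression of body mass on bill length using only the five rows shown in the table above, then computes \(R^2\) both ways and the RMSE. The variable names are illustrative, and five rows are far too few for a meaningful fit, so the values will not match results from the full data set.

```python
import math

# Bill length (mm) and body mass (g) from the five displayed rows only.
bill_length = [39.1, 39.5, 40.3, 36.7, 39.3]
body_mass = [3750.0, 3800.0, 3250.0, 3450.0, 3650.0]

n = len(body_mass)
x_bar = sum(bill_length) / n
y_bar = sum(body_mass) / n

# Least-squares slope and intercept for simple linear regression.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(bill_length, body_mass))
s_xx = sum((x - x_bar) ** 2 for x in bill_length)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * x for x in bill_length]

# Sums of squares that appear in the formulas.
ss_total = sum((y - y_bar) ** 2 for y in body_mass)
ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
ss_residual = sum((y - yh) ** 2 for y, yh in zip(body_mass, y_hat))

# The two forms of R^2 agree for a least-squares fit with an intercept.
r2_explained = ss_regression / ss_total
r2_residual = 1 - ss_residual / ss_total

# RMSE is in the same units as the outcome (grams here).
rmse = math.sqrt(ss_residual / n)

print(f"R^2 (explained / total): {r2_explained:.4f}")
print(f"R^2 (1 - residual / total): {r2_residual:.4f}")
print(f"RMSE: {rmse:.1f} g")
```

The two \(R^2\) values coincide here because the model was fit by least squares on the same data it is evaluated on; on new data the two forms can differ.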
What indicates a good model fit? Higher or lower \(R^2\)? Higher or lower RMSE?
model.score(X, y)
The \(R^2\) of the model for predicting penguin mass from bill length is 25%. Which of the following is the correct interpretation of this value?
RMSE ranges between 0 (perfect predictor) and infinity (terrible predictor)
Same units as the outcome variable
Calculate as the square root of mean_squared_error(y_true, y_pred)
The value of RMSE is not very meaningful on its own, but it’s useful for comparing across models.
For example, compare a model that uses bill length as the predictor with one that uses flipper length.
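A minimal sketch of such a comparison, assuming only the five penguin rows displayed earlier and a hypothetical helper named `slr_rmse` (not a library function). It fits one simple linear regression per predictor and reports each model's RMSE in grams. With so few rows the result says nothing reliable about the full data set; it only shows the mechanics.

```python
import math

def slr_rmse(x, y):
    """Fit y = b0 + b1 * x by least squares and return the RMSE of the fit."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    squared_errors = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]
    return math.sqrt(sum(squared_errors) / n)

# Values copied from the five displayed rows only.
bill_length = [39.1, 39.5, 40.3, 36.7, 39.3]        # mm
flipper_length = [181.0, 186.0, 195.0, 193.0, 190.0]  # mm
body_mass = [3750.0, 3800.0, 3250.0, 3450.0, 3650.0]  # g

rmse_bill = slr_rmse(bill_length, body_mass)
rmse_flipper = slr_rmse(flipper_length, body_mass)

print(f"RMSE using bill length:    {rmse_bill:.1f} g")
print(f"RMSE using flipper length: {rmse_flipper:.1f} g")

# The model with the lower RMSE fits these rows more closely.
better = "flipper length" if rmse_flipper < rmse_bill else "bill length"
print(f"Lower RMSE on these rows: {better}")
```

Because both models predict the same outcome, their RMSE values share the same units and can be compared directly.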
|   | Positive (predicted) | Negative (predicted) |
|---|---|---|
| Positive (actual) | 170 | 21 |
| Negative (actual) | 1 | 377 |
What are the accuracy, precision, and recall?
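One way to check an answer is to compute the three metrics directly from the four cells of the confusion matrix. The sketch below uses the standard definitions (accuracy is the share of all predictions that are correct, precision is the share of predicted positives that are truly positive, and recall is the share of actual positives that are predicted positive); the variable names are illustrative.

```python
# Cell counts read from the confusion matrix above.
tp = 170  # actual positive, predicted positive
fn = 21   # actual positive, predicted negative
fp = 1    # actual negative, predicted positive
tn = 377  # actual negative, predicted negative

total = tp + fn + fp + tn

accuracy = (tp + tn) / total   # correct predictions out of all predictions
precision = tp / (tp + fp)     # true positives out of predicted positives
recall = tp / (tp + fn)        # true positives out of actual positives

print(f"Accuracy:  {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
```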