In this article, we'll take a look at model evaluation and the metrics used to measure how well a machine learning model performs.
9. MODEL EVALUATION AND METRICS
Model evaluation helps us check how well our machine learning models are performing. We use different metrics depending on whether it’s a classification or regression problem.
9.1) Classification Metrics
Used when your model predicts categories or classes (e.g., spam or not spam).
9.1.1) Accuracy
How often is the model correct? Formula: (Correct Predictions) / (Total Predictions)
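As a quick illustration, here is a minimal sketch of the accuracy formula, assuming scikit-learn is installed; the labels below are made up (1 = spam, 0 = not spam).

```python
from sklearn.metrics import accuracy_score

# Made-up ground-truth labels and model predictions
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Accuracy = correct predictions / total predictions
manual = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(manual)                           # 0.75
print(accuracy_score(y_true, y_pred))   # 0.75
```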
9.1.2) Precision
Out of all predicted positives, how many were actually positive? Used when false positives are costly. Formula: TP / (TP + FP)
9.1.3) Recall (Sensitivity)
Out of all actual positives, how many were predicted correctly? Used when missing positives is costly. Formula: TP / (TP + FN)
9.1.4) F1-Score
The harmonic mean of precision and recall, giving a single score that balances the two. Formula: 2 * (Precision * Recall) / (Precision + Recall)
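A minimal sketch computing precision, recall, and F1 together, assuming scikit-learn is installed; the labels are made up, and the last line re-derives F1 from the formula above.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Precision = TP / (TP + FP), Recall = TP / (TP + FN)
p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)

print(p)                          # 0.75  (3 TP, 1 FP)
print(r)                          # 0.75  (3 TP, 1 FN)
print(f1_score(y_true, y_pred))   # 0.75
print(2 * (p * r) / (p + r))      # same value, computed from the formula
```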
9.1.5) Confusion Matrix
A table showing True Positives, False Positives, False Negatives, and True Negatives.
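For instance, scikit-learn's confusion_matrix lays these four counts out as a table (rows are actual classes, columns are predicted classes); the labels below are the same made-up example.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows = actual class (0, 1), columns = predicted class (0, 1)
print(confusion_matrix(y_true, y_pred))
# [[3 1]    <- 3 TN, 1 FP
#  [1 3]]   <- 1 FN, 3 TP
```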
9.1.6) ROC Curve (Receiver Operating Characteristic)
Shows the trade-off between True Positive Rate and False Positive Rate as the classification threshold changes.
9.1.7) AUC (Area Under the Curve)
Measures the entire area under the ROC curve. Higher AUC = better performance.
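A minimal sketch of both, assuming scikit-learn: roc_curve needs predicted probabilities rather than hard labels, and roc_auc_score summarizes the curve as a single number. The scores below are made up.

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [1, 0, 1, 1, 0, 1, 0, 0]
# Made-up predicted probabilities of the positive class
y_scores = [0.9, 0.2, 0.4, 0.8, 0.3, 0.7, 0.6, 0.1]

# One (FPR, TPR) point per threshold
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(list(zip(fpr, tpr)))

# Area under that curve: 0.5 = random guessing, 1.0 = perfect ranking
print(roc_auc_score(y_true, y_scores))   # 0.9375
```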
9.2) Regression Metrics
Used when the model predicts continuous values (like house price, temperature).
9.2.1) Mean Absolute Error (MAE)
Average of the absolute errors. Easy to understand.
9.2.2) Mean Squared Error (MSE)
Average of squared errors. Penalizes large errors more than MAE.
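A short sketch covering both MAE and MSE, assuming scikit-learn; the actual and predicted values are made-up numbers (think prices in thousands).

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up actual vs. predicted values
y_true = [200, 150, 320, 410]
y_pred = [210, 140, 300, 450]

print(mean_absolute_error(y_true, y_pred))  # (10 + 10 + 20 + 40) / 4 = 20.0
print(mean_squared_error(y_true, y_pred))   # (100 + 100 + 400 + 1600) / 4 = 550.0
```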
9.2.3) R-Squared (R²)
Measures how much of the variance in the output is explained by the model. Typically ranges from 0 to 1 (higher is better), though it can be negative for very poor fits.
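Continuing the same made-up numbers, r2_score returns R² directly (a sketch, assuming scikit-learn).

```python
from sklearn.metrics import r2_score

y_true = [200, 150, 320, 410]
y_pred = [210, 140, 300, 450]

# Fraction of variance in y_true explained by the predictions
print(r2_score(y_true, y_pred))   # ~0.947
```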
9.2.4) Adjusted R-Squared
Like R², but adjusts for the number of predictors (features). Useful when comparing models with different numbers of features.
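scikit-learn does not ship an adjusted R² function, but it is a one-line formula on top of R²; here is a sketch where the sample values, the sample count n, and the feature count p all belong to a made-up, hypothetical model.

```python
from sklearn.metrics import r2_score

y_true = [200, 150, 320, 410, 280, 190]
y_pred = [210, 140, 300, 450, 260, 200]

r2 = r2_score(y_true, y_pred)
n = len(y_true)   # number of samples
p = 2             # number of features used by the (hypothetical) model

# Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(r2, adj_r2)
```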
9.3) Cross-Validation
Used to test model performance on different splits of the data.
9.3.1) K-Fold Cross-Validation
Split data into k equal parts. Train on k-1 and test on the remaining part. Repeat k times.
9.3.2) Leave-One-Out Cross-Validation (LOOCV)
A special case of K-Fold where k = number of data points. Very slow but thorough.
9.3.3) Stratified K-Fold
Same as K-Fold, but keeps the ratio of classes the same in each fold. Useful for imbalanced datasets.
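A minimal sketch of all three schemes using scikit-learn's cross_val_score; the iris dataset and the logistic regression model are just placeholders.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    cross_val_score, KFold, LeaveOneOut, StratifiedKFold
)

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# K-Fold: 5 equal parts, train on 4, test on 1, repeat 5 times
kf = KFold(n_splits=5, shuffle=True, random_state=42)
print(cross_val_score(model, X, y, cv=kf).mean())

# Stratified K-Fold: same idea, but each fold keeps the class ratio
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
print(cross_val_score(model, X, y, cv=skf).mean())

# LOOCV: one fold per data point -- thorough but slow on large datasets
print(cross_val_score(model, X, y, cv=LeaveOneOut()).mean())
```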
9.4) Hyperparameter Tuning
Hyperparameters are settings that control how a model learns (like learning rate, depth of a tree, etc.).
9.4.1) Grid Search
Tests all combinations of given hyperparameter values.
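A minimal GridSearchCV sketch, assuming scikit-learn; the decision tree and its parameter grid are only an example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Every combination of these values is tried: 3 x 2 = 6 candidates
param_grid = {"max_depth": [2, 4, 6], "min_samples_split": [2, 10]}

grid = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```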
9.4.2) Random Search
Randomly samples a fixed number of combinations from the search space. Usually faster than Grid Search, especially with many hyperparameters.
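The same idea with RandomizedSearchCV, which tries a fixed number (n_iter) of random combinations instead of the full grid (a sketch, assuming scikit-learn).

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Instead of trying everything, sample n_iter random combinations
param_dist = {
    "max_depth": list(range(2, 11)),
    "min_samples_split": list(range(2, 21)),
}

search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=10,
    cv=5,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```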
9.4.3) Bayesian Optimization
Uses past results to pick the next best combination. Smart and efficient.
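As one possible sketch, assuming the third-party Optuna library is installed (the choice of library is an assumption, not something prescribed above): each finished trial informs which hyperparameter values are suggested next.

```python
import optuna
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

def objective(trial):
    # Optuna picks the next values to try based on previous trial results
    max_depth = trial.suggest_int("max_depth", 2, 10)
    min_samples_split = trial.suggest_int("min_samples_split", 2, 20)
    model = DecisionTreeClassifier(
        max_depth=max_depth, min_samples_split=min_samples_split, random_state=0
    )
    return cross_val_score(model, X, y, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params, study.best_value)
```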
