Python - Machine Learning Part 4: Model Evaluation and Optimization
Model evaluation estimates how well a machine learning model will perform on unseen data. Train/test splits, cross-validation, and confusion matrices measure that performance, and hyperparameter tuning uses those measurements to improve the model.
Examples:
Train/Test Split
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
# Data
X = [[1], [2], [3], [4]]
y = [0, 0, 1, 1]
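# Split (stratify so both classes appear in each half; fix the seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)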
# Model
model = LogisticRegression()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))
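Explanation: The model is trained on one half of the data and scored on the other half, which it never saw during training. Held-out evaluation does not prevent overfitting, but it reveals it: an overfit model scores well on the training data and poorly on the test data. With only four samples the score here is illustrative, not a reliable estimate.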
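To make the role of the test set more concrete, the sketch below compares training accuracy with test accuracy for the same model. It is only an illustration: the synthetic dataset from make_classification and the choice of an unconstrained decision tree are assumptions, not part of the example above. A large gap between the two scores is the usual sign of overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (an assumption for illustration only)
X, y = make_classification(n_samples=200, n_features=10, flip_y=0.2, random_state=0)

# Hold out 30% of the samples; stratify keeps the class balance in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A tree with no depth limit can memorize the training set
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)  # accuracy on data the model has seen
test_acc = model.score(X_test, y_test)     # accuracy on held-out data
print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
```

The training score alone would suggest a near-perfect model; the test score is the one that says something about unseen data.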
Cross-Validation
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
# Data
X = [[1], [2], [3], [4]]
y = [0, 0, 1, 1]
# Model
model = DecisionTreeClassifier(random_state=0)
# 2-fold cross-validation: returns one accuracy score per fold
scores = cross_val_score(model, X, y, cv=2)
print(scores)
print(scores.mean())
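Explanation: Cross-validation splits the data into k folds (here cv=2), trains on k-1 of them, and scores on the remaining fold, repeating until each fold has served as the test set once. The result is one score per fold, and their mean is a more robust estimate than a single train/test split.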
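Cross-validation is also the usual basis for hyperparameter tuning: each candidate setting is scored with cross-validation, and the best-scoring setting is kept. The following is a minimal sketch using scikit-learn's GridSearchCV. The synthetic dataset and the candidate values in param_grid are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (an assumption for illustration only)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical search grid; the candidate values are arbitrary
param_grid = {
    "max_depth": [1, 2, 4, 8],
    "min_samples_leaf": [1, 5],
}

# Every combination in the grid is scored with 5-fold cross-validation
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)  # combination with the highest mean CV accuracy
print(search.best_score_)   # that mean CV accuracy
```

Because the best score was used to pick the parameters, it tends to be optimistic. A separate test set that took no part in the search gives a fairer final estimate.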
Confusion Matrix
from sklearn.metrics import confusion_matrix
y_true = [0, 1, 0, 1]
y_pred = [0, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))
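Explanation: The confusion matrix counts predictions against actual outcomes. Rows are the true classes and columns are the predicted classes, so the diagonal holds correct predictions and the off-diagonal cells hold errors. Here every cell is 1: one true negative, one false positive, one false negative, and one true positive.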
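The four cells of a binary confusion matrix are enough to derive the common summary metrics. The sketch below reuses the y_true and y_pred lists from the example and computes accuracy, precision, and recall by hand, then cross-checks them against scikit-learn's precision_score and recall_score. The variable names are illustrative.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [0, 1, 0, 1]
y_pred = [0, 1, 1, 0]

# For binary labels, ravel() flattens the 2x2 matrix row by row:
# true negatives, false positives, false negatives, true positives
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all predictions that are correct
precision = tp / (tp + fp)                  # share of predicted positives that are correct
recall = tp / (tp + fn)                     # share of actual positives that were found

print(accuracy, precision, recall)
# Cross-check against the library implementations
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))
```

Accuracy alone can hide which kind of error a model makes; precision and recall separate false positives from false negatives.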