Python - Machine Learning Part 4: Model Evaluation and Optimization

Model evaluation estimates how well a machine learning model will perform on unseen data. Techniques such as train/test splits, cross-validation, and confusion matrices support this process, and hyperparameter tuning uses the same evaluation scores to choose better model settings.

Examples:

Train/Test Split

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression

# Data

X = [[1], [2], [3], [4]]

y = [0, 0, 1, 1]

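# Split (stratified so each half contains both classes; fixed seed for reproducibility)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)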

# Model

model = LogisticRegression()

model.fit(X_train, y_train)

print(model.score(X_test, y_test))

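Explanation: Scoring the model on held-out test data estimates how well it generalizes. A test score well below the training score is a sign of overfitting; the split detects overfitting but does not prevent it.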
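
To make the overfitting check concrete, the sketch below compares training accuracy with test accuracy. It is only an illustration: the synthetic dataset from make_classification, its size, the 25% test fraction, and the choice of an unpruned decision tree are all assumptions, not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumed for illustration): 200 samples, 5 features
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out 25% of the rows, keeping class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# An unpruned tree can memorize the training set
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)

print("train accuracy:", train_acc)
print("test accuracy:", test_acc)
```

A large gap between the two numbers suggests the model has memorized the training rows rather than learned a pattern that generalizes.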

Cross-Validation

from sklearn.model_selection import cross_val_score

from sklearn.tree import DecisionTreeClassifier

# Data

X = [[1], [2], [3], [4]]

y = [0, 0, 1, 1]

# Model

model = DecisionTreeClassifier()

scores = cross_val_score(model, X, y, cv=2)

print(scores)

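Explanation: Cross-validation splits the data into k folds, trains on k-1 of them, and scores on the remaining fold, rotating until every fold has served as the test set. With cv=2 this yields two scores, which together give a more robust estimate than a single train/test split.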
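
Hyperparameter tuning, mentioned in the introduction, usually builds on cross-validation: each candidate setting is scored by cross-validation and the best one is kept. A minimal sketch using scikit-learn's GridSearchCV follows. The synthetic data, the max_depth candidates, and cv=5 are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Candidate values for one hyperparameter (an arbitrary example grid)
param_grid = {"max_depth": [1, 2, 3, 5, None]}

# Each candidate is scored with 5-fold cross-validation
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean cross-validated accuracy of the best candidate

print(best_params)
print(round(best_score, 3))
```

Because best_score_ was used to select the setting, it tends to be optimistic; a separate held-out test set gives a fairer final estimate.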

Confusion Matrix

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1]

y_pred = [0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))

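Explanation: The confusion matrix counts predictions against actual outcomes. In scikit-learn, rows are the true classes and columns are the predicted classes, so the diagonal holds correct predictions and the off-diagonal cells hold errors. Here each of the four cells is 1.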
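
The cells of a binary confusion matrix can be turned into summary metrics. The sketch below reuses the same y_true and y_pred and derives accuracy, precision, and recall by hand. The variable names are illustrative, and class 1 is assumed to be the positive class.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [0, 1, 0, 1]
y_pred = [0, 1, 1, 0]

# For binary labels, ravel() returns the cells in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that were found

print(accuracy, precision, recall)

# The hand-computed values agree with scikit-learn's own helpers
print(precision_score(y_true, y_pred), recall_score(y_true, y_pred))
```

For this data all three metrics come out to 0.5, since each cell of the matrix holds exactly one sample.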