Standard Deviation Part 3: Importance of Standard Deviation in Machine Learning
Why is Standard Deviation Important?
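Detecting Outliers: Data points that lie far from the mean, measured in standard deviations, may indicate anomalies.
Feature Scaling: Many ML algorithms work best with standardized input, which is obtained by subtracting the mean and dividing by the standard deviation.
Model Performance Analysis: The standard deviation of a model’s scores or errors across runs shows how consistent its predictions are.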
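The last two points can be sketched with NumPy. This is a minimal illustration, not a full preprocessing or evaluation pipeline: the feature values and accuracy scores below are hypothetical numbers made up for the example, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical feature values (illustrative only, not from a real dataset)
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])

# Standardize: subtract the mean, then divide by the standard deviation
standardized = (heights_cm - heights_cm.mean()) / heights_cm.std()

# After standardization the feature has mean ~0 and standard deviation ~1
print("Standardized:", standardized)
print("Mean:", standardized.mean(), "Std:", standardized.std())

# Hypothetical accuracy scores from five evaluation runs of one model
scores = np.array([0.81, 0.79, 0.80, 0.82, 0.78])

# A small standard deviation suggests the model behaves consistently across runs
print("Mean accuracy:", scores.mean(), "Std of accuracy:", scores.std())
```

Both calls to std() use NumPy's default, the population standard deviation. For a small sample of scores, passing ddof=1 gives the sample standard deviation instead.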
Example 4: Identifying Outliers Using Standard Deviation
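import numpy as np

data = [10, 12, 15, 14, 13, 110, 16, 18]  # 110 is an outlier
mean = np.mean(data)
std_dev = np.std(data)  # population standard deviation (ddof=0 by default)
# Flag values more than 2 standard deviations away from the mean
outliers = [x for x in data if abs(x - mean) > 2 * std_dev]
print("Outliers:", outliers)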
Output:
Outliers: [110]
Explanation:
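In this example, any value more than 2 standard deviations from the mean is treated as an outlier. The cutoff of 2 is a common rule of thumb, not a fixed standard.
Here, the mean is 26 and the standard deviation is about 31.8, so 110 lies about 2.6 standard deviations above the mean and is flagged.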
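To see how the choice of cutoff affects the result, the same rule can be wrapped in a small helper. This is a sketch only: find_outliers and its threshold parameter are hypothetical names introduced for illustration, not part of NumPy.

```python
import numpy as np

def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean.

    Hypothetical helper for illustration; uses the population standard deviation.
    """
    arr = np.asarray(values)
    mean = arr.mean()
    std_dev = arr.std()
    # Compare each absolute deviation from the mean against the cutoff
    mask = np.abs(arr - mean) > threshold * std_dev
    return arr[mask].tolist()

data = [10, 12, 15, 14, 13, 110, 16, 18]

print(find_outliers(data))                 # [110]
print(find_outliers(data, threshold=3.0))  # []
```

With a cutoff of 3, nothing is flagged, because 110 is only about 2.6 standard deviations from the mean. The outlier itself inflates both the mean and the standard deviation, so the threshold should be chosen with the data in mind.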