Hyperparameter Tuning: Grid Search vs Random Search
Learn how to find the best hyperparameters for your ML models using Grid Search and Random Search.
Your model has knobs you can adjust. Tuning them properly can dramatically improve performance.
Parameters vs Hyperparameters
Parameters: Learned from data during training
- Linear regression coefficients
- Neural network weights and biases
Hyperparameters: Set before training, control the learning process
- Learning rate
- Number of trees in random forest
- Regularization strength (C, alpha)
- Max depth of decision tree
You choose hyperparameters. The model learns parameters.
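To make the distinction concrete, here is a minimal sketch (the logistic regression and synthetic data are just assumed for illustration): C is a hyperparameter you set up front, while the coefficients in coef_ are parameters learned during fit().

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Hyperparameter: chosen by you, before training
model = LogisticRegression(C=0.5)

# Parameters: learned from the data during training
model.fit(X, y)
print(model.coef_)       # learned weights
print(model.intercept_)  # learned intercept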
Why Tune Hyperparameters?
Default values aren't optimal for your specific data:
# Default might give 78% accuracy
model = RandomForestClassifier()
# Tuned might give 89% accuracy!
model = RandomForestClassifier(n_estimators=200, max_depth=15, min_samples_split=5)
Method 1: Grid Search
Try ALL combinations of specified values.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
# Define parameter grid
param_grid = {
'n_estimators': [50, 100, 200],
'max_depth': [5, 10, 15, None],
'min_samples_split': [2, 5, 10]
}
# Total combinations: 3 × 4 × 3 = 36
model = RandomForestClassifier()
grid_search = GridSearchCV(
model,
param_grid,
cv=5, # 5-fold cross-validation
scoring='accuracy',
n_jobs=-1 # Use all CPU cores
)
grid_search.fit(X_train, y_train)
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.3f}")
# Use the best model
best_model = grid_search.best_estimator_
predictions = best_model.predict(X_test)
Grid Search Pros/Cons
✅ Exhaustive - tries everything
✅ Guaranteed to find best combo in the grid
❌ Cost grows exponentially with the number of parameters: 3 params × 10 values each = 10³ = 1,000 combos! (see the quick check after this list)
❌ Wastes time on bad regions
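You can count the grid up front before spending any compute. This quick check is a sketch using sklearn's ParameterGrid on the grid defined above; remember that each combination is also refit once per CV fold.

from sklearn.model_selection import ParameterGrid

# Count how many combinations GridSearchCV will evaluate (before the cv multiplier)
print(len(ParameterGrid(param_grid)))   # 36 for the grid above

# Three parameters with ten values each already explode to 1,000 combos
big_grid = {'a': list(range(10)), 'b': list(range(10)), 'c': list(range(10))}
print(len(ParameterGrid(big_grid)))     # 1000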
Method 2: Random Search
Sample random combinations from parameter distributions.
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint, uniform
# Define parameter distributions
param_dist = {
    'n_estimators': randint(50, 300),      # Integers in [50, 300) -- upper bound exclusive
    'max_depth': randint(3, 30),           # Integers in [3, 30)
    'min_samples_split': randint(2, 20),   # Integers in [2, 20)
    'min_samples_leaf': randint(1, 10),    # Integers in [1, 10)
    'max_features': uniform(0.1, 0.9)      # Floats in [0.1, 1.0)
}
model = RandomForestClassifier()
random_search = RandomizedSearchCV(
model,
param_dist,
n_iter=50, # Try 50 random combinations
cv=5,
scoring='accuracy',
n_jobs=-1,
random_state=42
)
random_search.fit(X_train, y_train)
print(f"Best parameters: {random_search.best_params_}")
print(f"Best CV score: {random_search.best_score_:.3f}")
Random Search Pros/Cons
✅ Explores more of the space with fewer evaluations
✅ Better for high-dimensional spaces
✅ Often finds good solutions faster
❌ Might miss the optimal if unlucky
❌ Not exhaustive
When to Use Which?
| Situation | Recommendation |
|---|---|
| Few hyperparameters (≤3) | Grid Search |
| Many hyperparameters (>3) | Random Search |
| Continuous parameters | Random Search |
| Quick exploration | Random Search |
| Final fine-tuning | Grid Search (narrow range) |
Practical Strategy
Step 1: Random Search (Broad)
# Explore a large range (shown here for gradient-boosting-style parameters)
param_dist = {
    'max_depth': randint(1, 50),
    'learning_rate': uniform(0.001, 0.5)
}
random_search.fit(X_train, y_train)
# Found: max_depth ≈ 10, learning_rate ≈ 0.1
Step 2: Grid Search (Narrow)
# Fine-tune around best values
param_grid = {
'max_depth': [8, 9, 10, 11, 12],
'learning_rate': [0.05, 0.08, 0.1, 0.12, 0.15]
}
grid_search.fit(X_train, y_train)
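One way to wire the two steps together is to read best_params_ from the broad random search and build the narrow grid around it programmatically. This is a sketch, assuming the parameters above belong to a gradient boosting model (e.g., GradientBoostingClassifier) and that random_search has already been fitted:

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Read the broad random search's winners...
best = random_search.best_params_

# ...and fine-tune in a tight neighborhood around them
param_grid = {
    'max_depth': [max(best['max_depth'] - 1, 1), best['max_depth'], best['max_depth'] + 1],
    'learning_rate': [round(best['learning_rate'] * f, 4) for f in (0.5, 0.8, 1.0, 1.2, 1.5)]
}

grid_search = GridSearchCV(GradientBoostingClassifier(), param_grid,
                           cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)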
Common Hyperparameters to Tune
Logistic Regression
param_grid = {
'C': [0.001, 0.01, 0.1, 1, 10, 100],
'penalty': ['l1', 'l2'],
'solver': ['liblinear', 'saga']
}
Random Forest
param_grid = {
'n_estimators': [100, 200, 300],
'max_depth': [10, 20, 30, None],
'min_samples_split': [2, 5, 10],
'min_samples_leaf': [1, 2, 4]
}
XGBoost
param_grid = {
'n_estimators': [100, 200, 300],
'max_depth': [3, 5, 7],
'learning_rate': [0.01, 0.1, 0.3],
'subsample': [0.8, 0.9, 1.0]
}
SVM
param_grid = {
'C': [0.1, 1, 10],
'kernel': ['rbf', 'linear', 'poly'],
'gamma': ['scale', 'auto', 0.1, 1]
}
Visualizing Results
import pandas as pd
import matplotlib.pyplot as plt
# Get results
results = pd.DataFrame(grid_search.cv_results_)
# Plot
plt.figure(figsize=(10, 6))
for depth in [5, 10, 15]:
    mask = results['param_max_depth'] == depth
    # Average over the other grid parameters so each n_estimators value appears once
    scores = results[mask].groupby('param_n_estimators')['mean_test_score'].mean()
    plt.plot(scores.index, scores.values, marker='o', label=f'max_depth={depth}')
plt.xlabel('n_estimators')
plt.ylabel('CV Score')
plt.legend()
plt.title('Hyperparameter Tuning Results')
plt.show()
Advanced: Bayesian Optimization
More efficient than grid/random search:
# pip install scikit-optimize
from skopt import BayesSearchCV
from skopt.space import Integer, Real

# BayesSearchCV expects skopt search spaces, not scipy distributions
search_spaces = {
    'n_estimators': Integer(50, 300),
    'max_depth': Integer(3, 30),
    'max_features': Real(0.1, 1.0)
}

search = BayesSearchCV(model, search_spaces, n_iter=50, cv=5)
search.fit(X_train, y_train)
Uses past evaluations to intelligently choose next points to try.
Key Tips
- Use cross-validation - Don't tune on a single train-test split
- Start broad, then narrow - Random search → Grid search
- Don't over-tune - Risk of overfitting to validation set
- Keep a holdout test set - Final evaluation on unseen data
- Set random_state - For reproducibility
Code Template
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint
# Include preprocessing in pipeline
pipeline = Pipeline([
('scaler', StandardScaler()),
('model', RandomForestClassifier())
])
# Use stepname__param for nested params
param_dist = {
'model__n_estimators': randint(50, 300),
'model__max_depth': randint(5, 30)
}
search = RandomizedSearchCV(pipeline, param_dist, n_iter=30, cv=5, n_jobs=-1, random_state=42)
search.fit(X_train, y_train)
# Final test
print(f"Test score: {search.score(X_test, y_test):.3f}")
Key Takeaway
Hyperparameter tuning is essential for good performance. Start with Random Search to explore, then Grid Search to refine. Always use cross-validation and keep a separate test set for final evaluation!