ML · 8 min read

Hyperparameter Tuning: Grid Search vs Random Search

Learn how to find the best hyperparameters for your ML models using Grid Search and Random Search.

Sarah Chen
December 19, 2025


Your model has knobs you can adjust. Tuning them properly can dramatically improve performance.

Parameters vs Hyperparameters

Parameters: Learned from data during training

  • Linear regression weights
  • Neural network weights

Hyperparameters: Set before training, control the learning process

  • Learning rate
  • Number of trees in random forest
  • Regularization strength (C, alpha)
  • Max depth of decision tree

You choose hyperparameters. The model learns parameters.
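
To make the distinction concrete, here is a minimal sketch using scikit-learn's LogisticRegression (the tiny dataset is made up purely for illustration): C is a hyperparameter you set before training, while coef_ and intercept_ hold the parameters learned from the data.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny made-up dataset, purely for illustration
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# Hyperparameter: chosen by you, before training
model = LogisticRegression(C=1.0)

# Parameters: learned from the data during fit()
model.fit(X, y)
print("Learned weights (parameters):", model.coef_, model.intercept_)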

Why Tune Hyperparameters?

Default values aren't optimal for your specific data:

from sklearn.ensemble import RandomForestClassifier

# Default settings might give 78% accuracy
model = RandomForestClassifier()

# Tuned settings might give 89% accuracy!
model = RandomForestClassifier(n_estimators=200, max_depth=15, min_samples_split=5)

Method 1: Grid Search

Try ALL combinations of specified values.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Define parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 15, None],
    'min_samples_split': [2, 5, 10]
}

# Total combinations: 3 × 4 × 3 = 36

model = RandomForestClassifier()
grid_search = GridSearchCV(
    model,
    param_grid,
    cv=5,              # 5-fold cross-validation
    scoring='accuracy',
    n_jobs=-1          # Use all CPU cores
)

grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.3f}")

# Use the best model
best_model = grid_search.best_estimator_
predictions = best_model.predict(X_test)

Grid Search Pros/Cons

✅ Exhaustive - tries everything
✅ Guaranteed to find best combo in the grid
❌ Cost grows exponentially with the number of parameters: 3 params × 10 values each = 10³ = 1,000 combos!
❌ Wastes time on bad regions
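
Before launching a grid search, it can help to count how many candidates the grid actually contains. A quick sketch using scikit-learn's ParameterGrid and the param_grid defined above:

from sklearn.model_selection import ParameterGrid

# Number of candidate combinations the grid search will evaluate
n_candidates = len(ParameterGrid(param_grid))
print(n_candidates)          # 3 × 4 × 3 = 36

# Each candidate is refit once per CV fold, so with cv=5:
print(n_candidates * 5)      # 180 model fits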

Method 2: Random Search

Sample random combinations from parameter distributions.

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint, uniform

# Define parameter distributions
param_dist = {
    'n_estimators': randint(50, 300),        # random integer in [50, 300)
    'max_depth': randint(3, 30),             # random integer in [3, 30)
    'min_samples_split': randint(2, 20),     # random integer in [2, 20)
    'min_samples_leaf': randint(1, 10),      # random integer in [1, 10)
    'max_features': uniform(0.1, 0.9)        # random float in [0.1, 1.0] (loc=0.1, scale=0.9)
}

model = RandomForestClassifier()
random_search = RandomizedSearchCV(
    model,
    param_dist,
    n_iter=50,         # Try 50 random combinations
    cv=5,
    scoring='accuracy',
    n_jobs=-1,
    random_state=42
)

random_search.fit(X_train, y_train)

print(f"Best parameters: {random_search.best_params_}")
print(f"Best CV score: {random_search.best_score_:.3f}")

Random Search Pros/Cons

✅ Explores more of the space with fewer evaluations
✅ Better for high-dimensional spaces
✅ Often finds good solutions faster
❌ Might miss the optimal if unlucky
❌ Not exhaustive
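
To see what "sampling from distributions" looks like, scikit-learn's ParameterSampler draws candidate settings the same way RandomizedSearchCV does internally. A small sketch reusing the param_dist defined above:

from sklearn.model_selection import ParameterSampler

# Draw three example candidates from the distributions above
for candidate in ParameterSampler(param_dist, n_iter=3, random_state=42):
    print(candidate)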

When to Use Which?

  • Few hyperparameters (≤3): Grid Search
  • Many hyperparameters (>3): Random Search
  • Continuous parameters: Random Search
  • Quick exploration: Random Search
  • Final fine-tuning: Grid Search (narrow range)

Practical Strategy

Step 1: Random Search (Broad)

from sklearn.ensemble import GradientBoostingClassifier

# Explore a large range (learning_rate assumes a boosting model;
# random forests have no learning_rate parameter)
param_dist = {
    'max_depth': randint(1, 50),
    'learning_rate': uniform(0.001, 0.5)
}
random_search = RandomizedSearchCV(
    GradientBoostingClassifier(), param_dist, n_iter=50, cv=5, random_state=42
)
random_search.fit(X_train, y_train)
# Suppose the best result lands around max_depth ≈ 10, learning_rate ≈ 0.1

Step 2: Grid Search (Narrow)

# Fine-tune around the values found by the random search
param_grid = {
    'max_depth': [8, 9, 10, 11, 12],
    'learning_rate': [0.05, 0.08, 0.1, 0.12, 0.15]
}
grid_search = GridSearchCV(GradientBoostingClassifier(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

Common Hyperparameters to Tune

Logistic Regression

param_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100],
    'penalty': ['l1', 'l2'],
    'solver': ['liblinear', 'saga']
}

Random Forest

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [10, 20, 30, None],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

XGBoost

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 0.3],
    'subsample': [0.8, 0.9, 1.0]
}

SVM

param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['rbf', 'linear', 'poly'],
    'gamma': ['scale', 'auto', 0.1, 1]
}

Visualizing Results

import pandas as pd
import matplotlib.pyplot as plt

# Collect results from the Method 1 grid search (the random forest grid)
results = pd.DataFrame(grid_search.cv_results_)

# Plot mean CV score vs n_estimators, one line per max_depth
# (averaging over min_samples_split so each point is unique)
plt.figure(figsize=(10, 6))
for depth in [5, 10, 15]:
    subset = results[results['param_max_depth'] == depth]
    means = subset.groupby('param_n_estimators')['mean_test_score'].mean()
    plt.plot(means.index.astype(int), means.values, marker='o', label=f'max_depth={depth}')

plt.xlabel('n_estimators')
plt.ylabel('CV Score')
plt.legend()
plt.title('Hyperparameter Tuning Results')
plt.show()

Advanced: Bayesian Optimization

More efficient than grid/random search:

# pip install scikit-optimize
from skopt import BayesSearchCV
from skopt.space import Integer, Real

# BayesSearchCV expects skopt search-space dimensions, not scipy distributions
search_spaces = {
    'n_estimators': Integer(50, 300),
    'max_depth': Integer(3, 30),
    'max_features': Real(0.1, 1.0)
}

search = BayesSearchCV(
    model,
    search_spaces,
    n_iter=50,
    cv=5,
    random_state=42
)
search.fit(X_train, y_train)

Uses past evaluations to intelligently choose next points to try.

Key Tips

  1. Use cross-validation - Don't tune on a single train-test split
  2. Start broad, then narrow - Random search → Grid search
  3. Don't over-tune - Risk of overfitting to the validation set (see the nested cross-validation sketch after this list)
  4. Keep a holdout test set - Final evaluation on unseen data
  5. Set random_state - For reproducibility
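
One way to follow tips 1, 3, and 4 together is nested cross-validation: run the whole search inside an outer cross-validation loop, so the reported score never comes from data the tuning process saw. A minimal sketch, assuming grid_search is a GridSearchCV object like the one built earlier:

from sklearn.model_selection import cross_val_score

# Outer CV: each fold re-runs the inner search and scores the tuned model
# on data that search never used for tuning
outer_scores = cross_val_score(grid_search, X_train, y_train, cv=5)
print(f"Nested CV accuracy: {outer_scores.mean():.3f} ± {outer_scores.std():.3f}")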

Code Template

from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint

# Include preprocessing in the pipeline so it is refit inside each CV fold
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model', RandomForestClassifier())
])

# Use stepname__param for nested params
param_dist = {
    'model__n_estimators': randint(50, 300),
    'model__max_depth': randint(5, 30)
}

search = RandomizedSearchCV(pipeline, param_dist, n_iter=30, cv=5)
search.fit(X_train, y_train)

# Final test
print(f"Test score: {search.score(X_test, y_test):.3f}")

Key Takeaway

Hyperparameter tuning is essential for good performance. Start with Random Search to explore, then Grid Search to refine. Always use cross-validation and keep a separate test set for final evaluation!

Tags: Machine Learning · Hyperparameters · Grid Search · Random Search · Beginner