Hyperparameter Tuning: Grid Search vs Random Search
Learn how to find the best hyperparameters for your ML models using Grid Search and Random Search.
Your model has knobs you can adjust. Tuning them properly can dramatically improve performance.
Parameters vs Hyperparameters
Parameters: Learned from data during training
- Linear regression coefficients
- Neural network weights and biases
Hyperparameters: Set before training, control the learning process
- Learning rate
- Number of trees in random forest
- Regularization strength (C, alpha)
- Max depth of decision tree
You choose hyperparameters. The model learns parameters.
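To make the distinction concrete, here is a minimal sketch (the logistic regression and synthetic data are just assumed for illustration): C is a hyperparameter you set up front, while the coefficients in coef_ are parameters learned during fit().

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Hyperparameter: chosen by you, before training
model = LogisticRegression(C=0.5)

# Parameters: learned from the data during training
model.fit(X, y)
print(model.coef_)       # learned weights
print(model.intercept_)  # learned intercept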
Why Tune Hyperparameters?
Default values aren't optimal for your specific data:
# Default might give 78% accuracy
model = RandomForestClassifier()
# Tuned might give 89% accuracy!
model = RandomForestClassifier(n_estimators=200, max_depth=15, min_samples_split=5)
Method 1: Grid Search
Try ALL combinations of specified values.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
# Define parameter grid
param_grid = {
'n_estimators': [50, 100, 200],
'max_depth': [5, 10, 15, None],
'min_samples_split': [2, 5, 10]
}
# Total combinations: 3 × 4 × 3 = 36
model = RandomForestClassifier()
grid_search = GridSearchCV(
model,
param_grid,
cv=5, # 5-fold cross-validation
scoring='accuracy',
n_jobs=-1 # Use all CPU cores
)
grid_search.fit(X_train, y_train)
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.3f}")
# Use the best model
best_model = grid_search.best_estimator_
predictions = best_model.predict(X_test)
Grid Search Pros/Cons
✅ Exhaustive - tries everything
✅ Guaranteed to find best combo in the grid
❌ Cost grows exponentially with the number of parameters: 3 params × 10 values each = 10³ = 1,000 combos! (see the quick check after this list)
❌ Wastes time on bad regions
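You can count the grid up front before spending any compute. This quick check is a sketch using sklearn's ParameterGrid on the grid defined above; remember that each combination is also refit once per CV fold.

from sklearn.model_selection import ParameterGrid

# Count how many combinations GridSearchCV will evaluate (before the cv multiplier)
print(len(ParameterGrid(param_grid)))   # 36 for the grid above

# Three parameters with ten values each already explode to 1,000 combos
big_grid = {'a': list(range(10)), 'b': list(range(10)), 'c': list(range(10))}
print(len(ParameterGrid(big_grid)))     # 1000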
Method 2: Random Search
Sample random combinations from parameter distributions.
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint, uniform
# Define parameter distributions
param_dist = {
    'n_estimators': randint(50, 300),      # Integers in [50, 300) -- upper bound exclusive
    'max_depth': randint(3, 30),           # Integers in [3, 30)
    'min_samples_split': randint(2, 20),   # Integers in [2, 20)
    'min_samples_leaf': randint(1, 10),    # Integers in [1, 10)
    'max_features': uniform(0.1, 0.9)      # Floats in [0.1, 1.0)
}
model = RandomForestClassifier()
random_search = RandomizedSearchCV(
model,
param_dist,
n_iter=50, # Try 50 random combinations
cv=5,
scoring='accuracy',
n_jobs=-1,
random_state=42
)
random_search.fit(X_train, y_train)
print(f"Best parameters: {random_search.best_params_}")
print(f"Best CV score: {random_search.best_score_:.3f}")
Random Search Pros/Cons
✅ Explores more of the space with fewer evaluations
✅ Better for high-dimensional spaces
✅ Often finds good solutions faster
❌ Might miss the optimal if unlucky
❌ Not exhaustive
When to Use Which?
| Situation | Recommendation |
|---|---|
| Few hyperparameters (≤3) | Grid Search |
| Many hyperparameters (>3) | Random Search |
| Continuous parameters | Random Search |
| Quick exploration | Random Search |
| Final fine-tuning | Grid Search (narrow range) |
Practical Strategy
Step 1: Random Search (Broad)
# Explore a large range (shown here for gradient-boosting-style parameters)
param_dist = {
    'max_depth': randint(1, 50),
    'learning_rate': uniform(0.001, 0.5)
}
random_search.fit(X_train, y_train)
# Found: max_depth ≈ 10, learning_rate ≈ 0.1
Step 2: Grid Search (Narrow)
# Fine-tune around best values
param_grid = {
'max_depth': [8, 9, 10, 11, 12],
'learning_rate': [0.05, 0.08, 0.1, 0.12, 0.15]
}
grid_search.fit(X_train, y_train)
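One way to wire the two steps together is to read best_params_ from the broad random search and build the narrow grid around it programmatically. This is a sketch, assuming the parameters above belong to a gradient boosting model (e.g., GradientBoostingClassifier) and that random_search has already been fitted:

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Read the broad random search's winners...
best = random_search.best_params_

# ...and fine-tune in a tight neighborhood around them
param_grid = {
    'max_depth': [max(best['max_depth'] - 1, 1), best['max_depth'], best['max_depth'] + 1],
    'learning_rate': [round(best['learning_rate'] * f, 4) for f in (0.5, 0.8, 1.0, 1.2, 1.5)]
}

grid_search = GridSearchCV(GradientBoostingClassifier(), param_grid,
                           cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train, y_train)
print(grid_search.best_params_)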
Common Hyperparameters to Tune
Logistic Regression
param_grid = {
'C': [0.001, 0.01, 0.1, 1, 10, 100],
'penalty': ['l1', 'l2'],
'solver': ['liblinear', 'saga']
}
Random Forest
param_grid = {
'n_estimators': [100, 200, 300],
'max_depth': [10, 20, 30, None],
'min_samples_split': [2, 5, 10],
'min_samples_leaf': [1, 2, 4]
}
XGBoost
param_grid = {
'n_estimators': [100, 200, 300],
'max_depth': [3, 5, 7],
'learning_rate': [0.01, 0.1, 0.3],
'subsample': [0.8, 0.9, 1.0]
}
SVM
param_grid = {
'C': [0.1, 1, 10],
'kernel': ['rbf', 'linear', 'poly'],
'gamma': ['scale', 'auto', 0.1, 1]
}
Visualizing Results
import pandas as pd
import matplotlib.pyplot as plt
# Get results
results = pd.DataFrame(grid_search.cv_results_)
# Plot
plt.figure(figsize=(10, 6))
for depth in [5, 10, 15]:
    mask = results['param_max_depth'] == depth
    # Average over the other grid parameters so each n_estimators value appears once
    scores = results[mask].groupby('param_n_estimators')['mean_test_score'].mean()
    plt.plot(scores.index, scores.values, marker='o', label=f'max_depth={depth}')
plt.xlabel('n_estimators')
plt.ylabel('CV Score')
plt.legend()
plt.title('Hyperparameter Tuning Results')
plt.show()
Advanced: Bayesian Optimization
More efficient than grid/random search:
# pip install scikit-optimize
from skopt import BayesSearchCV
from skopt.space import Integer, Real

# BayesSearchCV expects skopt search spaces, not scipy distributions
search_spaces = {
    'n_estimators': Integer(50, 300),
    'max_depth': Integer(3, 30),
    'max_features': Real(0.1, 1.0)
}

search = BayesSearchCV(model, search_spaces, n_iter=50, cv=5)
search.fit(X_train, y_train)
Uses past evaluations to intelligently choose next points to try.
Key Tips
- Use cross-validation - Don't tune on a single train-test split
- Start broad, then narrow - Random search → Grid search
- Don't over-tune - Risk of overfitting to validation set
- Keep a holdout test set - Final evaluation on unseen data
- Set random_state - For reproducibility
Code Template
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from scipy.stats import randint
# Include preprocessing in pipeline
pipeline = Pipeline([
('scaler', StandardScaler()),
('model', RandomForestClassifier())
])
# Use stepname__param for nested params
param_dist = {
'model__n_estimators': randint(50, 300),
'model__max_depth': randint(5, 30)
}
search = RandomizedSearchCV(pipeline, param_dist, n_iter=30, cv=5, n_jobs=-1, random_state=42)
search.fit(X_train, y_train)
# Final test
print(f"Test score: {search.score(X_test, y_test):.3f}")
Key Takeaway
Hyperparameter tuning is essential for good performance. Start with Random Search to explore, then Grid Search to refine. Always use cross-validation and keep a separate test set for final evaluation!