AI that builds AI.

What is AutoML?

AutoML automatically searches for the best model and hyperparameters for a given dataset.
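
As a rough illustration of that idea, the sketch below runs the core loop by hand on a toy one-dimensional dataset: enumerate candidate model families and hyperparameter values, score each on a held-out validation split, and keep the best. The model families (`threshold_model`, `knn_model`), the data, and the search space are all made up for illustration. Real AutoML tools add smarter search, preprocessing, and ensembling on top of this.

```python
import random

random.seed(0)

# Toy 1-D dataset: the label is 1 when x > 0.6, with no noise.
xs = [random.random() for _ in range(200)]
data = [(x, int(x > 0.6)) for x in xs]
train, val = data[:150], data[150:]


def threshold_model(cut):
    """Model family 1: predict 1 when x > cut (needs no training data)."""
    def fit(train_rows):
        return lambda x: int(x > cut)
    return fit


def knn_model(k):
    """Model family 2: k-nearest-neighbours majority vote (k is odd, so no ties)."""
    def fit(train_rows):
        def predict(x):
            nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
            votes = sum(label for _, label in nearest)
            return int(votes * 2 > k)
        return predict
    return fit


def accuracy(predict, rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)


# Search space: every (family, hyperparameter) pair is one candidate.
candidates = {f"threshold(cut={c})": threshold_model(c) for c in (0.2, 0.4, 0.6, 0.8)}
candidates.update({f"knn(k={k})": knn_model(k) for k in (1, 5, 15)})

# The AutoML loop: fit each candidate, score it on validation data, keep the best.
results = {name: accuracy(fit(train), val) for name, fit in candidates.items()}
best_name = max(results, key=results.get)

print(best_name, results[best_name])
```

Because the labels were generated with a cut at 0.6, the `threshold(cut=0.6)` candidate classifies the validation split perfectly. On real data no candidate matches the generating process this neatly, which is why the search space matters as much as the search.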

**Goal**: Make ML accessible to everyone

Why AutoML?

- Save time on experimentation
- Find models you wouldn't try manually
- Search hyperparameters more systematically than manual tuning
- Democratize ML

Auto-Sklearn

Automated sklearn pipeline:

```python
from autosklearn.classification import AutoSklearnClassifier

# Create AutoML model
automl = AutoSklearnClassifier(
    time_left_for_this_task=3600,  # 1 hour
    per_run_time_limit=300,        # 5 min per model
)

# Fit - tries many models automatically
automl.fit(X_train, y_train)

# Get best model
print(automl.show_models())

# Predict
predictions = automl.predict(X_test)

# See what worked best
print(automl.leaderboard())
```

TPOT - Genetic Programming

Evolves ML pipelines:

```python
from tpot import TPOTClassifier

# Genetic algorithm to find best pipeline
tpot = TPOTClassifier(
    generations=5,
    population_size=50,
    verbosity=2,
    random_state=42
)

tpot.fit(X_train, y_train)

# Get accuracy
print(tpot.score(X_test, y_test))

# Export best pipeline as Python code
tpot.export('best_pipeline.py')
```

Generated pipeline might look like:

```python
# Auto-generated by TPOT
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, max_depth=10))
])
```
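
To give a feel for what "evolving" a pipeline means, here is a toy genetic search in plain Python. It is not TPOT's algorithm or API. The pipeline is reduced to three hypothetical choices (`scaler`, `model`, `depth`), and `fitness` is a made-up scoring function standing in for cross-validated accuracy. The point is only the loop of selection, crossover, and mutation.

```python
import random

# Hypothetical search space: each pipeline is one choice per slot.
SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "model": ["logreg", "tree", "forest"],
    "depth": [2, 4, 6, 8, 10],
}


def fitness(pipeline):
    """Made-up score standing in for cross-validated accuracy."""
    score = 0.5
    score += 0.1 if pipeline["scaler"] == "standard" else 0.0
    score += 0.2 if pipeline["model"] == "forest" else 0.0
    score += 0.02 * pipeline["depth"]
    return score


def random_pipeline(rng):
    return {slot: rng.choice(options) for slot, options in SPACE.items()}


def crossover(a, b, rng):
    """Child takes each slot from one of its two parents."""
    return {slot: rng.choice([a[slot], b[slot]]) for slot in SPACE}


def mutate(pipeline, rng, rate=0.2):
    """Re-sample each slot with probability `rate`."""
    return {
        slot: rng.choice(SPACE[slot]) if rng.random() < rate else value
        for slot, value in pipeline.items()
    }


def evolve(generations=5, population_size=20, seed=42):
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged as parents (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Variation: refill the population with mutated offspring.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(best, round(fitness(best), 2))
```

Keeping the parents unchanged means the best score never drops from one generation to the next. The `generations` and `population_size` arguments play the same role as the identically named TPOT parameters above: together they bound how many candidates are evaluated.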

H2O AutoML

Enterprise-grade AutoML:

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()

# Load data
train = h2o.import_file("train.csv")
test = h2o.import_file("test.csv")  # assumed file name, mirroring train.csv

# Specify target and features
y = "target"
X = train.columns
X.remove(y)

# Run AutoML
aml = H2OAutoML(max_models=20, max_runtime_secs=3600)
aml.train(x=X, y=y, training_frame=train)

# View leaderboard
lb = aml.leaderboard
print(lb.head())

# Best model
best_model = aml.leader
predictions = best_model.predict(test)
```

Neural Architecture Search (NAS)

Find best neural network architecture:

```python
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

def build_model(hp):
    model = Sequential()
    # Tune number of layers
    for i in range(hp.Int('num_layers', 1, 5)):
        model.add(Dense(
            units=hp.Int(f'units_{i}', 32, 512, step=32),
            activation=hp.Choice('activation', ['relu', 'tanh'])
        ))
    model.add(Dense(10, activation='softmax'))
    # Tune learning rate
    model.compile(
        optimizer=Adam(hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model

# Search for best architecture
tuner = kt.RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=50,
    directory='nas_results'
)

tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))

# Get best model
best_model = tuner.get_best_models(num_models=1)[0]
```
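
The `hp.Int`, `hp.Choice`, and `hp.Float(..., sampling='log')` calls above define a search space, and random search draws independent samples from it. The sketch below imitates that sampling in plain Python to show what one trial's configuration might look like. `sample_config` is a hypothetical helper, not part of Keras Tuner, and the log-uniform draw uses the standard construction (uniform in the exponent), which is assumed here to match the intent of `sampling='log'`.

```python
import math
import random


def sample_config(rng):
    """Draw one architecture from the same ranges as build_model (hypothetical helper)."""
    num_layers = rng.randint(1, 5)  # inclusive on both ends
    activation = rng.choice(["relu", "tanh"])  # one activation shared by all layers
    # Units per layer: 32, 64, ..., 512 (step of 32).
    units = [rng.randrange(32, 512 + 1, 32) for _ in range(num_layers)]
    # Log-uniform learning rate: uniform in log10 space between 1e-4 and 1e-2.
    log_lr = rng.uniform(math.log10(1e-4), math.log10(1e-2))
    return {
        "num_layers": num_layers,
        "units": units,
        "activation": activation,
        "learning_rate": 10 ** log_lr,
    }


rng = random.Random(0)
trials = [sample_config(rng) for _ in range(50)]  # mirrors max_trials=50

# With log sampling, on average half the draws fall below the geometric midpoint 1e-3.
below = sum(t["learning_rate"] < 1e-3 for t in trials)
print(f"{below}/50 learning rates below 1e-3")
print(trials[0])
```

Sampling the learning rate on a log scale spreads trials evenly across orders of magnitude. A linear draw over the same range would put roughly 90% of trials between 1e-3 and 1e-2.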

Optuna - Hyperparameter Optimization

```python
import optuna
from xgboost import XGBClassifier

def objective(trial):
    # Suggest hyperparameters
    n_estimators = trial.suggest_int('n_estimators', 50, 500)
    max_depth = trial.suggest_int('max_depth', 2, 32)
    learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)
    # Train model
    model = XGBClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        learning_rate=learning_rate
    )
    model.fit(X_train, y_train)
    # Return metric to optimize
    return model.score(X_val, y_val)

# Optimize
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

# Best hyperparameters
print(f"Best params: {study.best_params}")
print(f"Best score: {study.best_value}")

# Visualize optimization (each call returns a Plotly figure)
optuna.visualization.plot_optimization_history(study).show()
optuna.visualization.plot_param_importances(study).show()
```

Google Cloud AutoML

```python
from google.cloud import automl

project_id = "your-project-id"  # placeholder: replace with your GCP project ID

# Create client
client = automl.AutoMlClient()

# Create dataset
dataset = client.create_dataset(
    parent=f"projects/{project_id}/locations/us-central1",
    dataset={
        "display_name": "my_dataset",
        "image_classification_dataset_metadata": {}
    }
)

# Import images
# Train model (automatically)
# Deploy model
# All handled by Google Cloud
```

Benefits

- Save development time
- Try many models automatically
- Good starting point
- Reproducible

Limitations

- Expensive (compute)
- Black box (less control)
- May not beat expert tuning
- Limited custom features

Best Practices

1. Start with AutoML for baseline
2. Use found architecture as starting point
3. Combine with domain knowledge
4. Set reasonable time budget
5. Always validate manually
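
Practices 4 and 5 can be made concrete with a small sketch: run the search until a wall-clock budget expires, then check the winner once on data the search never saw. Everything here is illustrative. `true_quality` and `noisy_score` are hypothetical stand-ins for training and scoring a candidate, and the Gaussian noise mimics the way a validation score tends to flatter whichever candidate happens to win.

```python
import random
import time


def true_quality(config):
    """Made-up ground truth: quality peaks at config = 0.3."""
    return 0.9 - abs(config - 0.3)


def noisy_score(config, rng):
    """Hypothetical stand-in for scoring a trained candidate on one data split."""
    return true_quality(config) + rng.gauss(0, 0.02)


def budgeted_search(time_budget_s, rng, max_trials=10_000):
    """Random search that stops at a wall-clock deadline or a trial cap."""
    deadline = time.monotonic() + time_budget_s
    best_config, best_val = None, float("-inf")
    trials = 0
    while True:  # always runs at least one trial
        config = rng.random()
        val_score = noisy_score(config, rng)  # validation split only
        if val_score > best_val:
            best_config, best_val = config, val_score
        trials += 1
        if time.monotonic() >= deadline or trials >= max_trials:
            break
    return best_config, best_val, trials


rng = random.Random(0)
best_config, best_val, n_trials = budgeted_search(time_budget_s=0.05, rng=rng)

# Manual validation: score the winner once on data the search never used.
test_score = noisy_score(best_config, rng)

print(f"trials run: {n_trials}")
print(f"winner's validation score: {best_val:.3f}")
print(f"winner's held-out score:   {test_score:.3f}")
```

The number of trials depends on the machine, since the budget is wall-clock time. The held-out score is the one to report, because the validation score was used to pick the winner.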

Remember

- AutoML is a powerful starting point
- Not a replacement for ML knowledge
- Great for prototyping
- Can discover surprising solutions

AutoML and Neural Architecture Search