AI · 7 min read

AutoML and Neural Architecture Search

Automating the machine learning pipeline.

Dr. Patricia Moore
December 18, 2025

AI that builds AI.

What is AutoML?

Automatically find the best model and hyperparameters for a given dataset.

Goal: Make ML accessible to everyone
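
To see what is being automated, here is the manual version of the same search: a minimal, illustrative scikit-learn grid search over a single model family. AutoML tools generalize this idea across many model families and preprocessing steps (X_train/y_train are assumed, as in the snippets below).

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Manual search: one model family, a hand-picked grid
param_grid = {
    'n_estimators': [100, 300],
    'max_depth': [5, 10, None],
}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_)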

Why AutoML?

  • Save time on experimentation
  • Find models you wouldn't try manually
  • Search more systematically than manual tuning
  • Democratize ML

Auto-Sklearn

Automated sklearn pipeline:

from autosklearn.classification import AutoSklearnClassifier

# Create AutoML model
automl = AutoSklearnClassifier(
    time_left_for_this_task=3600,  # 1 hour
    per_run_time_limit=300,         # 5 min per model
)

# Fit - tries many models automatically
automl.fit(X_train, y_train)

# Inspect the models in the final ensemble
print(automl.show_models())

# Predict
predictions = automl.predict(X_test)

# See what worked best
print(automl.leaderboard())

TPOT - Genetic Programming

Evolves ML pipelines:

from tpot import TPOTClassifier

# Genetic algorithm to find best pipeline
tpot = TPOTClassifier(
    generations=5,
    population_size=50,
    verbosity=2,
    random_state=42
)

tpot.fit(X_train, y_train)

# Get accuracy
print(tpot.score(X_test, y_test))

# Export best pipeline as Python code
tpot.export('best_pipeline.py')

The generated pipeline might look like this:

# Auto-generated by TPOT
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, max_depth=10))
])
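
The exported file is plain scikit-learn, so you can fit and evaluate it like any other pipeline (a quick sketch reusing the train/test splits from above):

# Fit and evaluate the exported pipeline like any sklearn estimator
pipeline.fit(X_train, y_train)
print(pipeline.score(X_test, y_test))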

H2O AutoML

Enterprise-grade AutoML:

import h2o
from h2o.automl import H2OAutoML

h2o.init()

# Load data
train = h2o.import_file("train.csv")

# Specify target and features
y = "target"
X = train.columns
X.remove(y)

# Run AutoML
aml = H2OAutoML(max_models=20, max_runtime_secs=3600)
aml.train(x=X, y=y, training_frame=train)

# View leaderboard
lb = aml.leaderboard
print(lb.head())

# Best model
best_model = aml.leader

# Score new data (assumes a test frame loaded the same way as train)
test = h2o.import_file("test.csv")
predictions = best_model.predict(test)
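
To keep the winner around for later scoring, H2O can serialize the leader to disk (a minimal sketch; the path is illustrative):

# Save the leader model for later reuse
model_path = h2o.save_model(model=aml.leader, path="./models", force=True)
print(model_path)

# Reload it in a later session
best_model = h2o.load_model(model_path)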

Neural Architecture Search (NAS)

Find the best neural network architecture automatically:

import keras_tuner as kt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

def build_model(hp):
    model = Sequential()
    
    # Tune number of layers
    for i in range(hp.Int('num_layers', 1, 5)):
        model.add(Dense(
            units=hp.Int(f'units_{i}', 32, 512, step=32),
            activation=hp.Choice('activation', ['relu', 'tanh'])
        ))
    
    model.add(Dense(10, activation='softmax'))
    
    # Tune learning rate
    model.compile(
        optimizer=Adam(hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    
    return model

# Search for best architecture
tuner = kt.RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=50,
    directory='nas_results'
)

tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))

# Get best model
best_model = tuner.get_best_models(num_models=1)[0]
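
Rather than reusing the trial model directly, the usual pattern is to pull the winning hyperparameters and retrain from scratch, often for more epochs (a sketch using Keras Tuner's standard API):

# Retrieve the best hyperparameters and rebuild a fresh model
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
model = tuner.hypermodel.build(best_hps)
model.fit(X_train, y_train, epochs=20, validation_data=(X_val, y_val))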

Optuna - Hyperparameter Optimization

import optuna
from xgboost import XGBClassifier

def objective(trial):
    # Suggest hyperparameters
    n_estimators = trial.suggest_int('n_estimators', 50, 500)
    max_depth = trial.suggest_int('max_depth', 2, 32)
    learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)
    
    # Train model
    model = XGBClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        learning_rate=learning_rate
    )
    model.fit(X_train, y_train)
    
    # Return metric to optimize
    return model.score(X_val, y_val)

# Optimize
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

# Best hyperparameters
print(f"Best params: {study.best_params}")
print(f"Best score: {study.best_value}")

# Visualize optimization (each call returns a Plotly figure)
optuna.visualization.plot_optimization_history(study).show()
optuna.visualization.plot_param_importances(study).show()
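
After the study finishes, retrain a final model with the winning parameters before evaluating on held-out data (a sketch; variable names follow the snippet above):

# Final model with the best hyperparameters found by the study
final_model = XGBClassifier(**study.best_params)
final_model.fit(X_train, y_train)
print(final_model.score(X_test, y_test))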

Google Cloud AutoML

from google.cloud import automl

# Create client (project_id below is a placeholder for your GCP project ID)
client = automl.AutoMlClient()

# Create dataset
dataset = client.create_dataset(
    parent=f"projects/{project_id}/locations/us-central1",
    dataset={
        "display_name": "my_dataset",
        "image_classification_dataset_metadata": {}
    }
)

# Import images
# Train model (automatically)
# Deploy model
# All handled by Google Cloud

Benefits

  • Save development time
  • Try many models automatically
  • Good starting point
  • Reproducible

Limitations

  • Expensive (compute)
  • Black box (less control)
  • May not beat expert tuning
  • Limited custom features

Best Practices

  1. Start with AutoML for baseline
  2. Use found architecture as starting point
  3. Combine with domain knowledge
  4. Set reasonable time budget
  5. Always validate manually (see the sketch below)
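
For step 5, one simple manual check is to cross-validate the AutoML winner yourself rather than trusting a single train/test split (an illustrative sketch with scikit-learn, using the exported TPOT pipeline from earlier):

from sklearn.model_selection import cross_val_score

# Independent sanity check of the AutoML-selected pipeline
scores = cross_val_score(pipeline, X_train, y_train, cv=5)
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")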

Remember

  • AutoML is a powerful starting point
  • Not a replacement for ML knowledge
  • Great for prototyping
  • Can discover surprising solutions
#AI#Advanced#AutoML