AutoML and Neural Architecture Search
Automating the machine learning pipeline.
Dr. Patricia Moore
December 18, 2025
AI that builds AI.
What is AutoML?
AutoML automatically searches for the best model and hyperparameters for a given dataset.
Goal: Make ML accessible to everyone
Why AutoML?
- Save time on experimentation
- Find models you wouldn't try manually
- Optimize better than humans
- Democratize ML
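At its core, AutoML automates a loop you could write by hand: train several candidate models, score each one, keep the best. A minimal sketch with scikit-learn (the candidate models and synthetic dataset here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy dataset standing in for real training data
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

# Candidate models an AutoML system might try
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=42),
    "gbm": GradientBoostingClassifier(random_state=42),
}

# Score each candidate with cross-validation and keep the best
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
print(best_name, round(scores[best_name], 3))
```

Real AutoML systems add smarter search, preprocessing choices, and time budgets on top of this loop, but the skeleton is the same.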
Auto-Sklearn
auto-sklearn automates the scikit-learn pipeline: model selection, hyperparameter tuning, and ensembling.
from autosklearn.classification import AutoSklearnClassifier

# Create AutoML model
automl = AutoSklearnClassifier(
    time_left_for_this_task=3600,  # total budget: 1 hour
    per_run_time_limit=300,        # 5 min per candidate model
)

# Fit - tries many models automatically
automl.fit(X_train, y_train)

# Inspect the models in the final ensemble
print(automl.show_models())

# Predict
predictions = automl.predict(X_test)

# See what worked best
print(automl.leaderboard())
TPOT - Genetic Programming
TPOT evolves ML pipelines with genetic programming:
from tpot import TPOTClassifier
# Genetic algorithm to find best pipeline
# Genetic algorithm to find best pipeline
tpot = TPOTClassifier(
    generations=5,
    population_size=50,
    verbosity=2,
    random_state=42
)
tpot.fit(X_train, y_train)
# Get accuracy
print(tpot.score(X_test, y_test))
# Export best pipeline as Python code
tpot.export('best_pipeline.py')
Generated pipeline might look like:
# Auto-generated by TPOT
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, max_depth=10))
])
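The exported file is plain scikit-learn, so you can fit it like any other estimator. For example, with a toy dataset standing in for real data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same structure as the pipeline TPOT exported above
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=100, max_depth=10))
])

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit and evaluate like any scikit-learn estimator
pipeline.fit(X_train, y_train)
print(pipeline.score(X_test, y_test))
```

Because the output is ordinary Python, you can version it, edit it, and deploy it without any TPOT dependency.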
H2O AutoML
Enterprise-grade AutoML:
import h2o
from h2o.automl import H2OAutoML
h2o.init()
# Load train and test data as H2O frames
train = h2o.import_file("train.csv")
test = h2o.import_file("test.csv")
# Specify target and features
y = "target"
X = train.columns
X.remove(y)
# Run AutoML
aml = H2OAutoML(max_models=20, max_runtime_secs=3600)
aml.train(x=X, y=y, training_frame=train)
# View leaderboard
lb = aml.leaderboard
print(lb.head())
# Best model
best_model = aml.leader
predictions = best_model.predict(test)
Neural Architecture Search (NAS)
NAS searches for the best neural network architecture. With Keras Tuner:
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

def build_model(hp):
    model = Sequential()
    # Tune the number of layers and their widths
    for i in range(hp.Int('num_layers', 1, 5)):
        model.add(Dense(
            units=hp.Int(f'units_{i}', 32, 512, step=32),
            activation=hp.Choice('activation', ['relu', 'tanh'])
        ))
    model.add(Dense(10, activation='softmax'))
    # Tune the learning rate on a log scale
    model.compile(
        optimizer=Adam(hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
# Search for best architecture
# Search for best architecture
tuner = kt.RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=50,
    directory='nas_results'
)
tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))
# Get best model
best_model = tuner.get_best_models(num_models=1)[0]
Optuna - Hyperparameter Optimization
import optuna
from xgboost import XGBClassifier

def objective(trial):
    # Suggest hyperparameters for this trial
    n_estimators = trial.suggest_int('n_estimators', 50, 500)
    max_depth = trial.suggest_int('max_depth', 2, 32)
    learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)

    # Train model with the suggested values
    model = XGBClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        learning_rate=learning_rate
    )
    model.fit(X_train, y_train)

    # Return the metric to maximize
    return model.score(X_val, y_val)
# Optimize
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)
# Best hyperparameters
print(f"Best params: {study.best_params}")
print(f"Best score: {study.best_value}")
# Visualize the optimization (these return Plotly figures)
optuna.visualization.plot_optimization_history(study).show()
optuna.visualization.plot_param_importances(study).show()
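What Optuna improves on is the naive version of this loop: plain random search, where each trial ignores everything learned from earlier trials. A hand-rolled equivalent (using a random forest instead of XGBoost so the sketch is self-contained) looks like:

```python
import random

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=1)

rng = random.Random(42)
best_score, best_params = -1.0, None

# Random search: every trial draws hyperparameters independently,
# with no learning from previous results (unlike Optuna's TPE sampler)
for _ in range(10):
    params = {
        "n_estimators": rng.randint(50, 300),
        "max_depth": rng.randint(2, 16),
    }
    model = RandomForestClassifier(random_state=0, **params)
    model.fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, round(best_score, 3))
```

Optuna's samplers concentrate later trials in promising regions of the search space, which usually reaches a good configuration in far fewer trials than this.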
Google Cloud AutoML
from google.cloud import automl
# Create client
client = automl.AutoMlClient()
# Create dataset
dataset = client.create_dataset(
    parent=f"projects/{project_id}/locations/us-central1",
    dataset={
        "display_name": "my_dataset",
        "image_classification_dataset_metadata": {}
    }
)

# Importing images, training, and deploying the model
# are then handled by Google Cloud
Benefits
- Save development time
- Try many models automatically
- Good starting point
- Reproducible
Limitations
- Expensive (compute)
- Black box (less control)
- May not beat expert tuning
- Limited custom features
Best Practices
- Start with AutoML for baseline
- Use found architecture as starting point
- Combine with domain knowledge
- Set reasonable time budget
- Always validate manually
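For the last point, "validate manually" can start as simply as checking the chosen model against a trivial baseline on a held-out set; if it cannot clearly beat a majority-class predictor, something is wrong. A sketch, where the random forest and synthetic data stand in for your AutoML winner and real data:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

# Stand-in for the model AutoML selected
automl_pick = RandomForestClassifier(random_state=7).fit(X_train, y_train)

# Trivial baseline: always predict the most frequent class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

model_acc = automl_pick.score(X_test, y_test)
baseline_acc = baseline.score(X_test, y_test)
print(model_acc > baseline_acc)
```

From there, add the checks AutoML cannot do for you: inspect errors by subgroup, confirm the validation split matches production conditions, and watch for leakage.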
Remember
- AutoML is a powerful starting point
- Not a replacement for ML knowledge
- Great for prototyping
- Can discover surprising solutions
#AI#Advanced#AutoML