
MLOps Basics

Best practices for ML in production.

Robert Anderson
December 18, 2025

What is MLOps?

MLOps applies DevOps practices (automation, versioning, testing, monitoring) to machine learning systems.

Goal: Reliably deploy, monitor, and maintain ML systems in production

Key Components

  1. Version Control: Code, data, models
  2. CI/CD: Automated testing and deployment
  3. Monitoring: Track model performance
  4. Reproducibility: Same code = same results
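Reproducibility in particular depends on pinning every source of randomness. A minimal sketch (the seed value and helper name are arbitrary; a real project would also seed its ML framework):

```python
import os
import random

import numpy as np

def set_seed(seed: int = 42) -> None:
    """Pin the common sources of randomness so reruns give identical results."""
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # If you use a DL framework, seed it too, e.g. torch.manual_seed(seed)

set_seed(42)
first_run = np.random.rand(3)
set_seed(42)
second_run = np.random.rand(3)
assert (first_run == second_run).all()  # same seed, same numbers
```

Pinning dependency versions (requirements.txt, lock files) closes the remaining gap between "same code" and "same results".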

Version Control

# Git for code
git add model.py data_processing.py
git commit -m "Update model architecture"

# DVC for data and models
dvc add data/training_data.csv
dvc add models/model.pkl
git add data/training_data.csv.dvc models/model.pkl.dvc
git commit -m "Update data and model"

Experiment Tracking

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Assumes X, y (house-price features and targets) are already loaded
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestRegressor(n_estimators=100, max_depth=10)

# Start experiment
mlflow.set_experiment("house_price_prediction")

with mlflow.start_run():
    # Log hyperparameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 10)
    
    # Train model
    model.fit(X_train, y_train)
    
    # Log metrics (score() returns R^2 for regressors)
    r2 = model.score(X_test, y_test)
    mlflow.log_metric("r2", r2)
    
    # Log the fitted model as an artifact of this run
    mlflow.sklearn.log_model(model, "model")

Automated Training Pipeline

# training_pipeline.py
from sklearn.model_selection import train_test_split

# load_data, preprocess, train_model, evaluate_model, save_model and
# load_best_accuracy are project helpers assumed to exist elsewhere

def training_pipeline():
    # 1. Load data
    data = load_data('s3://bucket/data.csv')
    
    # 2. Preprocess
    X, y = preprocess(data)
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    
    # 3. Train
    model = train_model(X_train, y_train)
    
    # 4. Evaluate
    metrics = evaluate_model(model, X_test, y_test)
    
    # 5. Save only if it beats the current production model
    current_best = load_best_accuracy()  # e.g., read from the model registry
    if metrics['accuracy'] > current_best:
        save_model(model, 'production_model.pkl')
        
    return metrics

# Run automatically (e.g., on a schedule or from CI)
if __name__ == "__main__":
    results = training_pipeline()
    print(f"New model accuracy: {results['accuracy']}")

CI/CD for ML

name: ML Pipeline

on:
  push:
    branches: [main]

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.9"
      
      - name: Install dependencies
        run: pip install -r requirements.txt
      
      - name: Run tests
        run: pytest tests/
      
      - name: Train model
        run: python train.py
      
      - name: Validate model
        run: python validate_model.py
      
      - name: Deploy if tests pass
        run: python deploy.py
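The validate_model.py step can be as simple as a hard accuracy gate that fails the job on a bad model. A hypothetical sketch (the threshold, file name, and the hard-coded metric value are assumptions; a real script would load the model and a held-out set):

```python
# validate_model.py -- fail the CI job if the candidate model is too weak
import sys

def validate(accuracy: float, threshold: float = 0.85) -> bool:
    """Return True if the candidate model clears the minimum accuracy bar."""
    return accuracy >= threshold

if __name__ == "__main__":
    # Assumption: the metric comes from the training step's output
    candidate_accuracy = 0.91
    if not validate(candidate_accuracy):
        sys.exit(1)  # non-zero exit fails the GitHub Actions step
    print("validation passed")
```

Because each step only runs if the previous one succeeded, a failing gate here stops deploy.py from ever running.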

Monitoring

import prometheus_client
from flask import Flask, request, jsonify

app = Flask(__name__)
# model is assumed to be loaded at startup, e.g. model = joblib.load("model.pkl")

# Metrics
predictions_counter = prometheus_client.Counter(
    'predictions_total',
    'Total predictions made'
)

prediction_time = prometheus_client.Histogram(
    'prediction_duration_seconds',
    'Time spent making prediction'
)

@app.route('/predict', methods=['POST'])
@prediction_time.time()
def predict():
    data = request.json
    
    # Make prediction
    prediction = model.predict([data['features']])
    
    # Track metrics
    predictions_counter.inc()
    
    return jsonify({'prediction': int(prediction[0])})
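The counters above are only useful once Prometheus can scrape them. One way is to expose a /metrics route from the same app; a self-contained sketch using prometheus_client's text exposition helpers (the fresh registry is just to keep the example isolated):

```python
from flask import Flask, Response
from prometheus_client import (CONTENT_TYPE_LATEST, CollectorRegistry,
                               Counter, generate_latest)

registry = CollectorRegistry()  # fresh registry so the example is self-contained
predictions_counter = Counter(
    'predictions_total', 'Total predictions made', registry=registry
)

app = Flask(__name__)

@app.route('/metrics')
def metrics():
    # Prometheus scrapes this endpoint; render all registered metrics as text
    return Response(generate_latest(registry), mimetype=CONTENT_TYPE_LATEST)

predictions_counter.inc()
```

Point a Prometheus scrape job at this endpoint and the predictions_total series shows up automatically.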

Data Drift Detection

from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Compare training (reference) vs production (current) data
report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference_data, current_data=production_data)

# Alert if drift detected (exact result keys vary across evidently versions)
drift_share = report.as_dict()["metrics"][0]["result"]["share_of_drifted_columns"]
if drift_share > 0.3:
    send_alert("Data drift detected!")

Model Registry

import mlflow

# Register model
model_uri = "runs:/abc123/model"
mlflow.register_model(model_uri, "house_price_model")

# Transition to production
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(
    name="house_price_model",
    version=3,
    stage="Production"
)

# Load production model
model = mlflow.pyfunc.load_model("models:/house_price_model/Production")

Feature Store

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Get features for one entity (to_dict returns a dict of lists, one value per row)
features = store.get_online_features(
    features=[
        'user_features:age',
        'user_features:income',
        'transaction_features:avg_amount'
    ],
    entity_rows=[{"user_id": 123}]
).to_dict()

# Use in prediction (take each feature's first value; skip the entity key)
row = [features['age'][0], features['income'][0], features['avg_amount'][0]]
prediction = model.predict([row])

Best Practices

  1. Version everything: code, data, models
  2. Automate training: CI/CD pipelines
  3. Monitor constantly: accuracy, latency, drift
  4. Test thoroughly: unit tests, integration tests
  5. Document: model cards, data sheets
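For point 4, even a couple of small pytest checks catch regressions early. A hypothetical example (the preprocess helper here is a toy stand-in for the project's real one):

```python
# test_model.py -- run with `pytest tests/`
import numpy as np

def preprocess(raw):
    """Toy stand-in for the project's real preprocessing (assumption)."""
    arr = np.asarray(raw, dtype=float)
    return (arr - arr.mean()) / (arr.std() + 1e-9)

def test_preprocess_is_standardized():
    X = preprocess([1.0, 2.0, 3.0, 4.0])
    assert X.shape == (4,)
    assert abs(X.mean()) < 1e-6  # zero mean after standardization

def test_prediction_in_valid_range():
    # A binary classifier should only ever emit 0 or 1
    fake_predictions = np.array([0, 1, 1, 0])
    assert set(np.unique(fake_predictions)) <= {0, 1}
```

Unit tests like these run in the CI job before training, so a broken preprocessing change never reaches the training step.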

Tools Ecosystem

  • Experiment tracking: MLflow, Weights & Biases
  • Pipelines: Kubeflow, Airflow
  • Monitoring: Prometheus, Grafana
  • Serving: TensorFlow Serving, Seldon

Remember

  • MLOps = DevOps + ML
  • Automate everything
  • Monitor in production
  • Version all artifacts
#AI#Intermediate#MLOps