
MLOps Basics

Best practices for ML in production.

Robert Anderson
December 18, 2025


What is MLOps?

DevOps for Machine Learning.

**Goal**: Reliably deploy and maintain ML systems

Key Components

1. **Version Control**: Code, data, models
2. **CI/CD**: Automated testing and deployment
3. **Monitoring**: Track model performance
4. **Reproducibility**: Same code = same results (a seeding sketch follows this list)
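Reproducibility starts with pinning every source of randomness. A minimal sketch of a seeding helper (the `set_seed` name and the libraries covered are illustrative; extend it with whatever frameworks your stack actually uses):

```python
import os
import random

import numpy as np

def set_seed(seed=42):
    # Pin every source of randomness the training run touches
    random.seed(seed)
    np.random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)

set_seed(42)
```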

Version Control

```bash
# Git for code
git add model.py data_processing.py
git commit -m "Update model architecture"

# DVC for data and models
dvc add data/training_data.csv
dvc add models/model.pkl
git add data/training_data.csv.dvc models/model.pkl.dvc
git commit -m "Update data and model"
```

Experiment Tracking

```python
import mlflow

# Start experiment
mlflow.set_experiment("house_price_prediction")

with mlflow.start_run():
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 10)

    # Train model
    model.fit(X_train, y_train)

    # Log metrics
    accuracy = model.score(X_test, y_test)
    mlflow.log_metric("accuracy", accuracy)

    # Log model
    mlflow.sklearn.log_model(model, "model")
```

Automated Training Pipeline

```python
# training_pipeline.py
from sklearn.model_selection import train_test_split

def training_pipeline():
    # 1. Load data
    data = load_data('s3://bucket/data.csv')

    # 2. Preprocess
    X, y = preprocess(data)
    X_train, X_test, y_train, y_test = train_test_split(X, y)

    # 3. Train
    model = train_model(X_train, y_train)

    # 4. Evaluate
    metrics = evaluate_model(model, X_test, y_test)

    # 5. Save if better than the current production model
    if metrics['accuracy'] > current_best:
        save_model(model, 'production_model.pkl')

    return metrics

# Run automatically
if __name__ == "__main__":
    results = training_pipeline()
    print(f"New model accuracy: {results['accuracy']}")
```

CI/CD for ML

```yaml
# .github/workflows/ml_pipeline.yml
name: ML Pipeline

on:
  push:
    branches: [main]

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.9"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest tests/
      - name: Train model
        run: python train.py
      - name: Validate model
        run: python validate_model.py
      - name: Deploy if tests pass
        run: python deploy.py
```
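The workflow gates deployment on `validate_model.py`, which isn't shown. One possible sketch of such a gate, assuming a joblib-serialized model and a held-out CSV (the threshold, file paths, and column name are illustrative):

```python
# validate_model.py -- hypothetical quality gate; a non-zero exit fails the CI step
import sys

import joblib
import pandas as pd

MIN_ACCURACY = 0.85  # illustrative threshold

def main():
    model = joblib.load('production_model.pkl')
    holdout = pd.read_csv('data/holdout.csv')  # assumed held-out set
    X, y = holdout.drop(columns=['target']), holdout['target']
    accuracy = model.score(X, y)
    print(f"Candidate accuracy: {accuracy:.3f}")
    if accuracy < MIN_ACCURACY:
        sys.exit(1)  # blocks the deploy step

if __name__ == "__main__":
    main()
```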

Monitoring

```python
import prometheus_client
from flask import Flask, request, jsonify

app = Flask(__name__)

# Metrics
predictions_counter = prometheus_client.Counter(
    'predictions_total',
    'Total predictions made'
)

prediction_time = prometheus_client.Histogram(
    'prediction_duration_seconds',
    'Time spent making prediction'
)

@app.route('/predict', methods=['POST'])
@prediction_time.time()
def predict():
    data = request.json

    # Make prediction
    prediction = model.predict([data['features']])

    # Track metrics
    predictions_counter.inc()

    return jsonify({'prediction': int(prediction[0])})
```
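Prometheus pulls metrics over HTTP, so the service also needs to expose them. A minimal sketch using `prometheus_client.generate_latest`, added to the same Flask app:

```python
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST

@app.route('/metrics')
def metrics():
    # Expose all registered metrics in Prometheus text format
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}
```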

Data Drift Detection

```python
from evidently.dashboard import Dashboard
from evidently.tabs import DataDriftTab

# Compare training vs production data
dashboard = Dashboard(tabs=[DataDriftTab()])
dashboard.calculate(reference_data, production_data)

# Alert if drift detected
if dashboard.metrics['data_drift']['share_of_drifted_features'] > 0.3:
    send_alert("Data drift detected!")
```
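Under the hood, drift detection usually reduces to per-feature distribution tests. As an alternative sketch without a dashboard library, a two-sample Kolmogorov-Smirnov test per numeric column (assumes pandas DataFrames; the `alpha` and 30% thresholds are illustrative):

```python
import pandas as pd
from scipy.stats import ks_2samp

def drifted_features(reference: pd.DataFrame, production: pd.DataFrame, alpha=0.05):
    # Flag numeric columns whose production distribution differs
    # significantly from the training (reference) distribution
    numeric_cols = reference.select_dtypes('number').columns
    return [
        col for col in numeric_cols
        if ks_2samp(reference[col], production[col]).pvalue < alpha
    ]

drifted = drifted_features(reference_data, production_data)
if len(drifted) > 0.3 * len(reference_data.select_dtypes('number').columns):
    send_alert(f"Data drift detected in: {drifted}")
```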

Model Registry

```python
import mlflow

# Register model
model_uri = "runs:/abc123/model"
mlflow.register_model(model_uri, "house_price_model")

# Transition to production
client = mlflow.tracking.MlflowClient()
client.transition_model_version_stage(
    name="house_price_model",
    version=3,
    stage="Production"
)

# Load production model
model = mlflow.pyfunc.load_model("models:/house_price_model/Production")
```

Feature Store

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Get features
features = store.get_online_features(
    features=[
        'user_features:age',
        'user_features:income',
        'transaction_features:avg_amount'
    ],
    entity_rows=[{"user_id": 123}]
).to_dict()

# Use in prediction
prediction = model.predict([list(features.values())])
```

Best Practices

1. **Version everything**: code, data, models
2. **Automate training**: CI/CD pipelines
3. **Monitor constantly**: accuracy, latency, drift
4. **Test thoroughly**: unit tests, integration tests (see the pytest sketch after this list)
5. **Document**: model cards, data sheets
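On the testing point, model code deserves the same unit tests as any other code. A minimal pytest sketch (the synthetic fixture, model choice, and thresholds are illustrative):

```python
# tests/test_model.py -- illustrative unit tests for training code
import numpy as np
import pytest
from sklearn.ensemble import RandomForestClassifier

@pytest.fixture
def tiny_dataset():
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))
    y = (X[:, 0] > 0).astype(int)  # simple learnable signal
    return X, y

def test_model_trains_and_predicts(tiny_dataset):
    X, y = tiny_dataset
    model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    preds = model.predict(X)
    assert preds.shape == y.shape
    assert set(preds) <= {0, 1}

def test_model_beats_chance(tiny_dataset):
    X, y = tiny_dataset
    model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    assert model.score(X, y) > 0.9  # easy signal, well above chance
```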

Tools Ecosystem

- **Experiment tracking**: MLflow, Weights & Biases
- **Pipelines**: Kubeflow, Airflow
- **Monitoring**: Prometheus, Grafana
- **Serving**: TensorFlow Serving, Seldon

Remember

- MLOps = DevOps + ML
- Automate everything
- Monitor in production
- Version all artifacts

#AI#Intermediate#MLOps