AI · 7 min read

Model Deployment Basics

Deploy ML models to production.

Robert Anderson
December 18, 2025

Deployment is how you make a trained model available to users and applications.

Why Deploy?

A model in a notebook ≠ value. A model in production = real impact!

Save Your Model

```python
import joblib
from sklearn.ensemble import RandomForestClassifier

# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Save model
joblib.dump(model, 'model.pkl')

# Load model
loaded_model = joblib.load('model.pkl')
prediction = loaded_model.predict(X_new)
```

Flask API

Create a web API for your model:

```python
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# Load model at startup
model = joblib.load('model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    # Get data from request
    data = request.json
    features = [[data['age'], data['income']]]
    # Make prediction
    prediction = model.predict(features)[0]
    return jsonify({'prediction': int(prediction)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```

Test API

```python
import requests

# Send request
response = requests.post(
    'http://localhost:5000/predict',
    json={'age': 30, 'income': 50000}
)

result = response.json()
print(f"Prediction: {result['prediction']}")
```
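The endpoint above trusts the request body blindly: a missing or mistyped field raises a `KeyError` and surfaces as a 500 error. As a minimal sketch, you can check fields and types before predicting (the `validate_payload` helper is hypothetical, not part of Flask):

```python
def validate_payload(data):
    """Return a list of problems with a /predict payload (empty list = valid)."""
    if not isinstance(data, dict):
        return ['payload must be a JSON object']
    errors = []
    # Expected fields and the types we accept for each.
    expected = {'age': (int,), 'income': (int, float)}
    for field, types in expected.items():
        if field not in data:
            errors.append(f'missing field: {field}')
        elif not isinstance(data[field], types):
            errors.append(f'field has wrong type: {field}')
    return errors
```

In the Flask handler you would call this first and return a 400 response with the error list when it is non-empty, instead of letting the prediction code crash.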

FastAPI (Modern Alternative)

```python
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load('model.pkl')

class PredictionInput(BaseModel):
    age: int
    income: float

@app.post('/predict')
def predict(input_data: PredictionInput):
    features = [[input_data.age, input_data.income]]
    prediction = model.predict(features)[0]
    return {'prediction': int(prediction)}

# Run with: uvicorn main:app --reload
```

Docker Container

```dockerfile
# Dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY model.pkl .
COPY app.py .

CMD ["python", "app.py"]
```
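The Dockerfile copies a `requirements.txt` that is not shown here. A plausible minimal version for the Flask app would look like the following; the exact version pins are an assumption, so pin whatever versions you actually trained and tested with:

```text
# requirements.txt (example; pin the versions you trained with)
flask
joblib
scikit-learn
```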

```bash
# Build and run
docker build -t ml-model .
docker run -p 5000:5000 ml-model
```

Cloud Deployment

**AWS SageMaker**:

```python
import sagemaker
from sagemaker.sklearn import SKLearnModel

model = SKLearnModel(
    model_data='s3://bucket/model.tar.gz',
    role=role,
    entry_point='inference.py'
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)
prediction = predictor.predict(data)
```

**Google Cloud AI Platform**:

```bash
gcloud ai-platform models create my_model
gcloud ai-platform versions create v1 \
  --model my_model \
  --origin gs://bucket/model \
  --runtime-version 2.3
```

Monitoring

Track model performance:

```python
import logging

logging.basicConfig(level=logging.INFO)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json
    prediction = model.predict([data['features']])
    # Log for monitoring
    logging.info(f"Input: {data['features']}, Prediction: {prediction}")
    return jsonify({'prediction': int(prediction[0])})
```

Best Practices

1. Version your models
2. Monitor predictions
3. Handle errors gracefully
4. Add authentication
5. Use caching for frequent queries
6. Set up a CI/CD pipeline
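Caching for frequent queries can be prototyped with the standard library's `functools.lru_cache`. In this sketch, `predict_stub` stands in for a real `model.predict` call, and its income threshold is made up for illustration:

```python
from functools import lru_cache

def predict_stub(age, income):
    # Stand-in for model.predict; a real model call replaces this.
    return 1 if income >= 40000 else 0

@lru_cache(maxsize=1024)
def cached_predict(age, income):
    # Repeated (age, income) pairs are served from the cache,
    # skipping the potentially expensive model call.
    return predict_stub(age, income)
```

This only works for hashable inputs and a deterministic model; call `cached_predict.cache_clear()` whenever you load a new model version so stale predictions are not served.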

Remember

- Start with a simple Flask API
- Use Docker for consistency
- Monitor model performance
- Plan for model updates
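Planning for model updates is easier when model files carry versions. A small sketch for picking the next version name (the `model_vN.pkl` naming scheme is an assumption for illustration, not a standard):

```python
import re

def next_model_version(existing_files):
    """Given names like 'model_v3.pkl', return the next unused version name."""
    versions = [int(m.group(1))
                for name in existing_files
                if (m := re.fullmatch(r'model_v(\d+)\.pkl', name))]
    return f'model_v{max(versions, default=0) + 1}.pkl'
```

Pair each saved version with a record of its training data and metrics, so you can roll back confidently when a new version misbehaves.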

#AI#Intermediate#Deployment