Model Deployment Basics
Deploy ML models to production so they are available to real users.
Why Deploy?
A model in a notebook ≠ value. A model in production = real impact!
Save Your Model
```python
import joblib
from sklearn.ensemble import RandomForestClassifier

# Train the model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Save the model to disk
joblib.dump(model, 'model.pkl')

# Load the model and make a prediction
loaded_model = joblib.load('model.pkl')
prediction = loaded_model.predict(X_new)
```
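One caveat: a pickled model is tied to the library version that produced it. Here is a minimal sketch of recording that version alongside the model so mismatches are caught at load time (the `model_meta.pkl` filename is just an illustration):

```python
import joblib
import sklearn

# Record the training-time library version next to the model
# ('model_meta.pkl' is an illustrative filename)
meta = {'model_file': 'model.pkl', 'sklearn_version': sklearn.__version__}
joblib.dump(meta, 'model_meta.pkl')

# At load time, warn on a version mismatch
meta = joblib.load('model_meta.pkl')
if meta['sklearn_version'] != sklearn.__version__:
    print(f"Warning: trained with scikit-learn {meta['sklearn_version']}, "
          f"running {sklearn.__version__}")
```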
Flask API
Create a web API for your model:
```python
from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# Load the model once at startup
model = joblib.load('model.pkl')

@app.route('/predict', methods=['POST'])
def predict():
    # Get data from the request body
    data = request.json
    features = [[data['age'], data['income']]]
    # Make a prediction
    prediction = model.predict(features)[0]
    return jsonify({'prediction': int(prediction)})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
```
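The route above assumes well-formed input and will crash on anything else. Here is a hedged sketch of a more defensive drop-in replacement (the error response format is a choice, not a Flask requirement):

```python
# Drop-in replacement for the route above; assumes the same
# Flask app, model, request, and jsonify are in scope.
@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json(silent=True)
    if data is None or 'age' not in data or 'income' not in data:
        return jsonify({'error': "expected JSON with 'age' and 'income'"}), 400
    try:
        prediction = model.predict([[data['age'], data['income']]])[0]
    except Exception as exc:
        # Surface model failures as a 500 instead of crashing the worker
        return jsonify({'error': str(exc)}), 500
    return jsonify({'prediction': int(prediction)})
```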
Test API
```python
import requests

# Send a prediction request
response = requests.post('http://localhost:5000/predict',
                         json={'age': 30, 'income': 50000})

result = response.json()
print(f"Prediction: {result['prediction']}")
```
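In real client code, check the HTTP status before parsing the body. A small variant of the request above (the 5-second timeout is an arbitrary choice):

```python
import requests

response = requests.post(
    'http://localhost:5000/predict',
    json={'age': 30, 'income': 50000},
    timeout=5,  # fail fast instead of hanging if the service is down
)
response.raise_for_status()  # raises on 4xx/5xx responses
print(f"Prediction: {response.json()['prediction']}")
```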
FastAPI (Modern Alternative)
```python
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load('model.pkl')

class PredictionInput(BaseModel):
    age: int
    income: float

@app.post('/predict')
def predict(input_data: PredictionInput):
    features = [[input_data.age, input_data.income]]
    prediction = model.predict(features)[0]
    return {'prediction': int(prediction)}

# Run with: uvicorn main:app --reload
```
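One advantage of FastAPI is that endpoints can be exercised without a running server. A small sketch using its built-in test client (assumes the code above is saved as `main.py` and that `model.pkl` exists so the import succeeds):

```python
from fastapi.testclient import TestClient

from main import app  # assumes the FastAPI code above lives in main.py

client = TestClient(app)

def test_predict():
    response = client.post('/predict', json={'age': 30, 'income': 50000})
    assert response.status_code == 200
    assert 'prediction' in response.json()
```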
Docker Container
```dockerfile
# Dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY model.pkl .
COPY app.py .

CMD ["python", "app.py"]
```
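The Dockerfile expects a requirements.txt next to it. An illustrative one, with placeholder pins you should match to your own training environment:

```text
# Pin versions to match the environment the model was trained in
flask==3.0.*
scikit-learn==1.4.*
joblib==1.4.*
```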
```bash
# Build and run
docker build -t ml-model .
docker run -p 5000:5000 ml-model
```
Cloud Deployment
**AWS SageMaker**:
```python
import sagemaker
from sagemaker.sklearn import SKLearnModel

# role is your SageMaker execution role ARN
model = SKLearnModel(
    model_data='s3://bucket/model.tar.gz',
    role=role,
    entry_point='inference.py',
    framework_version='1.2-1'  # a supported scikit-learn container version
)

# Deploy to a real-time endpoint
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large'
)
prediction = predictor.predict(data)
```
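Note that a real-time endpoint bills for as long as it is running; tear it down when you are done:

```python
# Delete the endpoint to stop incurring charges
predictor.delete_endpoint()
```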
**Google Cloud AI Platform**:
```bash
gcloud ai-platform models create my_model
gcloud ai-platform versions create v1 \
  --model my_model \
  --origin gs://bucket/model \
  --runtime-version 2.3
```
Monitoring
Track model performance:
```python
import logging

logging.basicConfig(level=logging.INFO)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.json
    prediction = model.predict([data['features']])
    # Log input and prediction for monitoring
    logging.info(f"Input: {data['features']}, Prediction: {prediction}")
    return jsonify({'prediction': int(prediction[0])})
```
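Beyond inputs and outputs, latency is usually worth logging too. A sketch using only the standard library, extending the route above:

```python
import logging
import time

# Extends the monitoring route above; assumes the same app,
# model, request, and jsonify are in scope.
@app.route('/predict', methods=['POST'])
def predict():
    start = time.perf_counter()
    data = request.json
    prediction = model.predict([data['features']])
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Log latency alongside input/output so slow requests stand out
    logging.info(
        f"Input: {data['features']}, Prediction: {prediction}, "
        f"latency_ms={elapsed_ms:.1f}"
    )
    return jsonify({'prediction': int(prediction[0])})
```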
Best Practices
1. Version your models
2. Monitor predictions
3. Handle errors gracefully
4. Add authentication
5. Use caching for frequent queries (see the sketch below)
6. Set up a CI/CD pipeline
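For item 5, if the same feature vectors recur often, a small in-process cache can skip redundant model calls. A minimal sketch with `functools.lru_cache` (assumes the loaded `model` is in scope; cache keys must be hashable, hence the tuple):

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def cached_predict(features: tuple) -> int:
    # lru_cache requires hashable arguments, so features is a tuple
    return int(model.predict([list(features)])[0])

# Repeated identical inputs now hit the cache instead of the model:
# cached_predict((30, 50000))
```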
Remember
- Start with a simple Flask API
- Use Docker for consistency
- Monitor model performance
- Plan for model updates