AI · 6 min read

Anomaly Detection

Find unusual patterns in data.

Robert Anderson
December 18, 2025

Spot the unusual.

What is Anomaly Detection?

Finding data points that don't fit the pattern.

**Examples**:
- Credit card fraud
- Manufacturing defects
- Network intrusion
- Equipment failure

Statistical Method - Z-Score

```python
import numpy as np

def detect_anomalies(data, threshold=3):
    mean = np.mean(data)
    std = np.std(data)
    anomalies = []
    for i, value in enumerate(data):
        z_score = (value - mean) / std
        if abs(z_score) > threshold:
            anomalies.append(i)
    return anomalies

# Find anomalies. In a sample this small the outlier inflates the std,
# so its z-score is only about 2.2 -- a threshold of 3 would miss it.
sales = [100, 105, 98, 102, 500, 97]  # 500 is an anomaly
anomalies = detect_anomalies(sales, threshold=2)
print(f"Anomalies at: {anomalies}")  # [4]
```
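One caveat with the plain z-score: it uses the mean and standard deviation, which the anomalies themselves inflate (here the single outlier drags the std up so far that its own z-score is only about 2.2). A common robust alternative is the modified z-score based on the median and MAD; the sketch below is one way to write it (the function name is my own):

```python
import numpy as np

def detect_anomalies_mad(data, threshold=3.5):
    # Modified z-score: median and MAD are barely affected by outliers,
    # so the anomaly cannot hide by inflating the scale estimate.
    data = np.asarray(data, dtype=float)
    median = np.median(data)
    mad = np.median(np.abs(data - median))
    modified_z = 0.6745 * (data - median) / mad  # 0.6745 = consistency constant
    return [i for i, z in enumerate(modified_z) if abs(z) > threshold]

sales = [100, 105, 98, 102, 500, 97]
print(detect_anomalies_mad(sales))  # [4]
```

Here the modified z-score of the outlier is roughly 67 instead of 2.2, so the default cutoff of 3.5 catches it easily.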

Isolation Forest

Popular ML method:

```python
from sklearn.ensemble import IsolationForest

# Train on (mostly) normal data
model = IsolationForest(contamination=0.1)  # expect ~10% anomalies
model.fit(X_train)

# Predict: -1 = anomaly, 1 = normal
predictions = model.predict(X_test)

anomalies = X_test[predictions == -1]
print(f"Found {len(anomalies)} anomalies")
```
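Beyond the hard -1/1 labels, `IsolationForest` also exposes `decision_function`, which returns a continuous score (lower = more anomalous) that is handy for ranking alerts instead of applying a fixed cutoff. A minimal sketch on synthetic data (the random dataset and test points are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X_train = rng.normal(0, 1, size=(300, 2))  # synthetic "normal" cluster (assumption)

model = IsolationForest(contamination=0.1, random_state=0)
model.fit(X_train)

# decision_function: continuous anomaly score, lower = more anomalous.
X_test = np.array([[0.0, 0.0], [6.0, -6.0]])
scores = model.decision_function(X_test)
print(scores)  # the far-away point gets the lower score
```

Ranking by score lets an analyst review the top-N most suspicious cases first, rather than trusting the `contamination` cutoff.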

One-Class SVM

```python
from sklearn.svm import OneClassSVM

# Train only on normal data
model = OneClassSVM(nu=0.1)  # nu ~ expected anomaly rate
model.fit(X_normal)

# Detect anomalies in new data: -1 = anomaly, 1 = normal
predictions = model.predict(X_test)
```

Autoencoder Approach

Neural network that reconstructs input:

```python
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# Build autoencoder: compress to 16 dimensions, then reconstruct
input_layer = Input(shape=(features,))
encoded = Dense(32, activation='relu')(input_layer)
encoded = Dense(16, activation='relu')(encoded)
decoded = Dense(32, activation='relu')(encoded)
decoded = Dense(features, activation='linear')(decoded)

autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adam', loss='mse')

# Train on normal data to reconstruct itself
autoencoder.fit(X_normal, X_normal, epochs=50)

# Anomalies = high reconstruction error
reconstructed = autoencoder.predict(X_test)
errors = np.mean(np.abs(X_test - reconstructed), axis=1)

threshold = np.percentile(errors, 95)
anomalies = errors > threshold
```

Local Outlier Factor

Finds points far from neighbors:

```python
from sklearn.neighbors import LocalOutlierFactor

lof = LocalOutlierFactor(n_neighbors=20)
predictions = lof.fit_predict(X)  # -1 = anomaly, 1 = normal
```
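One caveat worth knowing: with default settings, `LocalOutlierFactor` only scores the data it was fit on via `fit_predict`. To score new, unseen points, construct it with `novelty=True` and call `predict`. A minimal sketch on synthetic data (the dataset and test points are assumptions for illustration):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X_normal = rng.normal(0, 1, size=(200, 2))  # synthetic "normal" data (assumption)

# novelty=True enables predict() on points not seen during fit
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_normal)

X_new = np.array([[0.1, -0.2], [8.0, 8.0]])
print(lof.predict(X_new))  # -1 = anomaly, 1 = normal
```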

Real Example - Credit Card Fraud

```python
from sklearn.ensemble import IsolationForest

# Transaction features: amount, location, time, etc.
model = IsolationForest(contamination=0.01)  # assume ~1% fraud
model.fit(transactions)

# Check a new transaction
new_transaction = [[250, 2, 1545]]  # [amount, location_id, hour]
is_fraud = model.predict(new_transaction)

if is_fraud[0] == -1:
    print("⚠️ Potential fraud detected!")
```

Evaluation

Evaluation is hard because anomalies are rare: a model that predicts "normal" for everything scores high accuracy while catching nothing.

Use precision, recall, and F1-score on a labeled test set instead.
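With a labeled test set, these metrics can be computed directly once the detector's -1/1 output is mapped to 1/0 anomaly labels. A minimal sketch with made-up labels (the arrays below are illustrative, not real data):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Ground truth (1 = anomaly) and raw detector output (-1 = anomaly)
y_true = [0, 0, 0, 1, 0, 1, 0, 0]
raw    = [1, 1, -1, -1, 1, -1, 1, 1]
y_pred = [1 if p == -1 else 0 for p in raw]  # map -1/1 to 1/0

print(f"Precision: {precision_score(y_true, y_pred):.2f}")  # 0.67 (2 TP, 1 FP)
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")     # 1.00 (0 FN)
print(f"F1:        {f1_score(y_true, y_pred):.2f}")         # 0.80
```

Note the asymmetry: recall is perfect here but one of three alerts is a false alarm, which is the typical shape of anomaly-detection results.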

Remember

- Need mostly normal data for training
- Isolation Forest is often the best first choice
- High false-positive rates are common
- Combine with business rules
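The last point can be made concrete: one common pattern is to require the model and a simple rule to agree before raising an alert, trading some recall for fewer false positives. The threshold and function name below are hypothetical illustrations, not a standard API:

```python
# Hypothetical business rule: large transactions get extra scrutiny (assumption)
HIGH_AMOUNT = 1000

def flag_transaction(model_prediction, amount):
    model_says_anomaly = model_prediction == -1  # detector output convention
    rule_says_risky = amount > HIGH_AMOUNT
    # Alert only when both agree, cutting false positives at the cost of recall
    return model_says_anomaly and rule_says_risky

print(flag_transaction(-1, 2500))  # True
print(flag_transaction(-1, 50))    # False
```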

#AI#Intermediate#Anomaly