Naive Bayes Classifier Explained
Learn Naive Bayes: a simple but powerful probabilistic classifier based on Bayes' theorem.
Naive Bayes is fast, simple, and surprisingly effective. It's based on probability theory and a "naive" assumption.
The Intuition
You get an email containing the words "free", "winner", and "click".
What's the probability it's spam?
Naive Bayes calculates:
P(spam | these words) vs P(not spam | these words)
Whichever is higher wins!
Bayes' Theorem
P(A|B) = P(B|A) × P(A) / P(B)
For classification:
P(class|features) ∝ P(features|class) × P(class)
- P(class): Prior probability (how common is spam?)
- P(features|class): Likelihood (how common are these features in spam?)
- P(class|features): Posterior (what we want to know!)
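To make the pieces concrete, here is a tiny numeric sketch that plugs made-up numbers into the formula (the priors and likelihoods below are hypothetical, not taken from the spam example later on):
# Hypothetical numbers: 30% of all mail is spam,
# and "free" appears in 60% of spam but only 5% of ham
p_spam, p_ham = 0.3, 0.7
p_free_given_spam = 0.60
p_free_given_ham = 0.05
# Unnormalized posteriors: P(class | "free") ∝ P("free" | class) × P(class)
spam_score = p_free_given_spam * p_spam   # 0.18
ham_score = p_free_given_ham * p_ham      # 0.035
# Divide by P("free") = spam_score + ham_score to get a real probability
p_spam_given_free = spam_score / (spam_score + ham_score)
print(f"P(spam | 'free') = {p_spam_given_free:.2f}")  # ≈ 0.84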
The "Naive" Assumption
Naive Bayes assumes features are independent given the class.
P(free, winner, click | spam) = P(free|spam) × P(winner|spam) × P(click|spam)
This is "naive" because features often ARE related. But it works anyway!
Example: Spam Classification
Training data:
Email 1: "free money now" → Spam
Email 2: "meeting tomorrow" → Not Spam
Email 3: "free gift winner" → Spam
Email 4: "project update" → Not Spam
Calculate probabilities:
P(spam) = 2/4 = 0.5
P(not spam) = 2/4 = 0.5
P(free | spam) = 2/2 = 1.0 (appears in both spam emails)
P(free | not spam) = 0/2 = 0.0
P(winner | spam) = 1/2 = 0.5
P(winner | not spam) = 0/2 = 0.0
New email: "free winner"
P(spam | free, winner) ∝ 0.5 × 1.0 × 0.5 = 0.25
P(not spam | free, winner) ∝ 0.5 × 0.0 × 0.0 = 0.0
Verdict: SPAM!
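The same hand calculation, written as a short Python sketch (per-word probabilities are the fraction of emails in each class that contain the word, exactly as above):
# Training emails from the example above
spam_emails = ["free money now", "free gift winner"]
ham_emails = ["meeting tomorrow", "project update"]
def word_prob(word, emails):
    # Fraction of emails in this class that contain the word
    return sum(word in email.split() for email in emails) / len(emails)
p_spam, p_ham = 0.5, 0.5
spam_score, ham_score = p_spam, p_ham
for word in ["free", "winner"]:
    spam_score *= word_prob(word, spam_emails)
    ham_score *= word_prob(word, ham_emails)
print(spam_score, ham_score)  # 0.25 0.0 → classified as spam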
Types of Naive Bayes
1. Gaussian Naive Bayes
For continuous features. Assumes normal distribution.
from sklearn.naive_bayes import GaussianNB
model = GaussianNB()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
2. Multinomial Naive Bayes
For discrete counts (word frequencies).
from sklearn.naive_bayes import MultinomialNB
# Great for text classification!
model = MultinomialNB()
model.fit(X_train_counts, y_train)
3. Bernoulli Naive Bayes
For binary features (word present/absent).
from sklearn.naive_bayes import BernoulliNB
model = BernoulliNB()
model.fit(X_train_binary, y_train)
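One way to build such binary features from raw text is to ask CountVectorizer for presence/absence instead of counts; a minimal sketch (the texts and labels here are just illustrative):
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB
texts = ["free money now", "meeting tomorrow"]
labels = ["spam", "ham"]
# binary=True records word presence/absence instead of counts
binary_vectorizer = CountVectorizer(binary=True)
X_train_binary = binary_vectorizer.fit_transform(texts)
model = BernoulliNB()
model.fit(X_train_binary, labels)
BernoulliNB can also binarize count inputs itself via its binarize parameter, but making the 0/1 encoding explicit in the vectorizer keeps the pipeline easier to read.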
Text Classification Example
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
# Sample data
texts = [
"free money click now",
"meeting at 3pm tomorrow",
"winner free gift claim",
"project deadline friday",
"cheap pills online",
"team lunch next week"
]
labels = ['spam', 'ham', 'spam', 'ham', 'spam', 'ham']
# Convert text to word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
# Split
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=42, stratify=labels)  # stratify keeps both classes in the tiny training set
# Train
model = MultinomialNB()
model.fit(X_train, y_train)
# Predict new email
new_email = vectorizer.transform(["free winner prize"])
prediction = model.predict(new_email)
probability = model.predict_proba(new_email)
print(f"Prediction: {prediction[0]}")
print(f"Probabilities: {probability}")
Laplace Smoothing
What if a word never appears in spam during training?
P(word | spam) = 0 → Everything multiplies to 0!
Solution: Add a small count (Laplace smoothing):
# alpha = smoothing parameter (default 1.0)
model = MultinomialNB(alpha=1.0)
P(word | spam) = (count + α) / (total words in spam + α × vocabulary_size), where α = 1 gives classic Laplace smoothing
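A small sketch of what smoothing does in practice (toy counts, illustrative only): with alpha near zero, a word never seen in spam gets probability ≈ 0 and vetoes the class; with alpha = 1 it keeps a small nonzero probability.
import numpy as np
from sklearn.naive_bayes import MultinomialNB
# Toy count matrix: columns = ["free", "meeting"], one row per email
X = np.array([[2, 0],   # spam email: "free free"
              [0, 3]])  # ham email:  "meeting meeting meeting"
y = ["spam", "ham"]
for alpha in [1e-10, 1.0]:
    model = MultinomialNB(alpha=alpha).fit(X, y)
    spam_idx = list(model.classes_).index("spam")
    # P("free" | spam) and P("meeting" | spam)
    print(f"alpha={alpha}: {np.exp(model.feature_log_prob_[spam_idx]).round(3)}")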
Pros and Cons
Pros ✅
- Very fast: Training and prediction are quick
- Handles many features: Scales well with high dimensions
- Works with small data: Doesn't need much training data
- Good baseline: Often surprisingly competitive
- Probabilistic: Gives probability estimates
Cons ❌
- Independence assumption: Often violated in practice
- Zero frequency problem: Needs smoothing
- Continuous features: Assumes Gaussian (may not be true)
When to Use Naive Bayes
Great for:
- Text classification (spam, sentiment, topic)
- Real-time prediction (very fast)
- Multi-class problems
- When you have little training data
Less suitable for:
- Complex feature interactions
- When independence assumption is badly violated
Comparison with Other Classifiers
| Aspect | Naive Bayes | Logistic Regression | SVM |
|---|---|---|---|
| Speed | Very fast | Fast | Slower |
| Training data needed | Less | Medium | More |
| Feature independence | Assumes yes | No assumption | No assumption |
| Interpretability | Good | Good | Poor |
Key Takeaways
- Based on Bayes' theorem - calculates P(class|features)
- "Naive" assumption - features are independent (often wrong, still works!)
- Three types: Gaussian, Multinomial, Bernoulli
- Great for text - spam filtering, sentiment analysis
- Fast and simple - excellent baseline model
Despite its simplicity, Naive Bayes often performs surprisingly well, especially for text classification. Always try it as a baseline!