# Introduction to Neural Networks
Understand the basics of neural networks - neurons, layers, activation functions, and how they learn.
Neural networks are loosely inspired by the brain, but don't think of them as artificial brains: they're mathematical functions that learn patterns from examples.
## The Building Block: The Neuron

A single neuron:
- Takes inputs (x₁, x₂, ...)
- Multiplies each by a weight (w₁, w₂, ...)
- Adds them up plus a bias (b)
- Applies an activation function
`output = activation(w₁x₁ + w₂x₂ + ... + b)`
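In code, a neuron is just a dot product plus a bias, passed through a non-linearity. A minimal NumPy sketch (the weights, inputs, and bias are made-up numbers for illustration):

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of inputs plus bias, then a ReLU activation."""
    z = np.dot(w, x) + b   # w₁x₁ + w₂x₂ + ... + b
    return max(0.0, z)     # ReLU: pass positive values, zero out negatives

print(neuron(x=np.array([0.5, -1.0]), w=np.array([0.8, 0.2]), b=0.1))  # ≈ 0.3
```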
## Network Architecture
Neurons are organized in layers:
Input Layer → Hidden Layer(s) → Output Layer
```
●       ●
●   →   ●   →   ●
●       ●
```
- **Input Layer**: Your features
- **Hidden Layers**: Where learning happens
- **Output Layer**: Predictions
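In code, a whole layer is one matrix multiplication followed by an activation, so stacking layers just chains these operations. A sketch in NumPy with made-up sizes (3 inputs → 4 hidden neurons → 2 outputs):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=3)                 # input layer: 3 features

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
hidden = np.maximum(0, W1 @ x + b1)    # hidden layer: 4 neurons, ReLU

W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)
output = W2 @ hidden + b2              # output layer: 2 raw scores
print(output.shape)                    # (2,)
```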
## Activation Functions

Without activation functions, stacked linear layers collapse into a single linear transformation, so the network could only ever fit a linear model. Activation functions add the non-linearity that lets networks learn complex patterns.

### Common Activations
| Function | Formula | Use |
|---|---|---|
| ReLU | max(0, x) | Hidden layers (default) |
| Sigmoid | 1/(1+e⁻ˣ) | Binary output (0-1) |
| Softmax | eˣ/Σeˣ | Multi-class output |
| Tanh | (eˣ-e⁻ˣ)/(eˣ+e⁻ˣ) | Hidden layers |
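The formulas above translate almost directly into NumPy; a quick sketch (softmax subtracts the max before exponentiating, a standard numerical-stability trick):

```python
import numpy as np

def relu(x):    return np.maximum(0, x)
def sigmoid(x): return 1 / (1 + np.exp(-x))
def tanh(x):    return np.tanh(x)

def softmax(x):
    e = np.exp(x - np.max(x))  # shift by the max for numerical stability
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(relu(z))     # [0. 0. 2.]
print(sigmoid(z))  # values squashed into (0, 1)
print(softmax(z))  # non-negative, sums to 1
```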
## How Networks Learn

1. **Forward Pass**: Input flows through the network and produces a prediction
2. **Calculate Loss**: Measure how wrong the prediction was
3. **Backward Pass**: Work out how much each weight contributed to the error
4. **Update Weights**: Nudge each weight in the direction that reduces the error
5. **Repeat**: Many times, over many examples

Steps 3 and 4 together are called backpropagation with gradient descent.
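To see the whole loop in miniature, here is a sketch of gradient descent on the smallest possible "network": one weight and one training example. The numbers are made up; the point is the update rule `weight -= learning_rate * gradient`.

```python
# Fit y = w * x to a single example (x=2, y=6), so the ideal w is 3
x, y = 2.0, 6.0
w = 0.0      # initial weight
lr = 0.1     # learning rate

for step in range(20):
    pred = w * x                # 1. forward pass
    loss = (pred - y) ** 2      # 2. loss: squared error
    grad = 2 * (pred - y) * x   # 3. backward pass: d(loss)/dw
    w -= lr * grad              # 4. update the weight
print(w)  # ≈ 3.0 after a few iterations
```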
## Simple Implementation
```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Example data (digits is just a stand-in; any X, y works)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Create neural network
nn = MLPClassifier(
    hidden_layer_sizes=(100, 50),  # two hidden layers: 100 and 50 neurons
    activation='relu',
    max_iter=500,
    random_state=42,
)

nn.fit(X_train, y_train)
accuracy = nn.score(X_test, y_test)
```
## With PyTorch (More Control)
```python
import torch
import torch.nn as nn

class SimpleNet(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        # Two hidden layers (100 and 50 neurons) with ReLU in between
        self.layers = nn.Sequential(
            nn.Linear(input_size, 100),
            nn.ReLU(),
            nn.Linear(100, 50),
            nn.ReLU(),
            nn.Linear(50, num_classes),
        )

    def forward(self, x):
        return self.layers(x)
```
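To sanity-check the shapes, a quick usage sketch (the sizes are made up; this reuses the imports and class from the block above):

```python
model = SimpleNet(input_size=20, num_classes=3)
x = torch.randn(8, 20)   # a batch of 8 samples, 20 features each
logits = model(x)        # forward pass
print(logits.shape)      # torch.Size([8, 3]): one score per class
```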
## Key Concepts

- **Epochs**: One full pass through the entire training data
- **Batch Size**: Number of samples processed before each weight update
- **Learning Rate**: How big each weight update is
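All three knobs appear directly in a typical PyTorch training loop. A minimal sketch that continues the SimpleNet example above, with random tensors standing in for real data:

```python
from torch.utils.data import DataLoader, TensorDataset

# Illustrative random data: 256 samples, 20 features, 3 classes
data = TensorDataset(torch.randn(256, 20), torch.randint(0, 3, (256,)))
loader = DataLoader(data, batch_size=32)   # batch size: samples per weight update

model = SimpleNet(input_size=20, num_classes=3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # learning rate
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                     # epoch: one full pass through the data
    for xb, yb in loader:
        loss = loss_fn(model(xb), yb)      # forward pass + loss
        optimizer.zero_grad()
        loss.backward()                    # backward pass: compute gradients
        optimizer.step()                   # update weights
```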
## When to Use Neural Networks

**Good for:**
- Image, text, audio data
- Large datasets
- Complex patterns
- When you have a GPU
**Consider alternatives when:**
- The dataset is small (neural networks will overfit)
- The data is tabular (tree-based methods are often better)
- Interpretability is needed
- Compute is limited
## Key Takeaway

Neural networks learn by adjusting weights to minimize prediction error. Start simple (one or two hidden layers), use ReLU activation, and make sure you have enough data. For tabular data, try gradient boosting first; neural networks shine on unstructured data like images and text.