Recurrent Neural Networks (RNNs) for Sequences
Understand how RNNs process sequential data, their architecture, and common applications in text and time series.
Regular neural networks see each input independently. RNNs have memory - they remember what came before.
Why Sequences Need Special Treatment
Consider predicting the next word:
"The cat sat on the ___"
To predict correctly, you need context from previous words. Regular networks can't do this naturally.
How RNNs Work
RNNs have a loop that passes information from one step to the next:
```
  ┌──────────────────────────┐
  │                          │
  v                          │
[Hidden State] ──────> [Hidden State]
      ^                      ^
      │                      │
  [Input t]              [Input t+1]
```
At each time step:

1. Take the current input
2. Combine it with the previous hidden state
3. Produce an output and a new hidden state
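In equation form, using the same weight names as the NumPy implementation below (here $x_t$ is the current input and $h_{t-1}$ is the previous hidden state):

$$h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$$

$$y_t = W_{hy} h_t + b_y$$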
Simple RNN Implementation
```python
import numpy as np

class SimpleRNN:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with small random values
        self.Wxh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.Whh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (the recurrence)
        self.Why = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.bh = np.zeros((hidden_size, 1))                         # hidden bias
        self.by = np.zeros((output_size, 1))                         # output bias

    def forward(self, inputs, h_prev):
        # Combine the current input with the previous hidden state
        h = np.tanh(self.Wxh @ inputs + self.Whh @ h_prev + self.bh)
        y = self.Why @ h + self.by
        return y, h
```
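A minimal usage sketch of the class above, looping over a short sequence; the sizes and random inputs are illustrative assumptions, not part of the lesson:

```python
# Illustrative only: sizes and inputs are made up
input_size, hidden_size, output_size = 10, 16, 2
rnn = SimpleRNN(input_size, hidden_size, output_size)

h = np.zeros((hidden_size, 1))                              # initial hidden state
sequence = [np.random.randn(input_size, 1) for _ in range(5)]

for x_t in sequence:
    y_t, h = rnn.forward(x_t, h)                            # h carries context to the next step

print(y_t.shape)  # (2, 1): output at the final time step
```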
Using Keras for RNNs
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding

# Text classification with an RNN
model = Sequential([
    Embedding(vocab_size, 128, input_length=max_length),
    SimpleRNN(64, return_sequences=False),
    Dense(1, activation='sigmoid')
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(X_train, y_train, epochs=10, batch_size=32)
```
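The snippet above assumes `vocab_size`, `max_length`, `X_train`, and `y_train` already exist. Here is one way they might be prepared with Keras' tokenizer and padding utilities (the example texts and labels are made up):

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical raw data -- replace with your own texts and labels
texts = ["great movie, loved it", "terrible plot, waste of time"]
labels = [1, 0]

vocab_size = 10000
max_length = 100

tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)          # words -> integer ids
X_train = pad_sequences(sequences, maxlen=max_length)    # pad/truncate to a fixed length
y_train = np.array(labels)
```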
The Vanishing Gradient Problem
Simple RNNs struggle with long sequences. During backpropagation through time, gradients either:

- **Vanish:** become tiny, so learning stops
- **Explode:** become huge, so training is unstable
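To see why, here is an illustrative NumPy sketch (not from the lesson, and it keeps the hidden state fixed for simplicity). Backpropagating through a tanh RNN multiplies the gradient by $W_{hh}^\top \, \mathrm{diag}(1 - h^2)$ at every step, so with small weights the gradient collapses; with large weights it can blow up instead:

```python
import numpy as np

np.random.seed(0)
hidden_size = 64
Whh = np.random.randn(hidden_size, hidden_size) * 0.01   # small init, as in the code above

h = np.tanh(np.random.randn(hidden_size, 1))             # a fixed hidden state, for illustration
grad = np.ones((hidden_size, 1))                          # gradient arriving at the last time step

for step in range(1, 51):
    grad = Whh.T @ ((1 - h ** 2) * grad)                  # one step of backprop through tanh
    if step % 10 == 0:
        print(f"step {step}: gradient norm = {np.linalg.norm(grad):.2e}")
# The norm shrinks toward zero -- the vanishing gradient problem.
```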
This limits how far back RNNs can remember. Solution? LSTM and GRU (covered next).
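As a preview, the Keras text-classification model above only needs a one-layer change to use an LSTM (a sketch; LSTM and GRU themselves are covered in the next section):

```python
from tensorflow.keras.layers import LSTM

# Same model as before, with the SimpleRNN layer swapped for an LSTM layer
model = Sequential([
    Embedding(vocab_size, 128, input_length=max_length),
    LSTM(64),
    Dense(1, activation='sigmoid')
])
```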
Common Applications
| Task | Description |
|------|-------------|
| Text Classification | Sentiment analysis, spam detection |
| Language Modeling | Predict the next word |
| Machine Translation | Sequence-to-sequence |
| Time Series | Stock prices, weather |
| Speech Recognition | Audio to text |
Bidirectional RNNs
Sometimes you need context from both directions:
"I love this ___. It's so ___."
```python
from tensorflow.keras.layers import Bidirectional

model = Sequential([
    Embedding(vocab_size, 128),
    Bidirectional(SimpleRNN(64)),  # Processes the sequence forward and backward
    Dense(1, activation='sigmoid')
])
```
Key Takeaway
RNNs process sequences by maintaining a hidden state that passes information across time steps. They're foundational for text, time series, and any other sequential data. However, vanilla RNNs struggle with long sequences due to vanishing gradients. Use LSTM or GRU for better long-term memory.