Recurrent Neural Networks (RNNs) for Sequences
Understand how RNNs process sequential data, their architecture, and common applications in text and time series.
Regular neural networks see each input independently. RNNs have memory - they remember what came before.
Why Sequences Need Special Treatment
Consider predicting the next word:
"The cat sat on the ___"
To predict correctly, you need context from previous words. Regular networks can't do this naturally.
How RNNs Work
RNNs have a loop that passes information from one step to the next:
```
  ┌──────────────────────────┐
  │                          │
  v                          │
[Hidden State] ──────> [Hidden State]
      ^                      ^
      │                      │
  [Input t]              [Input t+1]
```
At each time step:

1. Take the current input
2. Combine it with the previous hidden state
3. Produce an output and a new hidden state
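In equation form, using the same weight names as the NumPy implementation below (here $x_t$ is the current input and $h_{t-1}$ is the previous hidden state):

$$h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$$

$$y_t = W_{hy} h_t + b_y$$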
Simple RNN Implementation
```python
import numpy as np

class SimpleRNN:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with small random values
        self.Wxh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.Whh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (the recurrence)
        self.Why = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.bh = np.zeros((hidden_size, 1))                         # hidden bias
        self.by = np.zeros((output_size, 1))                         # output bias

    def forward(self, inputs, h_prev):
        # Combine the current input with the previous hidden state
        h = np.tanh(self.Wxh @ inputs + self.Whh @ h_prev + self.bh)
        y = self.Why @ h + self.by
        return y, h
```
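A minimal usage sketch of the class above, looping over a short sequence; the sizes and random inputs are illustrative assumptions, not part of the lesson:

```python
# Illustrative only: sizes and inputs are made up
input_size, hidden_size, output_size = 10, 16, 2
rnn = SimpleRNN(input_size, hidden_size, output_size)

h = np.zeros((hidden_size, 1))                              # initial hidden state
sequence = [np.random.randn(input_size, 1) for _ in range(5)]

for x_t in sequence:
    y_t, h = rnn.forward(x_t, h)                            # h carries context to the next step

print(y_t.shape)  # (2, 1): output at the final time step
```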
Using Keras for RNNs
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding

# Text classification with an RNN
model = Sequential([
    Embedding(vocab_size, 128, input_length=max_length),
    SimpleRNN(64, return_sequences=False),
    Dense(1, activation='sigmoid')
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(X_train, y_train, epochs=10, batch_size=32)
```
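The snippet above assumes `vocab_size`, `max_length`, `X_train`, and `y_train` already exist. Here is one way they might be prepared with Keras' tokenizer and padding utilities (the example texts and labels are made up):

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical raw data -- replace with your own texts and labels
texts = ["great movie, loved it", "terrible plot, waste of time"]
labels = [1, 0]

vocab_size = 10000
max_length = 100

tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)          # words -> integer ids
X_train = pad_sequences(sequences, maxlen=max_length)    # pad/truncate to a fixed length
y_train = np.array(labels)
```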
The Vanishing Gradient Problem
Simple RNNs struggle with long sequences. During backpropagation through time, gradients either:

- **Vanish:** become tiny, so learning stops
- **Explode:** become huge, so training is unstable
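To see why, here is an illustrative NumPy sketch (not from the lesson, and it keeps the hidden state fixed for simplicity). Backpropagating through a tanh RNN multiplies the gradient by $W_{hh}^\top \, \mathrm{diag}(1 - h^2)$ at every step, so with small weights the gradient collapses; with large weights it can blow up instead:

```python
import numpy as np

np.random.seed(0)
hidden_size = 64
Whh = np.random.randn(hidden_size, hidden_size) * 0.01   # small init, as in the code above

h = np.tanh(np.random.randn(hidden_size, 1))             # a fixed hidden state, for illustration
grad = np.ones((hidden_size, 1))                          # gradient arriving at the last time step

for step in range(1, 51):
    grad = Whh.T @ ((1 - h ** 2) * grad)                  # one step of backprop through tanh
    if step % 10 == 0:
        print(f"step {step}: gradient norm = {np.linalg.norm(grad):.2e}")
# The norm shrinks toward zero -- the vanishing gradient problem.
```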
This limits how far back RNNs can remember. Solution? LSTM and GRU (covered next).
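As a preview, the Keras text-classification model above only needs a one-layer change to use an LSTM (a sketch; LSTM and GRU themselves are covered in the next section):

```python
from tensorflow.keras.layers import LSTM

# Same model as before, with the SimpleRNN layer swapped for an LSTM layer
model = Sequential([
    Embedding(vocab_size, 128, input_length=max_length),
    LSTM(64),
    Dense(1, activation='sigmoid')
])
```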
Common Applications
| Task | Description |
|------|-------------|
| Text Classification | Sentiment analysis, spam detection |
| Language Modeling | Predict the next word |
| Machine Translation | Sequence-to-sequence |
| Time Series | Stock prices, weather |
| Speech Recognition | Audio to text |
Bidirectional RNNs
Sometimes you need context from both directions:
"I love this ___. It's so ___."
```python
from tensorflow.keras.layers import Bidirectional

model = Sequential([
    Embedding(vocab_size, 128),
    Bidirectional(SimpleRNN(64)),  # Processes the sequence forward and backward
    Dense(1, activation='sigmoid')
])
```
Key Takeaway
RNNs process sequences by maintaining a hidden state that passes information across time steps. They're foundational for text, time series, and any other sequential data. However, vanilla RNNs struggle with long sequences due to vanishing gradients. Use LSTM or GRU for better long-term memory.