
Recurrent Neural Networks (RNNs) for Sequences

Understand how RNNs process sequential data, their architecture, and common applications in text and time series.

Sarah Chen
December 19, 2025


Regular feedforward neural networks treat each input independently. RNNs have memory - they carry a hidden state that remembers what came before.

Why Sequences Need Special Treatment

Consider predicting the next word:

"The cat sat on the ___"

To predict correctly, you need context from previous words. Regular networks can't do this naturally.

How RNNs Work

RNNs have a loop: the hidden state computed at one step is fed back in at the next step. Unrolled over time, it looks like this:

    [Input t]               [Input t+1]
        │                        │
        v                        v
  [Hidden State] ────────> [Hidden State] ────────> ...
        │                        │
        v                        v
   [Output t]              [Output t+1]

At each time step:

  1. Take current input
  2. Combine with previous hidden state
  3. Produce output and new hidden state
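In equation form, a single step of the simple RNN (using the same weight names as the implementation below) is:

    h_t = tanh(Wxh · x_t + Whh · h_(t-1) + bh)
    y_t = Why · h_t + by

Wxh connects the input to the hidden state, Whh connects the previous hidden state to the current one (this is the recurrence), and Why maps the hidden state to the output.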

Simple RNN Implementation

import numpy as np

class SimpleRNN:
    def __init__(self, input_size, hidden_size, output_size):
        # Initialize weights with small random values
        self.Wxh = np.random.randn(hidden_size, input_size) * 0.01   # input -> hidden
        self.Whh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden -> hidden (the recurrence)
        self.Why = np.random.randn(output_size, hidden_size) * 0.01  # hidden -> output
        self.bh = np.zeros((hidden_size, 1))  # hidden bias
        self.by = np.zeros((output_size, 1))  # output bias

    def forward(self, inputs, h_prev):
        # One time step: combine the current input with the previous hidden state
        h = np.tanh(self.Wxh @ inputs + self.Whh @ h_prev + self.bh)
        y = self.Why @ h + self.by
        return y, h
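
To see the recurrence in action, here is a small usage sketch that unrolls the class above over a toy sequence. The sizes and random inputs are arbitrary example values.

rnn = SimpleRNN(input_size=10, hidden_size=32, output_size=2)

h = np.zeros((32, 1))                                   # initial hidden state
sequence = [np.random.randn(10, 1) for _ in range(5)]   # 5 time steps, 10-dim inputs

for x_t in sequence:
    y, h = rnn.forward(x_t, h)   # h carries information forward to the next step

print(y.shape)  # (2, 1) - output at the final time step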

Using Keras for RNNs

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding

# Text classification with an RNN
model = Sequential([
    Embedding(vocab_size, 128, input_length=max_length),  # map word IDs to 128-d vectors
    SimpleRNN(64, return_sequences=False),                # keep only the final hidden state
    Dense(1, activation='sigmoid')                        # binary prediction
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(X_train, y_train, epochs=10, batch_size=32)
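
The snippet above assumes vocab_size, max_length, X_train, and y_train already exist. A minimal data-preparation sketch, with made-up example values and toy texts, could look like this using Keras' Tokenizer and pad_sequences:

import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # example value: keep the 10,000 most frequent words
max_length = 100     # example value: pad/truncate every text to 100 tokens

texts = ["great movie, loved it", "terrible plot and acting"]   # toy data
labels = [1, 0]                                                 # 1 = positive, 0 = negative

tokenizer = Tokenizer(num_words=vocab_size)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)         # lists of integer word IDs
X_train = pad_sequences(sequences, maxlen=max_length)   # shape: (n_samples, max_length)
y_train = np.array(labels)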

The Vanishing Gradient Problem

Simple RNNs struggle with long sequences. During backpropagation through time, the gradient is multiplied by the recurrent weight matrix (and the tanh derivative) once per time step, so over many steps gradients either:

  • Vanish: Become tiny, learning stops
  • Explode: Become huge, training unstable

This limits how far back RNNs can remember. Solution? LSTM and GRU (covered next).
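
A quick numerical sketch of the effect: the norm of the gradient shrinks or grows roughly exponentially with the number of steps. This toy example only models the repeated multiplication by the recurrent matrix, not a full backward pass.

import numpy as np

np.random.seed(0)
hidden_size, steps = 64, 50

W_small = np.random.randn(hidden_size, hidden_size) * 0.01  # small weights -> vanishing
W_large = np.random.randn(hidden_size, hidden_size) * 0.5   # large weights -> exploding

g_small = np.ones((hidden_size, 1))
g_large = np.ones((hidden_size, 1))

for _ in range(steps):              # "backprop" through 50 time steps
    g_small = W_small.T @ g_small
    g_large = W_large.T @ g_large

print(np.linalg.norm(g_small))      # vanishingly small - learning stops
print(np.linalg.norm(g_large))      # astronomically large - training becomes unstable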

Common Applications

  • Text Classification: sentiment analysis, spam detection
  • Language Modeling: predict the next word
  • Machine Translation: sequence-to-sequence
  • Time Series: stock prices, weather
  • Speech Recognition: audio to text

Bidirectional RNNs

Sometimes you need context from both directions:

"I love this ___. It's so ___."

To fill in the first blank well, the words that come after it matter too. A bidirectional RNN runs one pass left to right and another right to left, then combines the two.

from tensorflow.keras.layers import Bidirectional

model = Sequential([
    Embedding(vocab_size, 128),
    Bidirectional(SimpleRNN(64)),  # Processes forward and backward
    Dense(1, activation='sigmoid')
])
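
One design note, assuming the default Keras behavior: Bidirectional concatenates the forward and backward outputs (merge_mode='concat'), so the wrapped SimpleRNN(64) feeds 128 features into the Dense layer and roughly doubles the recurrent computation. If you'd rather keep the width at 64, you can sum the two directions instead:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense, Bidirectional

vocab_size = 10000  # example value

model_sum = Sequential([
    Embedding(vocab_size, 128),
    Bidirectional(SimpleRNN(64), merge_mode='sum'),  # add the two directions -> 64 features
    Dense(1, activation='sigmoid')
])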

Key Takeaway

RNNs process sequences by maintaining a hidden state that passes information across time steps. They're foundational for text, time series, and any other sequential data. However, vanilla RNNs struggle with long sequences due to vanishing gradients. Use LSTM or GRU for better long-term memory.

#Machine Learning  #Deep Learning  #RNN  #Sequences  #Advanced