Autoencoders: Learning Compressed Representations
Learn how autoencoders compress data into latent representations and their applications in denoising and generation.
Autoencoders learn to compress data into a smaller representation, then reconstruct it. This "bottleneck" forces them to learn the most important features.
The Architecture
```
Input ──> [Encoder] ──> Latent Code ──> [Decoder] ──> Reconstruction
(784)      (256→64)        (32)          (64→256)        (784)
```
The encoder compresses, the decoder reconstructs. The latent code captures the essence.
Simple Autoencoder
```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Encoder
input_layer = Input(shape=(784,))
encoded = Dense(256, activation='relu')(input_layer)
encoded = Dense(64, activation='relu')(encoded)
latent = Dense(32, activation='relu')(encoded)  # Bottleneck

# Decoder
decoded = Dense(64, activation='relu')(latent)
decoded = Dense(256, activation='relu')(decoded)
output = Dense(784, activation='sigmoid')(decoded)

# Full autoencoder
autoencoder = Model(input_layer, output)

# Just the encoder
encoder = Model(input_layer, latent)

autoencoder.compile(optimizer='adam', loss='mse')
autoencoder.fit(X_train, X_train, epochs=50, batch_size=256)  # Input = target
```
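Once trained, the standalone `encoder` gives you the 32-dimensional codes directly. A minimal usage sketch, assuming `X_test` holds flattened 28×28 images scaled to [0, 1]:

```python
# Compress test images to their 32-dimensional latent codes
codes = encoder.predict(X_test)                 # shape: (n_samples, 32)

# Reconstruct through the full autoencoder for comparison
reconstructions = autoencoder.predict(X_test)   # shape: (n_samples, 784)
```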
Convolutional Autoencoder
For images, use convolutions:
```python
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D

# Encoder
input_img = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# Decoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
```
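To train this model, the images need a channel dimension and values scaled to [0, 1]. A minimal data-prep sketch, using MNIST as an assumed example dataset:

```python
from tensorflow.keras.datasets import mnist
import numpy as np

# Load and normalize MNIST; add a channel dimension for the Conv2D layers
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# Train to reconstruct the images themselves
autoencoder.fit(X_train, X_train, epochs=20, batch_size=256,
                validation_data=(X_test, X_test))
```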
Denoising Autoencoder
Train on noisy input, reconstruct clean output:
```python
import numpy as np

# Add Gaussian noise to the training and test data
noise_factor = 0.3
X_train_noisy = X_train + noise_factor * np.random.normal(size=X_train.shape)
X_test_noisy = X_test + noise_factor * np.random.normal(size=X_test.shape)
X_train_noisy = np.clip(X_train_noisy, 0., 1.)
X_test_noisy = np.clip(X_test_noisy, 0., 1.)

# Train to reconstruct clean images from noisy ones
autoencoder.fit(
    X_train_noisy, X_train,   # Noisy input, clean target
    epochs=50,
    batch_size=256,
    validation_data=(X_test_noisy, X_test)
)
```
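After training, the same model cleans up unseen noisy inputs. A quick usage sketch:

```python
# Denoise held-out images: feed noisy inputs, get reconstructed clean versions
denoised = autoencoder.predict(X_test_noisy)
```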
Variational Autoencoder (VAE)
A VAE encodes each input as a probability distribution in latent space rather than a single point, which makes it possible to sample new data:
```python
from tensorflow.keras.layers import Lambda
import tensorflow.keras.backend as K

latent_dim = 32  # size of the latent space

# Sampling function (the "reparameterization trick")
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

# Encoder outputs mean and log-variance
z_mean = Dense(latent_dim)(encoded)
z_log_var = Dense(latent_dim)(encoded)

# Sample from the distribution
z = Lambda(sampling)([z_mean, z_log_var])

# VAE loss = reconstruction loss + KL divergence
reconstruction_loss = K.mean(K.square(input_layer - output))
kl_loss = -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var))
vae_loss = reconstruction_loss + kl_loss
```
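Generation then amounts to sampling a latent vector from the standard normal prior and running it through the decoder. A minimal sketch, assuming the decoder layers have been wrapped in their own `decoder` Model that maps a latent vector back to a 784-dimensional image:

```python
import numpy as np

# `decoder` is a hypothetical standalone Model(latent_input, decoder_output)
z_sample = np.random.normal(size=(1, latent_dim))  # sample from the N(0, I) prior
generated = decoder.predict(z_sample)              # decode into a new sample
generated_image = generated.reshape(28, 28)        # reshape for display
```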
Applications
| Application | How It Works |
|-------------|--------------|
| Dimensionality Reduction | Use encoder output as features |
| Denoising | Train on noisy input, clean target |
| Anomaly Detection | High reconstruction error = anomaly |
| Generation (VAE) | Sample from latent space |
| Image Compression | Encode → store → decode |
Anomaly Detection Example
```python
# Train on normal data only
autoencoder.fit(normal_data, normal_data, epochs=50)

# Calculate per-sample reconstruction error
reconstructions = autoencoder.predict(test_data)
errors = np.mean(np.square(test_data - reconstructions), axis=1)

# Anomalies have high error
threshold = np.percentile(errors, 95)
anomalies = errors > threshold
```
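Taking the 95th percentile of the test errors implicitly assumes roughly 5% of the test set is anomalous. A common alternative, sketched below, is to set the threshold from reconstruction errors on held-out normal data instead (`normal_holdout` is a hypothetical array of known-normal samples):

```python
# Hypothetical held-out normal set used only for choosing the threshold
normal_reconstructions = autoencoder.predict(normal_holdout)
normal_errors = np.mean(np.square(normal_holdout - normal_reconstructions), axis=1)

# Flag anything well above what the model achieves on normal data
threshold = np.percentile(normal_errors, 99)
anomalies = errors > threshold
```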
Key Takeaway
Autoencoders learn compressed representations by reconstructing their input through a bottleneck. They're great for dimensionality reduction, denoising, and anomaly detection. VAEs add probabilistic sampling for generation. Use them when you need to learn meaningful features in an unsupervised way.