
Transfer Learning

Use pre-trained models for your tasks.

Dr. Patricia Moore
December 18, 2025

Reuse powerful models.

What is Transfer Learning?

Transfer learning means taking a model trained on a huge dataset and reusing it for your own, usually much smaller, task.

Like learning piano after learning the keyboard: the skills transfer!

Why Transfer Learning?

Problems it solves:

  • Don't have millions of images
  • Can't afford weeks of training
  • Limited GPU resources

Pre-trained Models

Vision:

  • VGG16, ResNet50, InceptionV3
  • Trained on ImageNet (1.4M images, 1000 classes)

Language:

  • BERT, GPT, RoBERTa
  • Trained on billions of words
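
To see what these vision models already know out of the box, you can load one with its ImageNet head attached and decode its predictions. A minimal sketch (the random array below just stands in for a real 224x224 image):

import numpy as np
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions

# Full model, including the 1000-class ImageNet classifier head
model = ResNet50(weights='imagenet')

# A random array stands in for a real 224x224 RGB image here
image = preprocess_input(np.random.rand(1, 224, 224, 3) * 255.0)
print(decode_predictions(model.predict(image), top=3))  # top-3 ImageNet labels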

Using a Pre-trained Model

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

# Load the pre-trained convolutional base (without the ImageNet classifier head)
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze base layers
for layer in base_model.layers:
    layer.trainable = False

# Add custom layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)  # 10 classes

model = Model(inputs=base_model.input, outputs=predictions)
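
The fine-tuning steps below assume X_train and y_train already exist. Here is a minimal sketch of preparing them, with random arrays standing in for real images (preprocess_input applies the channel preprocessing VGG16 expects):

import numpy as np
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.utils import to_categorical

# Placeholder data: swap in your own 224x224 RGB images and integer labels
X_train = preprocess_input(np.random.rand(32, 224, 224, 3) * 255.0)
y_train = to_categorical(np.random.randint(0, 10, size=32), num_classes=10)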

Fine-Tuning Strategy

Step 1: Train only top layers

# Freeze all base layers
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(X_train, y_train, epochs=5)

Step 2: Unfreeze some layers

# Unfreeze last 10 layers
for layer in base_model.layers[-10:]:
    layer.trainable = True

# Recompile with a lower learning rate so the pre-trained weights change slowly
from tensorflow.keras.optimizers import Adam
model.compile(optimizer=Adam(learning_rate=1e-4), loss='categorical_crossentropy')
model.fit(X_train, y_train, epochs=10)

Real Example - Medical Images

# Classify chest X-rays with only ~1,000 labelled images
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze most layers
for layer in base.layers[:-20]:
    layer.trainable = False

# Custom classifier
x = GlobalAveragePooling2D()(base.output)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(2, activation='softmax')(x)  # Normal vs Pneumonia

model = Model(base.input, output)
model.compile(optimizer=Adam(1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

# Train on small dataset
model.fit(X_train, y_train, epochs=20, validation_data=(X_val, y_val))
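
After training, the same model can be checked on a held-out test split. A quick sketch (X_test and y_test are assumed to exist alongside the training and validation arrays):

# Evaluate on a held-out test split (assumed to exist)
loss, acc = model.evaluate(X_test, y_test)

# Per-scan probabilities for [Normal, Pneumonia]
probs = model.predict(X_test[:5])
print(probs.round(3))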

Transfer Learning for NLP

from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf

# Load pre-trained BERT
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenize example texts and create matching labels (1 = positive, 0 = negative)
texts = ["Great product!", "Terrible service"]
labels = tf.constant([1, 0])
encodings = tokenizer(texts, truncation=True, padding=True, return_tensors='tf')

# Fine-tune on your data
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

# Pass the full encoding dict so the model also receives the attention mask
model.fit(dict(encodings), labels, epochs=3)
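
Once fine-tuned, predictions come from the model's logits. A quick illustrative check on unseen text:

# Classify new text: argmax over the two logits (0 = negative, 1 = positive)
new_enc = tokenizer(["Fast shipping, works great"], truncation=True, padding=True, return_tensors='tf')
logits = model(dict(new_enc)).logits
print(tf.argmax(logits, axis=-1).numpy())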

Feature Extraction vs Fine-Tuning

Feature Extraction:

  • Freeze all layers
  • Use as feature extractor
  • Fast, less data needed (see the sketch after this list)

Fine-Tuning:

  • Unfreeze some layers
  • Adapt to your data
  • Better performance, needs more data
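
Here is a minimal sketch of the feature-extraction variant, reusing base_model, X_train and y_train from the VGG16 example above: run the images through the pre-trained base once, cache the pooled features, then train only a tiny classifier on them.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

# Feature extraction: run images through the pre-trained base once...
extractor = Sequential([base_model, GlobalAveragePooling2D()])
features = extractor.predict(X_train)  # shape (n_samples, 512) for VGG16

# ...then train only a small classifier on top of the cached features
clf = Sequential([Dense(10, activation='softmax', input_shape=(512,))])
clf.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
clf.fit(features, y_train, epochs=5)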

Domain Adaptation

When source and target differ:

# Example: trained on photos, deployed on sketches
# One common technique: a gradient reversal layer (DANN-style)

import tensorflow as tf
from tensorflow.keras.layers import Dense, Lambda

# Gradient reversal: identity on the forward pass, negated gradient on backprop
@tf.custom_gradient
def gradient_reversal(x):
    def grad(dy):
        return -dy
    return x, grad

# Add a domain classifier on top of the shared features
# (shared_features = output of the encoder shared by both domains)
domain_output = Lambda(gradient_reversal)(shared_features)
domain_output = Dense(1, activation='sigmoid')(domain_output)

Best Practices

  1. Start with all base layers frozen
  2. Use a small learning rate when fine-tuning
  3. Augment data heavily (see the sketch below)
  4. Monitor validation loss closely
  5. Try different pre-trained models
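
For points 3 and 4, Keras covers both: augmentation layers can sit in front of the pre-trained base, and an EarlyStopping callback watches validation loss. A minimal sketch, reusing base_model, X_train and y_train from earlier and assuming X_val/y_val validation arrays exist:

import tensorflow as tf
from tensorflow.keras import layers

# Random flips/rotations/zooms, applied only while training
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Stop training and restore the best weights once validation loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,
                                              restore_best_weights=True)

# Augmentation in front of the pre-trained base, then a small classifier head
inputs = tf.keras.Input(shape=(224, 224, 3))
x = data_augmentation(inputs)
x = base_model(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation='softmax')(x)
aug_model = tf.keras.Model(inputs, outputs)

aug_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
aug_model.fit(X_train, y_train, epochs=20,
              validation_data=(X_val, y_val), callbacks=[early_stop])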

Remember

  • Transfer learning saves time and resources
  • Always start with pre-trained models
  • Fine-tune carefully with low learning rate
  • Works amazingly well with small datasets
#AI #Advanced #TransferLearning