AI · 8 min read
Transfer Learning
Use pre-trained models for your tasks.
Dr. Patricia Moore
December 18, 2025
Reuse powerful models.
What is Transfer Learning?
Transfer learning means taking a model trained on a huge dataset and reusing it for your specific task.
Like learning piano after learning the keyboard - the skills transfer!
Why Transfer Learning?
Problems it solves:
- Don't have millions of images
- Can't afford weeks of training
- Limited GPU resources
Pre-trained Models
Vision:
- VGG16, ResNet50, InceptionV3 (see the comparison sketch below)
- Trained on ImageNet (1.4M images, 1000 classes)
Language:
- BERT, GPT, RoBERTa
- Trained on billions of words
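To get a feel for the vision backbones listed above, here is a minimal sketch (assuming tensorflow.keras is available; the ImageNet weights download on first use) that loads each one without its classification head and prints its parameter count:
from tensorflow.keras.applications import VGG16, ResNet50, InceptionV3
# Load each backbone without the ImageNet classifier head and compare sizes
for name, cls in [('VGG16', VGG16), ('ResNet50', ResNet50), ('InceptionV3', InceptionV3)]:
    backbone = cls(weights='imagenet', include_top=False)
    print(f'{name}: {backbone.count_params():,} parameters')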
Using a Pre-trained Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
# Load pre-trained model (without top layer)
base_model = VGG16(weights='imagenet', include_top=False)
# Freeze base layers
for layer in base_model.layers:
    layer.trainable = False
# Add custom layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x) # 10 classes
model = Model(inputs=base_model.input, outputs=predictions)
Fine-Tuning Strategy
Step 1: Train only the top layers
# Freeze all base layers
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit(X_train, y_train, epochs=5)
Step 2: Unfreeze some layers
# Unfreeze the last 10 layers of the base model
for layer in base_model.layers[-10:]:
    layer.trainable = True
# Use a lower learning rate so the pre-trained weights change slowly
from tensorflow.keras.optimizers import Adam
model.compile(optimizer=Adam(learning_rate=1e-4), loss='categorical_crossentropy')
model.fit(X_train, y_train, epochs=10)
Real Example - Medical Images
# Classify chest X-rays with only ~1,000 labeled images
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze all but the last 20 layers
for layer in base.layers[:-20]:
    layer.trainable = False
# Custom classifier head
x = GlobalAveragePooling2D()(base.output)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(2, activation='softmax')(x)  # Normal vs Pneumonia
model = Model(base.input, output)
model.compile(optimizer=Adam(learning_rate=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
# Train on the small dataset
model.fit(X_train, y_train, epochs=20, validation_data=(X_val, y_val))
Transfer Learning for NLP
from transformers import BertTokenizer, TFBertForSequenceClassification
import tensorflow as tf
# Load pre-trained BERT
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Tokenize the text and define matching labels (1 = positive, 0 = negative)
texts = ["Great product!", "Terrible service"]
labels = tf.constant([1, 0])
encodings = tokenizer(texts, truncation=True, padding=True, return_tensors='tf')
# Fine-tune on your data
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)
model.fit(dict(encodings), labels, epochs=3)
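After fine-tuning, a quick sanity check on new text (a sketch; it assumes label 1 means positive and 0 means negative, as above):
# Tokenize a new review and take the argmax over the model's logits
new_enc = tokenizer(["Fast shipping, works great"], truncation=True, padding=True, return_tensors='tf')
pred = tf.argmax(model(dict(new_enc)).logits, axis=-1).numpy()
print(pred)  # e.g. [1] for the positive class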
Feature Extraction vs Fine-Tuning
Feature Extraction:
- Freeze all layers
- Use the frozen network as a fixed feature extractor (see the sketch below)
- Fast, needs less data
Fine-Tuning:
- Unfreeze some layers
- Adapt to your data
- Better performance, needs more data
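Here is a minimal feature-extraction sketch (X_train and y_train are hypothetical arrays of 224×224 RGB images and integer labels): the frozen base turns each image into a fixed-length vector, and a simple classifier is trained on top.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.models import Model
from sklearn.linear_model import LogisticRegression
# Frozen VGG16 base used purely as a feature extractor
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
extractor = Model(base.input, GlobalAveragePooling2D()(base.output))
# Each image becomes a 512-dimensional feature vector; train a simple classifier on top
features = extractor.predict(preprocess_input(X_train))
clf = LogisticRegression(max_iter=1000).fit(features, y_train)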
Domain Adaptation
When the source and target domains differ:
# Example: trained on photos, applied to sketches
# One option: adversarial domain adaptation (DANN-style) with gradient reversal
import tensorflow as tf
from tensorflow.keras.layers import Lambda, Dense
# Gradient reversal: identity in the forward pass, negated gradient in the backward pass
@tf.custom_gradient
def gradient_reversal(x):
    def grad(dy):
        return -dy
    return x, grad
# Domain classifier on the shared features (shared_features is a placeholder tensor from the base model)
domain_output = Lambda(gradient_reversal)(shared_features)
domain_output = Dense(1, activation='sigmoid')(domain_output)
Best Practices
- Start with frozen layers
- Use a small learning rate when fine-tuning
- Augment data heavily (see the sketch after this list)
- Monitor validation loss closely
- Try different pre-trained models
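On the augmentation and monitoring points, a minimal sketch using Keras preprocessing layers and an EarlyStopping callback (TF 2.x; train_ds and val_ds are hypothetical tf.data datasets of image/label pairs):
import tensorflow as tf
# Light augmentation pipeline built from Keras preprocessing layers
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])
# Augment on the fly, and stop training when validation loss stops improving
train_ds = train_ds.map(lambda x, y: (augment(x, training=True), y))
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)
model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=[early_stop])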
Remember
- Transfer learning saves time and resources
- Always start with pre-trained models
- Fine-tune carefully with low learning rate
- Works amazingly well with small datasets
#AI #Advanced #TransferLearning