
Retries and Error Handling

Learn how to handle network errors, timeouts, and temporary failures with retries so your scraper stays stable and reliable.

David Miller
December 21, 2025

Real networks are unreliable.

Requests may fail because of:

- slow servers
- connection drops
- temporary blocks
- timeouts

A good scraper must not crash on first failure.

Common errors

- Timeout
- ConnectionError
- 5xx server errors
- 429 Too Many Requests

Basic try-except

```python
import requests

try:
    res = requests.get("https://example.com", timeout=10)
    res.raise_for_status()
except requests.exceptions.RequestException as e:
    print("Request failed:", e)
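The common errors listed above map to specific exception classes in requests, so you can also handle them individually instead of only catching the generic RequestException. A minimal sketch, using https://example.com as a placeholder URL:

```python
import requests

url = "https://example.com"  # placeholder URL

try:
    res = requests.get(url, timeout=10)
    res.raise_for_status()  # raises HTTPError for 4xx/5xx responses
except requests.exceptions.Timeout:
    print("Request timed out")
except requests.exceptions.ConnectionError:
    print("Connection dropped or refused")
except requests.exceptions.HTTPError as e:
    # covers 5xx server errors and 429 Too Many Requests
    print("Bad status code:", e.response.status_code)
```

Separating Timeout and ConnectionError from HTTPError is useful because network errors are usually worth retrying, while many HTTP errors (such as 404) are not.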

Retry logic (simple loop)

```python
import time
import requests

def fetch_with_retry(url, retries=3, delay=3):
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res.text
        except Exception as e:
            print(f"Attempt {attempt} failed:", e)
            time.sleep(delay)
    return None

html = fetch_with_retry("https://example.com")
```
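One refinement worth making to the loop above is to retry only when the failure looks temporary. A sketch under the assumption that 429 and 5xx responses are retryable while other HTTP errors are not; the helper name and status set are illustrative:

```python
import time
import requests

RETRYABLE_STATUS = {429, 500, 502, 503, 504}  # assumed set of temporary failures

def fetch_if_temporary(url, retries=3, delay=3):
    # illustrative helper: retries network errors and retryable status codes only
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
        except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
            print(f"Attempt {attempt}: network error:", e)
        else:
            if res.status_code in RETRYABLE_STATUS:
                print(f"Attempt {attempt}: got {res.status_code}, retrying")
            else:
                res.raise_for_status()  # permanent errors (e.g. 404) fail immediately
                return res.text
        time.sleep(delay)
    return None
```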

Smarter retries with backoff

```python
import time
import requests

def fetch(url, retries=4):
    delay = 2
    for i in range(retries):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res.text
        except requests.exceptions.RequestException as e:
            print(f"Attempt {i + 1} failed:", e)
            time.sleep(delay)
            delay *= 2  # exponential backoff: 2s, 4s, 8s, ...
    return None
```

Graph: retry flow

```mermaid
flowchart TD
    A[Request] --> B{Success?}
    B -->|Yes| C[Return]
    B -->|No| D[Wait]
    D --> A
```
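requests can also run this retry-with-backoff flow for you by mounting urllib3's Retry class on a Session. A minimal sketch, with the retry count and status codes chosen here as assumptions rather than fixed recommendations:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# retry up to 3 times, with exponential backoff, on typical temporary failures
retry = Retry(
    total=3,
    backoff_factor=1,  # grows the wait between attempts exponentially
    status_forcelist=[429, 500, 502, 503, 504],
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount("https://", adapter)
session.mount("http://", adapter)

res = session.get("https://example.com", timeout=10)
```

This keeps the retry policy out of your scraping code; note that the timeout still has to be passed on each request.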

Remember

- Always set timeouts
- Retry only temporary errors
- Log failures for debugging (see the sketch below)
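For the last point, the print calls above can be swapped for the standard logging module so failures end up somewhere you can inspect later. A small sketch, with the log format and file name purely as assumptions:

```python
import logging
import time
import requests

# record failed attempts in a file for later debugging (file name is an example)
logging.basicConfig(
    filename="scraper.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def fetch_logged(url, retries=3, delay=3):
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res.text
        except requests.exceptions.RequestException as e:
            logging.warning("Attempt %d for %s failed: %s", attempt, url, e)
            time.sleep(delay)
    logging.error("Giving up on %s after %d attempts", url, retries)
    return None
```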

#Python #Advanced #Reliability