Retries and Error Handling
Learn how to handle network errors, timeouts, and temporary failures using retries so your scraper becomes stable and reliable.
Real networks are unreliable.
Requests may fail because of:

- slow servers
- connection drops
- temporary blocks
- timeouts
A good scraper must not crash on the first failure.
Common errors

- Timeout
- ConnectionError
- 5xx server errors
- 429 Too Many Requests
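Each of these maps onto a specific requests exception or status code. As a rough sketch (the URL and the printed messages are placeholders, not part of any particular scraper), you can catch them separately when different failures need different handling:

```python
import requests

url = "https://example.com"  # placeholder URL

try:
    res = requests.get(url, timeout=10)
    res.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out")          # slow server
except requests.exceptions.ConnectionError:
    print("Connection dropped")         # network-level failure
except requests.exceptions.HTTPError as e:
    status = e.response.status_code
    if status == 429:
        print("Rate limited: 429 Too Many Requests")
    elif 500 <= status < 600:
        print(f"Server error: {status}")
```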
Basic try-except

```python
import requests

try:
    res = requests.get("https://example.com", timeout=10)
    res.raise_for_status()
except requests.exceptions.RequestException as e:
    print("Request failed:", e)
```
Retry logic (simple loop)

```python
import time
import requests

def fetch_with_retry(url, retries=3, delay=3):
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res.text
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt} failed:", e)
            time.sleep(delay)  # wait before the next attempt
    return None

html = fetch_with_retry("https://example.com")
```
Smarter retries with backoff

```python
import time
import requests

def fetch(url, retries=4):
    delay = 2  # seconds; doubled after every failed attempt
    for attempt in range(retries):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()  # treat 4xx/5xx as failures so they are retried
            return res.text
        except requests.exceptions.RequestException:
            time.sleep(delay)
            delay *= 2  # exponential backoff: 2s, 4s, 8s, ...
    return None  # all attempts failed
```
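The loop above works, but requests can also delegate retries with exponential backoff to its bundled urllib3 through a mounted HTTPAdapter. A minimal sketch, assuming a recent requests/urllib3 (the URL is a placeholder):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

# Retry up to 4 times; backoff_factor makes the delay between attempts grow
# exponentially, and status_forcelist retries the error codes listed earlier
# (429 and common 5xx responses).
retry = Retry(
    total=4,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)
adapter = HTTPAdapter(max_retries=retry)
session.mount("https://", adapter)
session.mount("http://", adapter)

res = session.get("https://example.com", timeout=10)
print(res.status_code)
```

This keeps the retry policy in one place instead of repeating a loop around every request.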