Web Scraping · 28 min read
Retries and Error Handling
Learn how to handle network errors, timeouts, and temporary failures using retries so your scraper becomes stable and reliable.
David Miller
December 29, 2025
Real networks are unreliable.
Requests may fail because of:
- slow servers
- connection drops
- temporary blocks
- timeouts
A good scraper must not crash on the first failure.
Common errors
- Timeout
- ConnectionError
- 5xx server errors
- 429 Too Many Requests
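Each of these shows up differently in the requests library: timeouts and connection drops raise exceptions, while 429 and 5xx arrive as normal responses carrying an error status code. A minimal sketch of how to tell them apart (the URL is just a placeholder):
import requests

try:
    res = requests.get("https://example.com", timeout=10)
except requests.exceptions.Timeout:
    print("Timed out")              # server too slow to respond
except requests.exceptions.ConnectionError:
    print("Connection dropped")     # DNS failure, refused or reset connection
else:
    if res.status_code == 429:
        print("Rate limited")       # 429 Too Many Requests
    elif res.status_code >= 500:
        print("Server error")       # 5xx errors are usually temporary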
Basic try-except
import requests

try:
    res = requests.get("https://example.com", timeout=10)
    res.raise_for_status()   # raise an HTTPError for 4xx/5xx responses
except requests.exceptions.RequestException as e:
    print("Request failed:", e)
Retry logic (simple loop)
import time
import requests

def fetch_with_retry(url, retries=3, delay=3):
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()   # treat 4xx/5xx responses as failures too
            return res.text
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt} failed:", e)
            if attempt < retries:
                time.sleep(delay)    # wait before the next attempt
    return None                      # all attempts failed

html = fetch_with_retry("https://example.com")
Smarter retries with backoff
import time
import requests

def fetch(url, retries=4):
    delay = 2
    for attempt in range(retries):
        try:
            return requests.get(url, timeout=10).text
        except requests.exceptions.RequestException:
            time.sleep(delay)   # wait, then double the delay: 2s, 4s, 8s, ...
            delay *= 2
    return None                 # every attempt failed
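Usage is the same as before: fetch returns the page HTML, or None once every attempt has failed.
html = fetch("https://example.com")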
Graph: retry flow
flowchart TD
A[Request] --> B{Success?}
B -->|Yes| C[Return]
B -->|No| D[Wait]
D --> A
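The same retry-with-backoff flow can also be delegated to the requests/urllib3 stack instead of a hand-written loop, by mounting an HTTPAdapter with a Retry policy on a Session. A sketch of that setup; the retry count, backoff factor, and status list below are example values, not recommendations:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 4 times on connection problems and on the listed status codes,
# with exponentially growing waits between attempts.
retry = Retry(
    total=4,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retry))
session.mount("http://", HTTPAdapter(max_retries=retry))

res = session.get("https://example.com", timeout=10)
If every retry fails, the session raises a requests exception just like a single failed request would, so the try-except patterns above still apply.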
Remember
- Always set timeouts
- Retry only temporary errors (see the sketch after this list)
- Log failures for debugging
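What counts as a temporary error is worth spelling out in code. A small sketch (is_retryable is a hypothetical helper, not part of requests) that treats timeouts, connection drops, 429, and 5xx responses as retryable and everything else as permanent:
import requests

def is_retryable(exc):
    # Hypothetical helper: decide whether an error is worth retrying.
    if isinstance(exc, (requests.exceptions.Timeout,
                        requests.exceptions.ConnectionError)):
        return True                            # network hiccups are usually temporary
    if isinstance(exc, requests.exceptions.HTTPError):
        code = exc.response.status_code
        return code == 429 or code >= 500      # rate limits and server errors
    return False                               # e.g. a 404 or 401 will not fix itself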
#Python #Advanced #Reliability