Handling Errors and Retries
Make your scraper stable by handling network errors, timeouts, missing elements, and retrying failed requests.
Real web scraping constantly runs into errors:

- network issues
- server blocks
- missing elements
- broken HTML

A good scraper must not crash when these happen.

## Handle request errors

Wrap every request in a `try`/`except` and check the status code, so one failed URL doesn't take the whole run down:

```python
import requests

try:
    res = requests.get("https://example.com", timeout=10)
    res.raise_for_status()  # raises HTTPError for 4xx/5xx responses
except requests.exceptions.RequestException as e:
    print("Request failed:", e)
```

---

## Safe element access

Pages change, so an element you expect may be missing. Check the result of `find()` before using it:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(res.text, "html.parser")

title = soup.find("h1")
if title:
    print(title.text)
else:
    print("Title not found")
```

---

## Retry logic

Transient failures (timeouts, flaky connections) often succeed on a second attempt. Retry a few times with a short pause, then give up:

```python
import time
import requests

def fetch(url, retries=3):
    for i in range(retries):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res
        except requests.exceptions.RequestException as e:
            print("Retry", i + 1, "failed:", e)
            time.sleep(2)  # pause before the next attempt
    return None

res = fetch("https://example.com")
```

---

## Graph: retry flow

```mermaid
flowchart TD
    A[Request] --> B{Success?}
    B -->|Yes| C[Return data]
    B -->|No| D[Wait]
    D --> E[Retry]
    E --> B
```

## Remember

- Never assume a request will succeed
- Always check that elements exist before using them
- Retries make your scraper robust against transient failures (two sketches of how to wire this up follow below)
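If you'd rather not write the retry loop yourself, `requests` can also retry transparently through urllib3's `Retry` helper mounted on a `Session`. This is a minimal sketch; the retry count, backoff factor, and status codes are example values, not recommendations from this guide:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

# Retry up to 3 times on connection errors and on the listed status codes,
# waiting longer between attempts (exponential backoff).
retry = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
)
session.mount("https://", HTTPAdapter(max_retries=retry))
session.mount("http://", HTTPAdapter(max_retries=retry))

try:
    res = session.get("https://example.com", timeout=10)
    res.raise_for_status()
except requests.exceptions.RequestException as e:
    print("Request failed after retries:", e)
```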
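Putting the pieces together, a full fetch-and-parse helper might look like the sketch below. The `scrape_title` function and its choice of returning `None` on failure are illustrative, not something prescribed above:

```python
import time
import requests
from bs4 import BeautifulSoup

def fetch(url, retries=3, delay=2):
    """Return a Response, or None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            res = requests.get(url, timeout=10)
            res.raise_for_status()
            return res
        except requests.exceptions.RequestException as e:
            print(f"Attempt {attempt} failed: {e}")
            time.sleep(delay)
    return None

def scrape_title(url):
    """Fetch a page and return its first <h1> text, or None."""
    res = fetch(url)
    if res is None:
        return None
    soup = BeautifulSoup(res.text, "html.parser")
    title = soup.find("h1")
    return title.get_text(strip=True) if title else None

print(scrape_title("https://example.com"))
```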