Avoiding Blocks and Detection
Understand how websites detect scrapers and learn ethical ways to reduce blocking using headers, delays, and session behavior.
David Miller
December 19, 2025
Most modern sites try to detect bots.
If you are detected, expect:
- 403 errors
- CAPTCHA challenges
- IP bans
- blocked pages
Goal:
Behave like a normal user.
Use real headers
import requests

# Headers copied from a real browser make the request look less automated
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.9",
}
response = requests.get("https://example.com", headers=headers)
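If you want to confirm exactly which headers go out, a request-echo service can show them back to you. This is an optional sanity check, continuing from the snippet above; httpbin.org is an external service used purely for illustration.

# httpbin.org/headers returns the headers it received as JSON
echo = requests.get("https://httpbin.org/headers", headers=headers)
print(echo.json()["headers"])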
Keep a session
import requests

# A Session keeps cookies and reuses connections across requests
session = requests.Session()
session.headers.update(headers)  # reuse the browser-like headers from above
res = session.get("https://example.com")
Why:
- keeps cookies between requests, like a real browser
- reuses connections and looks more human
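As a quick illustration, continuing with the `session` object from the snippet above: any cookies the server sets are stored on the session and resent automatically.

# Cookies from the first response are kept on the session object
session.get("https://example.com")
print(session.cookies.get_dict())  # sent automatically on later requests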
Add random delays
import time, random
# Pause a random 2-5 seconds so the gap between requests is never constant
time.sleep(random.uniform(2, 5))
Avoid predictable patterns
Bad:
- same delay every time
- same URL order
- too many requests per second
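Here is a minimal sketch that avoids all three patterns: one session, a shuffled URL order, and a different delay each time. The page URLs are placeholders, not real endpoints.

import random
import time

import requests

# One session so cookies and headers persist across the whole crawl
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.9",
})

# Placeholder page list; shuffling avoids hitting URLs in the same order every run
urls = [
    "https://example.com/page/1",
    "https://example.com/page/2",
    "https://example.com/page/3",
]
random.shuffle(urls)

for url in urls:
    res = session.get(url)
    print(url, res.status_code)
    # A different 2-5 second pause each time avoids a fixed request rate
    time.sleep(random.uniform(2, 5))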
Graph: detection logic
flowchart TD
A[Bot Requests] --> B{Looks human?}
B -->|Yes| C[Allow]
B -->|No| D[Block]
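From the scraper's side, the "Block" branch usually shows up as a 403 or 429 response. A hedged sketch of a polite reaction, honoring Retry-After when the server sends it (the 60-second fallback and the single retry are assumptions, not a rule):

import time
import requests

def polite_get(session: requests.Session, url: str) -> requests.Response:
    """Fetch a URL once, and back off politely if the site blocks the request."""
    res = session.get(url)
    if res.status_code in (403, 429):
        # Retry-After may give a number of seconds; otherwise wait a conservative 60s
        retry_after = res.headers.get("Retry-After", "")
        wait = int(retry_after) if retry_after.isdigit() else 60
        print(f"Blocked with {res.status_code}, waiting {wait}s before one retry")
        time.sleep(wait)
        res = session.get(url)
    return res

If the second attempt is still blocked, stop rather than hammering the site.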
Remember
- Always respect robots.txt and the site's terms of service (a quick check is sketched below)
- Slow and steady wins
- Never try to bypass serious protections
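A minimal way to honor the first point is the standard library's robots.txt parser. The user-agent string and URLs here are placeholders.

from urllib import robotparser

# Download and parse the site's robots.txt
rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Check whether our (placeholder) user agent may fetch a given path
if rp.can_fetch("MyScraper/1.0", "https://example.com/some/page"):
    print("Allowed by robots.txt")
else:
    print("Disallowed - skip this URL")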
#Python #Advanced #AntiBlock