Scraping APIs Directly
Learn how modern websites load their data from APIs, and how to scrape those APIs directly instead of parsing HTML.
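Modern sites often load their data from hidden APIs. Instead of scraping the rendered HTML, you can call the API directly. This is usually:

- faster
- cleaner
- more stable

---

## How to find APIs

Open DevTools → Network → XHR/Fetch and reload the page.

Look for:

- JSON responses
- endpoints like `/api/items`

---

## Example API request

```python
import requests

url = "https://example.com/api/products"

# A timeout keeps the script from hanging on a slow endpoint.
res = requests.get(url, timeout=10)
res.raise_for_status()  # fail early on 4xx/5xx responses

data = res.json()

for p in data["items"]:
    print(p["name"], p["price"])
```

---

## With headers

Some APIs only respond properly when the request carries browser-like headers.

```python
import requests

url = "https://example.com/api/products"

headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "application/json",
}

res = requests.get(url, headers=headers, timeout=10)
```

---

## Graph: API vs HTML

```mermaid
flowchart LR
    A[Browser] --> B[API Call]
    B --> C[JSON Data]
    C --> D[Render HTML]
    E[Scraper] --> B
```

The browser calls the API, receives JSON, and renders HTML from it. The scraper calls the same API and skips the rendering step.

## Remember

- Always check the Network tab first
- Prefer the API over HTML scraping when one is available
- JSON is easier to parse than HTML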
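
The example request assumes the endpoint always returns JSON shaped like `{"items": [...]}`. In practice an endpoint may answer with an HTML error page or omit fields, so a more defensive parser can help. The following is a minimal sketch, not a definitive implementation: the helper names `is_json_response`, `extract_products`, and `parse_api_body` are hypothetical, and the `items` / `name` / `price` layout is taken from the example above rather than from any real API.

```python
import json
from typing import Any, Dict, List, Tuple


def is_json_response(content_type: str) -> bool:
    """Return True if a Content-Type header value indicates a JSON body."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


def extract_products(payload: Dict[str, Any]) -> List[Tuple[str, Any]]:
    """Collect (name, price) pairs from a payload shaped like {"items": [...]}.

    The payload shape is an assumption carried over from the example request.
    """
    products = []
    for item in payload.get("items", []):
        # Skip entries missing either field instead of raising KeyError.
        if "name" in item and "price" in item:
            products.append((item["name"], item["price"]))
    return products


def parse_api_body(content_type: str, body: str) -> List[Tuple[str, Any]]:
    """Check the Content-Type first, then parse the body and extract products."""
    if not is_json_response(content_type):
        raise ValueError(f"expected a JSON response, got {content_type!r}")
    return extract_products(json.loads(body))
```

With `requests`, this could be called as `parse_api_body(res.headers.get("Content-Type", ""), res.text)`. Keeping the parsing separate from the network call also makes it easy to test against saved response bodies.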