Scraping Multiple Pages
Learn how to scrape data spread across many pages using pagination patterns and loops.
Most websites split their data across many pages. For example:

- jobs: page 1, 2, 3...
- products: page 1, 2, 3...

To collect everything, you must loop through the pages.

## Common pagination pattern

URLs often look like:

- site.com?page=1
- site.com?page=2

## Simple loop example

```python
import requests
from bs4 import BeautifulSoup

# Visit pages 1 through 5
for page in range(1, 6):
    url = f"https://example.com?page={page}"
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")

    print("Page:", page)
    # Print the text of every <h2> heading on the page
    for h in soup.find_all("h2"):
        print(h.text.strip())
```

---

## Stop when no data found

```python
import requests
from bs4 import BeautifulSoup

page = 1
while True:
    url = f"https://example.com?page={page}"
    res = requests.get(url)
    soup = BeautifulSoup(res.text, "html.parser")

    items = soup.find_all("h2")
    # An empty page means we have gone past the last page
    if not items:
        break

    for h in items:
        print(h.text.strip())

    page += 1
```

---

## Graph: pagination loop

```mermaid
flowchart TD
    A[Start page=1] --> B[Request page]
    B --> C{Items found?}
    C -->|Yes| D[Extract data]
    D --> E[page += 1]
    E --> B
    C -->|No| F[Stop]
```

## Remember

- Many sites use page numbers
- Loop until a page returns no results
- Add delays between requests to be polite
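
The last reminder, adding delays, is not shown in the loops above, so here is a minimal sketch of one way to do it. The names `fetch_h2_titles` and `scrape_all`, the one-second default delay, and the `max_pages` safety cap are all assumptions made for illustration. `example.com` and the `?page=` parameter are the same placeholders used earlier.

```python
import time
from typing import Callable, Iterator, List


def fetch_h2_titles(page: int) -> List[str]:
    """Fetch one page and return the text of its <h2> headings."""
    # Imported here so scrape_all() below can be reused without these packages.
    import requests
    from bs4 import BeautifulSoup

    # Placeholder URL and query parameter, as in the examples above.
    res = requests.get("https://example.com", params={"page": page}, timeout=10)
    res.raise_for_status()  # fail on HTTP errors instead of parsing an error page
    soup = BeautifulSoup(res.text, "html.parser")
    return [h.get_text(strip=True) for h in soup.find_all("h2")]


def scrape_all(
    fetch_items: Callable[[int], List[str]],
    delay: float = 1.0,
    max_pages: int = 50,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Yield items page by page until a page is empty or max_pages is reached."""
    for page in range(1, max_pages + 1):
        if page > 1:
            sleep(delay)  # pause before every request after the first
        items = fetch_items(page)
        if not items:
            break  # an empty page means we ran past the last one
        yield from items
```

To run it against a real site, something like `for title in scrape_all(fetch_h2_titles): print(title)` should work once the URL and the `h2` selector match that site. The `max_pages` cap is a guard for sites that keep returning content for any page number, which would otherwise make the loop run forever.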