Your First Complete Scraper
Build your first real scraper step by step: request a page, parse HTML, extract multiple items, and print clean results.
Now we combine everything into one real scraper.

Goal: scrape all links from a page.

## Step 1: Request page

```python
import sys

import requests
from bs4 import BeautifulSoup

url = "https://example.com"
res = requests.get(url, timeout=10)  # timeout so a slow server cannot hang the script

# Stop early if the server did not answer with 200 OK
if res.status_code != 200:
    print("Failed")
    sys.exit(1)
```

## Step 2: Parse HTML

```python
soup = BeautifulSoup(res.text, "html.parser")
```

## Step 3: Find links

```python
# find_all returns a list of every <a> tag on the page
links = soup.find_all("a")
```

## Step 4: Extract and clean

```python
for a in links:
    text = a.text.strip()  # visible link text without surrounding whitespace
    href = a.get("href")   # None if the tag has no href attribute
    print(text, "->", href)
```

## Full script

```python
import requests
from bs4 import BeautifulSoup

url = "https://example.com"
res = requests.get(url, timeout=10)

if res.status_code == 200:
    soup = BeautifulSoup(res.text, "html.parser")
    for a in soup.find_all("a"):
        print(a.text.strip(), "->", a.get("href"))
else:
    print("Request failed")
```

## Graph: full scraper

```mermaid
flowchart TD
    A[Start] --> B[Request URL]
    B --> C{Status 200?}
    C -->|Yes| D[Parse HTML]
    D --> E[Find tags]
    E --> F[Extract data]
    F --> G[Print/Save]
    C -->|No| H[Stop]
```

## What you learned

- Sending a request
- Parsing HTML
- Finding elements
- Extracting data

## Remember

This pattern repeats in nearly every scraper:

Request → Parse → Find → Extract → Save
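
The script above prints its results, so the Save step of the pattern is not shown. Here is a minimal sketch of one way to do it, using only the standard library. It assumes you have collected `(text, href)` pairs from the loop. The helper names `clean_links` and `write_links` and the sample data are hypothetical, not part of the lesson's script. It also turns relative links such as `/about` into full URLs with `urljoin` and skips tags that have no `href`.

```python
import csv
import io
from urllib.parse import urljoin


def clean_links(pairs, base_url):
    """Turn raw (text, href) pairs into (text, absolute_url) rows."""
    rows = []
    for text, href in pairs:
        if not href:  # a.get("href") gives None when the attribute is missing
            continue
        rows.append((text.strip(), urljoin(base_url, href)))
    return rows


def write_links(rows, fileobj):
    """Write the rows as CSV, with a header line first."""
    writer = csv.writer(fileobj)
    writer.writerow(["text", "url"])
    writer.writerows(rows)


# Sample data standing in for what the scraper loop would collect (hypothetical).
raw = [("  More information... ", "/domains/example"), ("No link", None)]
rows = clean_links(raw, "https://example.com")

# An in-memory buffer keeps the example side-effect free.
buffer = io.StringIO()
write_links(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a real file, for example `open("links.csv", "w", newline="", encoding="utf-8")`, in place of the buffer. In the scraper itself, `raw` could be built as `[(a.text, a.get("href")) for a in soup.find_all("a")]`.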