How Websites Work

Understand how websites send HTML to your browser so you know exactly what your scraper is downloading and reading.

David Miller
December 21, 2025

Before scraping, you must understand how a website works.

A website is not magic. It is just:

- a server
- sending text (HTML)
- to your browser

Your scraper does the same.

## What happens when you open a site

1) You enter a URL
2) Browser sends request
3) Server responds with HTML
4) Browser renders page

Your scraper will stop at step 3 and read HTML.

## What is HTML

HTML is a text document with tags:

```html
<h1>News</h1>
<p>This is a paragraph</p>
<a href="/jobs">Jobs</a>
```

Tags describe structure, not data meaning.

## Key parts for scraping

- Tags: h1, p, div, span, a
- Attributes: class, id, href
- Text inside tags

## Static vs Dynamic websites

### Static

HTML already contains data. Easy to scrape.

### Dynamic

HTML loads empty, data comes later via JavaScript. Harder to scrape, needs browser automation.

## Graph: static vs dynamic

```mermaid
flowchart TD
    A[Request Page] --> B{Type?}
    B -->|Static| C[HTML has data]
    B -->|Dynamic| D[JS loads data later]
```

## Developer tools (your best friend)

In browser:

- Right click → Inspect
- See HTML structure
- Find tags and classes

This is how you decide what to scrape.

## Remember

- Scraper reads HTML, not visuals
- Learn to inspect elements
- Static sites are easier
- Dynamic sites need extra tools
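
To make "scraper reads HTML, not visuals" concrete, here is a minimal sketch that reads the three-line sample from the "What is HTML" section using only Python's standard-library `html.parser`. The names `SAMPLE_HTML`, `TagCollector`, `extract_elements`, and `looks_static` are hypothetical, chosen for illustration. In a real scraper the HTML string would come from the server's response (step 3) rather than a constant. The `looks_static` check is a rough heuristic, not a reliable detector: it only asks whether the text you see in the browser is already present in the raw HTML.

```python
from html.parser import HTMLParser

# The sample document from the "What is HTML" section.
SAMPLE_HTML = """
<h1>News</h1>
<p>This is a paragraph</p>
<a href="/jobs">Jobs</a>
"""


class TagCollector(HTMLParser):
    """Hypothetical helper: records (tag, attributes, text) for each element."""

    def __init__(self):
        super().__init__()
        self.elements = []  # finished (tag, attrs, text) tuples
        self._open = []     # stack of elements still being read

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs, e.g. [("href", "/jobs")]
        self._open.append([tag, dict(attrs), ""])

    def handle_data(self, data):
        # Text between tags; attach it to the innermost open element.
        if self._open:
            self._open[-1][2] += data

    def handle_endtag(self, tag):
        # Only close the element if it matches the innermost open tag.
        if self._open and self._open[-1][0] == tag:
            name, attrs, text = self._open.pop()
            self.elements.append((name, attrs, text.strip()))


def extract_elements(html):
    """Return a list of (tag, attributes, text) tuples found in the HTML."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def looks_static(html, expected_text):
    """Rough heuristic: is the text we want already in the raw HTML?"""
    return expected_text in html


if __name__ == "__main__":
    for tag, attrs, text in extract_elements(SAMPLE_HTML):
        print(tag, attrs, text)
    # prints:
    # h1 {} News
    # p {} This is a paragraph
    # a {'href': '/jobs'} Jobs
```

This sketch assumes simple, well-nested markup like the sample. Real pages contain tags without closing partners (such as `img` or `br`) and deeply nested elements, so a dedicated parsing library is usually a better fit there. If `looks_static` returns `False` for text you can clearly see in the browser, the page is probably filling in its data with JavaScript after load.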
