HTML Parsing with BeautifulSoup

Learn how to parse HTML and extract data using BeautifulSoup with clear searching patterns and examples.

David Miller
November 19, 2025
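After downloading a page, all you have is a string of HTML. To pull data out of it, you first parse that string into a tree of tags you can search.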

BeautifulSoup helps you:

  • read HTML
  • navigate tags
  • extract text and attributes

Install

pip install beautifulsoup4

Basic usage

from bs4 import BeautifulSoup

html = "<h1>Title</h1><p>Text</p>"
# "html.parser" is Python's built-in parser, so no extra install is needed
soup = BeautifulSoup(html, "html.parser")

print(soup.h1.text)  # Title
print(soup.p.text)   # Text
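
One detail the basic example does not show is what happens when a tag is missing or appears more than once. The sketch below uses made-up sample HTML and illustrative variable names to show that `soup.tagname` returns only the first match, and returns `None` when nothing matches.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this illustration
html = "<h1>Title</h1><p>First</p><p>Second</p>"
soup = BeautifulSoup(html, "html.parser")

# Attribute-style access returns only the first matching tag
first_p = soup.p.get_text()

# A tag that is not in the document comes back as None
missing = soup.h2

# Guard before reading text so a missing tag does not raise AttributeError
subtitle = missing.get_text() if missing is not None else ""

print(first_p)
print(missing)
print(repr(subtitle))
```

Checking for `None` before touching `.text` is a small habit that avoids most crashes when a page's layout changes.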

Parse a real page

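import requests
from bs4 import BeautifulSoup

# A timeout keeps the script from hanging on a slow server
res = requests.get("https://example.com", timeout=10)
res.raise_for_status()  # stop early on 4xx/5xx responses
soup = BeautifulSoup(res.text, "html.parser")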

Find elements

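soup.find("h1")                  # first <h1>, or None if there is none
soup.find_all("p")               # list of every <p> (empty if none)
soup.find("div", class_="news")  # first <div> whose class includes "news"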
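
To make the difference between these calls concrete, here is a minimal sketch on invented sample HTML. The class names `ads` and `sidebar` are assumptions for the demo. It also shows two related patterns: searching inside a tag you already found, and `select()`, which accepts a CSS selector.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this illustration
html = """
<div class="news"><h1>Headline</h1><p>Story one</p></div>
<div class="ads"><p>Buy now</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first match, or None when nothing matches
news = soup.find("div", class_="news")
sidebar = soup.find("div", class_="sidebar")

# find_all() returns a list of every match in the document
all_paragraphs = soup.find_all("p")

# Calling find_all() on a tag searches only inside that tag
news_paragraphs = news.find_all("p")

# select() takes a CSS selector; here it finds the same paragraph
selected = soup.select("div.news p")

print(len(all_paragraphs), len(news_paragraphs), sidebar)
print([p.get_text() for p in selected])
```

Narrowing the search to a container first (as with `news.find_all("p")`) is usually more robust than matching every `<p>` on the page.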

Extract attributes

link = soup.find("a")  # None if the page has no <a> tag
if link is not None:
    print(link["href"])  # raises KeyError if the tag has no href
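
Square-bracket access fails when the attribute is absent, so it helps to know the safer alternative. The following sketch, using invented HTML and variable names, compares `tag["attr"]` with `tag.get("attr")` and shows that `class` comes back as a list.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this illustration
html = '<a href="/about" class="nav main">About</a><a>No link</a>'
soup = BeautifulSoup(html, "html.parser")
first, second = soup.find_all("a")

# Bracket access works when the attribute exists
href = first["href"]

# .get() returns None (or a default you choose) when it does not
missing_href = second.get("href")
fallback_href = second.get("href", "#")

# class is a multi-valued attribute, so BeautifulSoup returns a list
classes = first["class"]

print(href, missing_href, fallback_href, classes)
```

A reasonable rule of thumb: use `["attr"]` when a missing attribute should be treated as an error, and `.get("attr")` when it is expected to be absent on some tags.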

Loop through items

for p in soup.find_all("p"):
    # get_text(strip=True) is like .text but trims surrounding whitespace
    print(p.get_text(strip=True))
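
In practice you usually collect results rather than print them. Here is a small sketch, with invented HTML and an illustrative `links` list, that loops over every `<a>` tag that has an `href` and stores the text and URL together.

```python
from bs4 import BeautifulSoup

# Sample HTML invented for this illustration
html = """
<ul>
  <li><a href="/a">  First  </a></li>
  <li><a href="/b">Second</a></li>
  <li><a>No href</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

links = []
# href=True keeps only the tags that actually have an href attribute
for a in soup.find_all("a", href=True):
    links.append({"text": a.get_text(strip=True), "href": a["href"]})

print(links)
```

Filtering with `href=True` in the search itself means the loop body can use `a["href"]` without risking a `KeyError`.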

Graph: parsing flow

flowchart LR
  A[HTML Text] --> B[BeautifulSoup]
  B --> C[Search Tags]
  C --> D[Extract Data]
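
The flow above can be sketched as a single function. This is only one way to arrange it: the function name `extract_headlines`, the `news` class, and the sample HTML are assumptions made for the example, not part of any real site.

```python
from bs4 import BeautifulSoup


def extract_headlines(html_text):
    """Hypothetical helper: HTML text -> soup -> search tags -> extract data."""
    # HTML Text -> BeautifulSoup
    soup = BeautifulSoup(html_text, "html.parser")

    # Search Tags
    items = soup.find_all("div", class_="news")

    # Extract Data, tolerating items that lack a title or a link
    results = []
    for item in items:
        title = item.find("h2")
        link = item.find("a")
        results.append({
            "title": title.get_text(strip=True) if title is not None else None,
            "url": link.get("href") if link is not None else None,
        })
    return results


# Sample HTML invented for this illustration
sample = """
<div class="news"><h2>Alpha</h2><a href="/alpha">Read</a></div>
<div class="news"><h2>Beta</h2></div>
"""
headlines = extract_headlines(sample)
print(headlines)
```

Because the function takes HTML text rather than a URL, you can try it on a saved string like `sample` first and pass in downloaded HTML later.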

Remember

  • BeautifulSoup builds a tree from HTML
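  • find returns the first match (or None); find_all returns a list of all matches
  • .text gets the text inside a tag
  • ["attr"] gets an attribute; .get("attr") returns None if it is missing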
#Python #Beginner #BeautifulSoup