Web Scraping24 min read
HTML Parsing with BeautifulSoup
Learn how to parse HTML and extract data using BeautifulSoup with clear searching patterns and examples.
David Miller
November 19, 2025
After downloading a page's HTML, you need to parse it before you can extract data from it.
BeautifulSoup helps you:
- read HTML
- navigate tags
- extract text and attributes
Install
pip install beautifulsoup4
pip install requests  # only needed for the "Parse real page" example
Basic usage
from bs4 import BeautifulSoup
html = "<h1>Title</h1><p>Text</p>"
soup = BeautifulSoup(html, "html.parser")  # "html.parser" is Python's built-in parser
print(soup.h1.text)  # Title
print(soup.p.text)   # Text
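To make the "navigate tags" idea more concrete, here is a minimal sketch on a slightly larger snippet. The markup is made up for illustration. It shows that `soup.p` is shorthand for the first `<p>` tag, and that each tag knows its parent and its children.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html = "<div><h1>Title</h1><p>Text with <b>bold</b> words</p></div>"
soup = BeautifulSoup(html, "html.parser")

p = soup.p                      # shorthand for soup.find("p")
print(p.name)                   # p
print(p.parent.name)            # div
print(p.b.text)                 # bold
print(p.text)                   # Text with bold words

# Direct children of the <div>, in document order.
child_names = [child.name for child in soup.div.children]
print(child_names)              # ['h1', 'p']
```

Note that `.text` on the `<p>` tag includes the text of nested tags such as `<b>`.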
Parse real page
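import requests
from bs4 import BeautifulSoup

# Download the page; the timeout stops the request from hanging indefinitely.
res = requests.get("https://example.com", timeout=10)
res.raise_for_status()  # raise an error on 4xx/5xx responses
soup = BeautifulSoup(res.text, "html.parser")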
Find elements
soup.find("h1")                  # first <h1> tag, or None if there is none
soup.find_all("p")               # list of every <p> tag
soup.find("div", class_="news")  # first <div> with class "news"
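As a small sketch of how these calls behave, the snippet below runs them on an inline HTML string. The markup and the `news` and `sports` class names are assumptions made up for this example. The main point is that `find` returns `None` when nothing matches, so it is worth checking the result before using it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html = """
<h1>Headlines</h1>
<div class="news"><p>First story</p></div>
<div class="sports"><p>Second story</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

heading = soup.find("h1")               # first <h1>
paragraphs = soup.find_all("p")         # every <p>, as a list
news = soup.find("div", class_="news")  # first <div class="news">
missing = soup.find("table")            # no <table> in this markup

print(heading.text)       # Headlines
print(len(paragraphs))    # 2
print(news.p.text)        # First story
print(missing is None)    # True
```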
Extract attributes
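link = soup.find("a")  # first <a> tag, or None if the page has no links
print(link["href"])    # raises KeyError if the tag has no href attribute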
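Real pages often contain tags that lack the attribute you expect. One way to handle that, sketched here on made-up markup, is to use `.get()`, which returns `None` (or a default you supply) instead of raising an error.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second link deliberately has no href.
html = '<a href="/about">About</a><a>No link</a>'
soup = BeautifulSoup(html, "html.parser")

first, second = soup.find_all("a")

print(first["href"])             # /about
print(second.get("href"))        # None, instead of raising KeyError
print(second.get("href", "#"))   # falls back to the default: #
print(second.has_attr("href"))   # False
```

Square brackets are fine when you know the attribute exists; `.get()` is the safer choice when you do not.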
Loop through items
for p in soup.find_all("p"):
    print(p.text)
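In practice you will often want to collect values rather than print them. The following is a minimal sketch under that assumption: the markup and the `title` and `url` keys are invented for illustration. It uses `get_text(strip=True)` to trim surrounding whitespace from each item.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html = """
<ul>
  <li><a href="/a">  Item A </a></li>
  <li><a href="/b">Item B</a></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

items = []
for a in soup.find_all("a"):
    items.append({
        "title": a.get_text(strip=True),  # text with whitespace trimmed
        "url": a.get("href"),             # None if the link has no href
    })

print(items)
# [{'title': 'Item A', 'url': '/a'}, {'title': 'Item B', 'url': '/b'}]
```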
Graph: parsing flow
flowchart LR
A[HTML Text] --> B[BeautifulSoup]
B --> C[Search Tags]
C --> D[Extract Data]
Remember
- BeautifulSoup builds a tree from HTML
- Use find for the first match and find_all for every match
- .text returns a tag's text content
- tag["attr"] returns the value of an attribute
#Python #Beginner #BeautifulSoup