Web Scraping

Storing Scraped Data

Learn how to store scraped data properly in CSV, JSON, and databases so your scraping work becomes useful for analysis and applications.

David Miller
December 21, 2025

Scraping is useless if you don’t store the data properly.

The real goal:
- collect
- clean
- store
- analyze

Common storage options:
1) CSV → spreadsheets, simple data
2) JSON → APIs, nested data
3) Database → large or long-term data

---

1) Store to CSV

```python
import csv

rows = [
    ["Name", "Price"],
    ["Book A", "10"],
    ["Book B", "15"],
]

with open("products.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
```

Why CSV:
- easy to open in Excel
- simple format
- good for tables
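Scraped records usually come out of the parser as dicts rather than plain lists. A minimal sketch of writing dicts with `csv.DictWriter` (the filename and field names here are illustrative, not from the article):

```python
import csv

# Hypothetical scraped records as dicts
products = [
    {"name": "Book A", "price": "10"},
    {"name": "Book B", "price": "15"},
]

with open("products_dict.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "price"])
    writer.writeheader()        # write the header row once
    writer.writerows(products)  # each dict becomes one CSV row
```

`DictWriter` keeps the column order fixed even if your scraper yields dict keys in a different order.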

---

2) Store to JSON

```python
import json

data = [
    {"name": "Book A", "price": 10},
    {"name": "Book B", "price": 15},
]

with open("products.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)
```

Why JSON:
- keeps structure
- good for nested data
- APIs use it

---

3) Store in SQLite database

```python
import sqlite3

conn = sqlite3.connect("data.db")
cur = conn.cursor()

cur.execute("""
CREATE TABLE IF NOT EXISTS products (
    name TEXT,
    price REAL
)
""")

cur.execute("INSERT INTO products VALUES (?, ?)", ("Book A", 10))
cur.execute("INSERT INTO products VALUES (?, ?)", ("Book B", 15))

conn.commit()
conn.close()
```

Why a database:
- handles large data
- fast queries
- can reject duplicate rows (with a UNIQUE constraint)
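The table above accepts duplicates as written; deduplication needs a constraint. A sketch of that pattern with a `UNIQUE` column plus `INSERT OR IGNORE` (using an in-memory database so the example is self-contained):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory DB, just for the sketch
cur = conn.cursor()

# UNIQUE on name lets SQLite reject repeated products
cur.execute("""
CREATE TABLE products (
    name TEXT UNIQUE,
    price REAL
)
""")

rows = [("Book A", 10), ("Book B", 15), ("Book A", 10)]  # duplicate on purpose
cur.executemany("INSERT OR IGNORE INTO products VALUES (?, ?)", rows)
conn.commit()

cur.execute("SELECT COUNT(*) FROM products")
count = cur.fetchone()[0]  # the duplicate "Book A" was silently skipped
```

Re-running a scraper against the same pages then becomes safe: existing rows are ignored instead of duplicated.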

---

Graph: data pipeline

```mermaid
flowchart LR
    A[Scraper] --> B[Parsed Data]
    B --> C[CSV]
    B --> D[JSON]
    B --> E[Database]
```
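The pipeline in the diagram can be sketched end to end: the same parsed records fanned out to all three sinks. Filenames and field names here are assumptions for illustration:

```python
import csv
import json
import sqlite3

# Illustrative parsed records from a scraper
records = [
    {"name": "Book A", "price": 10},
    {"name": "Book B", "price": 15},
]

# CSV sink
with open("pipeline.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(records)

# JSON sink
with open("pipeline.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)

# Database sink (named placeholders map straight from the dicts)
conn = sqlite3.connect("pipeline.db")
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")
cur.executemany("INSERT INTO products VALUES (:name, :price)", records)
conn.commit()
conn.close()
```

In a real project you would pick one sink per use case rather than writing all three.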

Remember:
- Always save data in structured form
- CSV for simple tables
- JSON for nested records
- DB for serious projects

#Python #Advanced #Storage