
Store Scraped Data in a Database

Learn to design tables and store scraped data directly in a database with SQLite and PostgreSQL, so your scraping becomes production-ready.

David Miller
December 15, 2025

In real projects, scraped data is stored in databases.

Why databases:

  • handle large volumes of data
  • fast search and filtering
  • avoid duplicates
  • multi-user access
  • long-term storage

This lesson teaches:

  • table design
  • insert/update logic
  • duplicate handling

Example use case

Scrape products:

  • name
  • price
  • url
  • scraped_at

Step 1: Design table

CREATE TABLE IF NOT EXISTS products (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT,
  price REAL,
  url TEXT UNIQUE,
  scraped_at TEXT
);

Key idea:

  • url is UNIQUE, so the database itself rejects duplicate rows.

Step 2: Connect to SQLite

import sqlite3

# One connection and cursor for the whole scraping run;
# the database file is created automatically if it does not exist.
conn = sqlite3.connect("scraped.db")
cur = conn.cursor()

Step 3: Insert scraped item

from datetime import datetime, timezone

def save_product(name, price, url):
    # INSERT OR IGNORE silently skips any row whose url already exists,
    # thanks to the UNIQUE constraint on url from Step 1.
    cur.execute("""
        INSERT OR IGNORE INTO products (name, price, url, scraped_at)
        VALUES (?, ?, ?, ?)
    """, (name, price, url, datetime.now(timezone.utc).isoformat()))
    conn.commit()
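
If a later scrape should refresh the price instead of skipping the row, SQLite's upsert syntax (ON CONFLICT ... DO UPDATE, available since SQLite 3.24) covers that. A minimal sketch, reusing the same cur and conn from above:

def upsert_product(name, price, url):
    # On a duplicate url, refresh price and scraped_at instead of skipping.
    cur.execute("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(url) DO UPDATE SET
            price = excluded.price,
            scraped_at = excluded.scraped_at
    """, (name, price, url, datetime.now(timezone.utc).isoformat()))
    conn.commit()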

Step 4: Use in scraper

product = {"name": "Book A", "price": 10, "url": "http://x.com/a"}
save_product(product["name"], product["price"], product["url"])
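
In a real scraper you would call save_product once per item, right after parsing it. A short sketch; the list of dicts here is hypothetical stand-in data for whatever your parser actually produces:

# Stand-in for parsed results; in practice these come from your parser.
scraped_items = [
    {"name": "Book A", "price": 10, "url": "http://x.com/a"},
    {"name": "Book B", "price": 12, "url": "http://x.com/b"},
]

for item in scraped_items:
    # Store each item as soon as it is parsed, not in one batch at the end.
    save_product(item["name"], item["price"], item["url"])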

Graph: scraper to database

flowchart LR
  A[Scraper] --> B[Parsed Data]
  B --> C[SQL Insert]
  C --> D[Database]

PostgreSQL note

For larger systems, use PostgreSQL with psycopg2 or asyncpg.
The logic stays the same: connect → insert → commit.
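
A minimal sketch of the same pattern with psycopg2; the connection details are placeholder assumptions, and PostgreSQL uses %s placeholders plus ON CONFLICT instead of INSERT OR IGNORE:

import psycopg2
from datetime import datetime, timezone

# Connection details are placeholders; substitute your own credentials.
conn = psycopg2.connect(
    dbname="scraped", user="scraper", password="secret", host="localhost"
)
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id SERIAL PRIMARY KEY,
        name TEXT,
        price NUMERIC,
        url TEXT UNIQUE,
        scraped_at TIMESTAMPTZ
    )
""")
conn.commit()

def save_product(name, price, url):
    # ON CONFLICT (url) DO NOTHING is PostgreSQL's take on INSERT OR IGNORE.
    cur.execute("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (%s, %s, %s, %s)
        ON CONFLICT (url) DO NOTHING
    """, (name, price, url, datetime.now(timezone.utc)))
    conn.commit()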


Remember

  • Always design the schema first
  • Use UNIQUE keys to avoid duplicates
  • Insert as you scrape, not in one batch at the end
#Python #Advanced #Database