Store Scraped Data in a Database
Learn to design tables and store scraped data directly in a database using SQLite and PostgreSQL, so your scraping becomes production-ready.
In real projects, scraped data is stored in databases.
Why databases:

- handle large volumes of data
- fast search and filtering
- duplicate avoidance
- multi-user access
- long-term storage
This lesson teaches:

- table design
- insert/update logic
- duplicate handling
---
Example use case

Scrape products with these fields:

- name
- price
- url
- scraped_at
---
Step 1: Design table
```sql
CREATE TABLE IF NOT EXISTS products (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    price REAL,
    url TEXT UNIQUE,
    scraped_at TEXT
);
```
Key idea: url is UNIQUE, so the same page can never be stored twice.
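To see the constraint in action: a plain INSERT of a duplicate url raises an error, which is exactly what INSERT OR IGNORE (Step 3) sidesteps. A small self-contained sketch using a throwaway in-memory database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory DB for the demo
conn.execute("CREATE TABLE products (url TEXT UNIQUE)")
conn.execute("INSERT INTO products (url) VALUES ('http://x.com/a')")
try:
    # Second insert of the same url violates the UNIQUE constraint.
    conn.execute("INSERT INTO products (url) VALUES ('http://x.com/a')")
except sqlite3.IntegrityError as e:
    print("duplicate rejected:", e)  # UNIQUE constraint failed: products.url
```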
---
Step 2: Connect to SQLite
```python
import sqlite3

conn = sqlite3.connect("scraped.db")
cur = conn.cursor()
```
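The table from Step 1 must exist before any insert succeeds. A minimal sketch that runs the schema at startup, right after connecting (IF NOT EXISTS makes it safe to run on every start):

```python
# Run the Step 1 schema once at startup; IF NOT EXISTS makes it idempotent.
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT,
        price REAL,
        url TEXT UNIQUE,
        scraped_at TEXT
    )
""")
conn.commit()
```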
---
Step 3: Insert scraped item
```python
from datetime import datetime, timezone

def save_product(name, price, url):
    # INSERT OR IGNORE skips any row whose url already exists (UNIQUE constraint).
    cur.execute("""
        INSERT OR IGNORE INTO products (name, price, url, scraped_at)
        VALUES (?, ?, ?, ?)
    """, (name, price, url, datetime.now(timezone.utc).isoformat()))
    conn.commit()
```
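INSERT OR IGNORE keeps the first version of a row forever. If re-scrapes should refresh the price instead, SQLite (3.24+) also supports an upsert via ON CONFLICT. A sketch of a hypothetical save_or_update variant, reusing cur, conn, and the datetime import from above:

```python
def save_or_update(name, price, url):
    # On a url collision, overwrite price and scraped_at instead of skipping.
    cur.execute("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (?, ?, ?, ?)
        ON CONFLICT(url) DO UPDATE SET
            price = excluded.price,
            scraped_at = excluded.scraped_at
    """, (name, price, url, datetime.now(timezone.utc).isoformat()))
    conn.commit()
```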
---
Step 4: Use in scraper
```python
product = {"name": "Book A", "price": 10, "url": "http://x.com/a"}
save_product(product["name"], product["price"], product["url"])
```
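When a page yields many items, one statement and one commit per batch is cheaper than one per row. A sketch using executemany, assuming a hypothetical list of scraped product dicts and reusing cur, conn, and the datetime import from Step 3:

```python
products = [
    {"name": "Book A", "price": 10, "url": "http://x.com/a"},
    {"name": "Book B", "price": 12, "url": "http://x.com/b"},
]

now = datetime.now(timezone.utc).isoformat()
rows = [(p["name"], p["price"], p["url"], now) for p in products]

# One statement, one commit for the whole batch; duplicate urls are still skipped.
cur.executemany("""
    INSERT OR IGNORE INTO products (name, price, url, scraped_at)
    VALUES (?, ?, ?, ?)
""", rows)
conn.commit()
```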
---
Graph: scraper to database
```mermaid
flowchart LR
    A[Scraper] --> B[Parsed Data]
    B --> C[SQL Insert]
    C --> D[Database]
```
---
PostgreSQL note

For large systems, use psycopg2 or asyncpg. The logic stays the same: connect → insert → commit.
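A minimal psycopg2 sketch, assuming a local PostgreSQL database named scraped with placeholder credentials; note the %s placeholders and ON CONFLICT DO NOTHING, which replace SQLite's ? and INSERT OR IGNORE:

```python
import psycopg2
from datetime import datetime, timezone

# Placeholder connection parameters; adjust for your environment.
conn = psycopg2.connect(
    dbname="scraped", user="scraper", password="secret", host="localhost"
)
cur = conn.cursor()

# Same schema idea as Step 1; SERIAL replaces SQLite's AUTOINCREMENT.
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id SERIAL PRIMARY KEY,
        name TEXT,
        price REAL,
        url TEXT UNIQUE,
        scraped_at TIMESTAMPTZ
    )
""")
conn.commit()

def save_product(name, price, url):
    # ON CONFLICT (url) DO NOTHING skips duplicates, like INSERT OR IGNORE.
    cur.execute("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (%s, %s, %s, %s)
        ON CONFLICT (url) DO NOTHING
    """, (name, price, url, datetime.now(timezone.utc)))
    conn.commit()
```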
---