Handling Updates and Duplicates
Learn how to update existing rows, avoid duplicates, and keep your scraped database fresh using upserts.
Scraping jobs often run repeatedly, daily or hourly. Each run must therefore:
- avoid inserting the same rows again
- update prices or other fields when they have changed
This pattern is called an upsert (update or insert).
---
Problem: the same product appears again, but with a new price.
---
SQLite upsert example
```python
def upsert_product(name, price, url):
    # ON CONFLICT ... DO UPDATE requires SQLite 3.24+ and a UNIQUE
    # constraint on the conflict target column (url).
    cur.execute("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (?, ?, ?, datetime('now'))
        ON CONFLICT(url) DO UPDATE SET
            name = excluded.name,
            price = excluded.price,
            scraped_at = excluded.scraped_at
    """, (name, price, url))
    conn.commit()
```
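The `ON CONFLICT(url)` clause only fires if `url` carries a uniqueness constraint; without one, every run inserts duplicates. A minimal setup sketch for the `conn` and `cur` objects the function above assumes (column types are an assumption, adjust to your data):

```python
import sqlite3

# In-memory database for the sketch; use a file path in real runs.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# UNIQUE on url is what makes ON CONFLICT(url) resolve to an update.
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER PRIMARY KEY,
        name TEXT,
        price REAL,
        url TEXT UNIQUE,
        scraped_at TEXT
    )
""")
conn.commit()
```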
---
Use in loop
```python
for p in products:
    upsert_product(p["name"], p["price"], p["url"])
```
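Committing once per row gets slow on large scrapes. A batched variant using `executemany` in a single transaction is usually much faster; this is a sketch assuming the same `products` table with a `UNIQUE` constraint on `url`, and items shaped like the dicts in the loop above:

```python
def upsert_many(conn, items):
    """Upsert a batch of {'name', 'price', 'url'} dicts in one transaction."""
    # Named placeholders (:name, :price, :url) bind directly from each dict.
    conn.executemany("""
        INSERT INTO products (name, price, url, scraped_at)
        VALUES (:name, :price, :url, datetime('now'))
        ON CONFLICT(url) DO UPDATE SET
            name = excluded.name,
            price = excluded.price,
            scraped_at = excluded.scraped_at
    """, items)
    conn.commit()
```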
---
Why this matters
- keeps data fresh
- no duplicates
- safe to run repeatedly
---
Graph: upsert logic
```mermaid
flowchart TD
    A[New Item] --> B{URL exists?}
    B -->|No| C[Insert]
    B -->|Yes| D[Update]
```
---