
Saving Scraped Data

Learn how to store scraped data properly in CSV and JSON files so it can be reused for analysis, reports, or databases.

David Miller
November 30, 2025

Scraping is useless if you don't save the data.

Your goal is not just to print results, but to:

  • store them
  • reuse them later
  • analyze them
  • share them

Why saving matters

If you scrape:

  • today: 100 products
  • tomorrow: the prices change

You want a history, not just the latest snapshot.
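
One simple way to keep that history is to append each run to a dated file. Here is a minimal sketch (the rows and the price_history.csv file name are just placeholders):

import csv
from datetime import date

# Hypothetical rows scraped today (name, price)
rows = [["Apple", 120], ["Banana", 80]]

# Append ("a" mode) to a running history file, tagging each row with today's date
with open("price_history.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    for name, price in rows:
        writer.writerow([date.today().isoformat(), name, price])

Run this once a day and the file grows into a simple price history you can analyze later.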

Common formats

  • CSV: simple tables (Excel friendly)
  • JSON: structured/nested data (APIs, configs)

Save to CSV

CSV stands for comma-separated values: one record per line, with fields separated by commas.

import csv

# First row is the header, the rest are data rows
data = [
  ["Name", "Price"],
  ["Apple", 120],
  ["Banana", 80],
]

# newline="" avoids blank lines on Windows; utf-8 keeps special characters intact
with open("products.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(data)

You can open this file in Excel.
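
You can also read it back into Python later with the same csv module. A quick sketch, assuming the products.csv written above:

import csv

# Read the file back in; note that csv returns every value as a string
with open("products.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

print(rows)  # [['Name', 'Price'], ['Apple', '120'], ['Banana', '80']]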


Save scraped data to CSV

import requests, csv
from bs4 import BeautifulSoup

res = requests.get("https://example.com")
soup = BeautifulSoup(res.text, "html.parser")

# Header row first, then one row per link on the page
rows = []
rows.append(["Text", "Link"])

for a in soup.find_all("a"):
    rows.append([a.text.strip(), a.get("href")])  # .get() returns None if href is missing

with open("links.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
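
Many pages use relative links like /about. If you want absolute URLs in the file, the loop above could be replaced with a sketch like this, using urllib.parse.urljoin and the same base URL:

from urllib.parse import urljoin

base_url = "https://example.com"

for a in soup.find_all("a"):
    href = a.get("href") or ""  # href may be missing entirely
    rows.append([a.text.strip(), urljoin(base_url, href)])  # "/about" -> "https://example.com/about"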

Save to JSON

JSON is great for structured or nested data, such as lists of dictionaries.

import json

data = [
  {"name": "Apple", "price": 120},
  {"name": "Banana", "price": 80}
]

# indent=2 makes the file human-readable; ensure_ascii=False keeps non-ASCII characters as-is
with open("products.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)
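
Because JSON handles nesting, you can also store richer records, for example a product that keeps its own price history. The field names here are just an illustration:

import json

# Nested records: each product carries a list of dated prices
data = [
    {"name": "Apple", "prices": [{"date": "2025-11-29", "price": 118},
                                 {"date": "2025-11-30", "price": 120}]},
]

with open("products_nested.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)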

Graph: scraping to storage

flowchart LR
  A[Scraper] --> B[Extracted Data]
  B --> C[CSV File]
  B --> D[JSON File]
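
Putting it together, a small helper (the names here are illustrative, not a fixed API) can take extracted records as dictionaries and write both files:

import csv, json

def save_all(records, csv_path="output.csv", json_path="output.json"):
    # records: list of dicts with the same keys, e.g. [{"name": "Apple", "price": 120}]
    if not records:
        return
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)

save_all([{"name": "Apple", "price": 120}, {"name": "Banana", "price": 80}])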

Remember

  • Always save data, don't just print it
  • CSV for simple tables
  • JSON for structured or nested data
  • Use utf-8 encoding
#Python #Beginner #Data Storage