
Ethics and robots.txt

Learn responsible scraping: robots.txt, rate limiting, and how to avoid harming websites or getting blocked.

David Miller
December 6, 2025

Scraping is powerful, but it has to be done responsibly.

What is robots.txt?

Most sites publish a robots.txt file at the root of the domain:
site.com/robots.txt

It tells bots:

  • which paths are allowed
  • which are disallowed

Example:

User-agent: *
Disallow: /admin

This means: no bot may crawl anything under /admin (the * applies the rule to every user agent). You can verify a path programmatically, as the sketch below shows.
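
Python's standard library includes urllib.robotparser for exactly this check. A minimal sketch, assuming the placeholder site.com domain from the example above:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://site.com/robots.txt")
rp.read()  # fetch and parse the file

# can_fetch() applies the parsed rules for a given user agent and URL
print(rp.can_fetch("MyScraper", "https://site.com/admin"))  # False
print(rp.can_fetch("MyScraper", "https://site.com/blog"))   # True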

Why respect it

  • signals good-faith behavior to site operators
  • reduces the risk of legal trouble
  • lowers the chance of IP bans

Rate limiting

Do not hit servers in a tight loop; space requests out with a delay.

import requests
import time

urls = ["https://site.com/blog", "https://site.com/about"]  # pages to fetch

for url in urls:
    requests.get(url)
    time.sleep(2)  # pause 2 seconds between requests
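
Some sites declare their preferred pace with a Crawl-delay line in robots.txt, which urllib.robotparser exposes through crawl_delay(). A sketch, reusing the placeholder domain and falling back to 2 seconds when no delay is declared:

import time
import requests
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://site.com/robots.txt")
rp.read()

# Honor the site's declared Crawl-delay, defaulting to 2 seconds
delay = rp.crawl_delay("MyScraper") or 2

for url in ["https://site.com/blog", "https://site.com/about"]:
    requests.get(url)
    time.sleep(delay)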

Identify yourself

Set a descriptive User-Agent header so site operators can see who is crawling and how to reach you:

headers = {
    "User-Agent": "MyScraper/1.0 (contact@example.com)"
}
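
Pass the dict on each call; requests then sends your string in place of its default identifier. A short sketch, using the same placeholder URL:

import requests

headers = {"User-Agent": "MyScraper/1.0 (contact@example.com)"}

# The custom header replaces requests' default User-Agent
response = requests.get("https://site.com/blog", headers=headers)
print(response.status_code)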

Graph: ethical scraping

flowchart TD
  A[Plan scrape] --> B[Check robots.txt]
  B --> C[Add delays]
  C --> D[Scrape responsibly]

Remember

  • Always check robots.txt first
  • Add delays between requests
  • Identify yourself with a User-Agent
  • Do not overload servers
  • Scrape only public data
#Python #Beginner #Ethics