Data Structures18 min read

Hashing and Lookups

Understand how dict and set feel 'instant': hashing basics, collisions, and practical rules for safe keys and good performance.

David Miller
November 5, 2025
2.4k113

dict and set are fast because they use hashing.

What is hashing?

A hash function converts a value into a number (hash).
That hash helps decide where the item belongs.

Basic idea (not math-heavy)

  • compute hash(key)
  • go to bucket
  • store/retrieve quickly

Example: dict lookup

users = {"tom": 101, "sarah": 102}
print(users["tom"])

This feels instant even if dict is large (average case).

Collisions (simple)

Sometimes two keys produce same bucket.
Python handles this internally.

Keys must be hashable

Hashable means: value does not change.
Good keys:

  • str, int, tuple (if inner values are hashable)

Bad keys:

  • list, dict, set (mutable)
good = {("a", 1): "ok"}  # tuple key works
# bad = {[1,2]: "no"}     # list key fails

Graph: hashing mental model

flowchart LR
  A[key] --> B[hash()]
  B --> C[bucket]
  C --> D[value stored]

Remember

  • dict/set are fast due to hashing (average case)
  • keys must be immutable (hashable)
  • collisions exist but Python manages them
#Python#Intermediate#Hashing