Data Structures18 min read
Hashing and Lookups
Understand how dict and set feel 'instant': hashing basics, collisions, and practical rules for safe keys and good performance.
David Miller
November 5, 2025
2.4k113
dict and set are fast because they use hashing.
What is hashing?
A hash function converts a value into a number (hash).
That hash helps decide where the item belongs.
Basic idea (not math-heavy)
- compute hash(key)
- go to bucket
- store/retrieve quickly
Example: dict lookup
users = {"tom": 101, "sarah": 102}
print(users["tom"])
This feels instant even if dict is large (average case).
Collisions (simple)
Sometimes two keys produce same bucket.
Python handles this internally.
Keys must be hashable
Hashable means: value does not change.
Good keys:
- str, int, tuple (if inner values are hashable)
Bad keys:
- list, dict, set (mutable)
good = {("a", 1): "ok"} # tuple key works
# bad = {[1,2]: "no"} # list key fails
Graph: hashing mental model
flowchart LR
A[key] --> B[hash()]
B --> C[bucket]
C --> D[value stored]
Remember
- dict/set are fast due to hashing (average case)
- keys must be immutable (hashable)
- collisions exist but Python manages them
#Python#Intermediate#Hashing