Python Performance Optimization
Write faster Python by learning where time is spent, how to measure it, and which optimizations give real wins (without making code messy).
David Miller
July 21, 2025
Performance optimization means: same output, less time and less memory.
But the golden rule is:
✅ **Measure first, optimize second.**
Why measure? Because the part you guess is slow is often not the part that is actually slow.
## What you will learn in this lesson
- Why Python code becomes slow
- How to find the slow lines (profiling)
- High-impact optimizations that keep code readable
- When you should use libraries like NumPy
- Common performance mistakes beginners make
## The performance mindset (simple)
Think of performance like this:
1) **Algorithm matters most** (O(n) vs O(n²))
2) **Data structures matter** (set membership vs list membership)
3) **Micro-optimizations matter last** (local variables, tiny tweaks)
## Step 1: Profile before optimizing (most important)
### Time a function quickly
```python
import time

def work():
    return sum(i * i for i in range(1000000))

start = time.time()
work()
end = time.time()
print("Seconds:", end - start)
```
### Use cProfile for real profiling
```python
import cProfile

def my_function():
    result = sum(i ** 2 for i in range(10000))
    return result

cProfile.run("my_function()")
```
This tells you:
- which function is slow
- how many calls happened
- how much time each part used
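The report is a table with one row per function: `ncalls` is the call count, `tottime` is the time spent inside the function itself, and `cumtime` includes everything it calls. You can also sort the report so the heaviest call chains come first; a minimal sketch (the numbers you see will differ):

```python
import cProfile

def my_function():
    return sum(i ** 2 for i in range(10000))

# Sort rows by cumulative time so the most expensive call chains appear at the top.
cProfile.run("my_function()", sort="cumulative")
```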
## Step 2: Prefer built-in functions (they are optimized in C)
### Example: sum is faster than manual loop
```python
# Slow
total = 0
for i in range(1000):
    total += i

# Fast
total = sum(range(1000))
```
Reason: built-ins run in optimized C code internally.
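If you want to confirm this on your own machine, the standard-library `timeit` module runs a snippet many times and reports the total; a quick sketch (exact numbers will vary):

```python
import timeit

def manual_sum():
    total = 0
    for i in range(1000):
        total += i
    return total

# Each version is executed 10,000 times; compare the totals.
loop_time = timeit.timeit(manual_sum, number=10_000)
builtin_time = timeit.timeit(lambda: sum(range(1000)), number=10_000)
print(f"manual loop: {loop_time:.3f}s   built-in sum: {builtin_time:.3f}s")
```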
## Step 3: List comprehension vs loop (often faster and cleaner)
```python
# Slower
squares = []
for i in range(1000):
    squares.append(i ** 2)

# Faster (and readable)
squares = [i ** 2 for i in range(1000)]
```
### But don’t overdo it
If logic becomes confusing, prefer a normal loop.
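For example, once a comprehension starts juggling filtering, cleaning, and conditions all on one line, the plain loop is usually easier to read. A hypothetical cleanup task (made-up data) showing the kind of case where the loop wins:

```python
data = ["  apple ", "", "Banana", None, " cherry"]

# Hard to read: filtering and cleaning crammed into one line
cleaned = [s.strip().lower() for s in data if s is not None and s.strip() != ""]

# Clearer as a normal loop, even though it is a few lines longer
cleaned = []
for s in data:
    if s is None:
        continue
    s = s.strip()
    if s:
        cleaned.append(s.lower())

print(cleaned)  # ['apple', 'banana', 'cherry']
```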
## Step 4: Use sets for membership checks
This is a very common real-world optimization.
```python
# Slow (searching takes O(n))
items_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(5 in items_list)
# Fast (searching takes ~O(1))
items_set = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
print(5 in items_set)
```
When you need “contains?” many times, sets can be a huge win.
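A typical real-world pattern: a large list plus many membership checks inside a loop. Building a set once up front pays for itself after only a few lookups; an illustrative sketch with made-up data:

```python
allowed_ids = list(range(100_000))        # large list
requests = [5, 99_999, 123_456, 42]       # IDs to check

# Slow: every "in" scans the whole list
hits = [r for r in requests if r in allowed_ids]

# Fast: build the set once, then each "in" is ~O(1)
allowed_set = set(allowed_ids)
hits = [r for r in requests if r in allowed_set]
print(hits)  # [5, 99999, 42]
```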
## Step 5: Avoid repeated lookups in tight loops
```python
import math
# Slow: attribute lookup each time
def slow():
    for i in range(1000):
        result = math.sqrt(i)

# Faster: local reference
def fast():
    sqrt = math.sqrt
    for i in range(1000):
        result = sqrt(i)
```
This is a **micro-optimization**; use it only in loops that profiling has shown to be hot.
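And like any micro-optimization, it only counts if you can measure it. A quick `timeit` comparison of the two versions above (results depend on your machine and Python version):

```python
import math
import timeit

def slow():
    for i in range(1000):
        result = math.sqrt(i)

def fast():
    sqrt = math.sqrt  # hoist the attribute lookup out of the loop
    for i in range(1000):
        result = sqrt(i)

print("slow:", timeit.timeit(slow, number=5_000))
print("fast:", timeit.timeit(fast, number=5_000))
```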
## Step 6: Choose the right algorithm (biggest wins)
Examples:
- Sorting once then binary searching can be faster than repeated linear searches.
- Using dict to map IDs is faster than scanning a list.
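A sketch of the second point with made-up data: looking up records by ID. Scanning a list is O(n) per lookup, while a dict built once gives ~O(1) lookups. (For the first point, the standard-library `bisect` module provides the binary search.)

```python
users = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Linus"},
    {"id": 3, "name": "Grace"},
]

# Slow: scan the whole list on every lookup
def find_user_scan(user_id):
    for user in users:
        if user["id"] == user_id:
            return user
    return None

# Fast: build the index once, then every lookup is ~O(1)
users_by_id = {user["id"]: user for user in users}

print(find_user_scan(3))     # {'id': 3, 'name': 'Grace'}
print(users_by_id.get(3))    # same result, much faster when repeated many times
```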
## Step 7: Use NumPy for heavy numeric work
Pure-Python loops are slow over large numeric arrays.
NumPy is much faster because its array operations run in optimized native code.
```python
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
result = arr * 2
print(result)
```
## Graph: performance decision guide
```mermaid
flowchart TD
A[Code is slow] --> B[Profile first]
B --> C{What is slow?}
C -->|Algorithm/data structure| D[Change approach: set/dict, better algorithm]
C -->|I/O waiting| E[Use async or batching]
C -->|Numeric heavy| F[Use NumPy / vectorization]
C -->|Small hotspot loop| G[Micro-optimizations]
```
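For the I/O branch: async lets you overlap waiting, while batching means grouping many small operations into one bigger call. A hedged sketch of batching using the standard-library `sqlite3` module (the same idea applies to network requests or bulk API calls):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, name TEXT)")
rows = [(i, f"event-{i}") for i in range(10_000)]

# Slow: one INSERT call per row, paying the per-call overhead every time
for row in rows:
    conn.execute("INSERT INTO events VALUES (?, ?)", row)

# Fast: hand the whole batch to the driver in a single call
# (demo only: this inserts the same rows a second time)
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
conn.commit()
```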
## Practical checklist (use in real projects)
- Measure with cProfile or simple timers
- Optimize algorithm first
- Replace list membership with set when needed
- Prefer built-ins and comprehensions
- Avoid unnecessary work inside loops
- Use NumPy for large numeric computations
## Remember
- Profile before optimizing
- Algorithm choices give the biggest wins
- Built-ins and sets are common easy wins
- Keep code readable and maintainable
#Python #Advanced #Performance