
Python Performance Optimization

Write faster Python by learning where time is spent, how to measure it, and which optimizations give real wins (without making code messy).

David Miller
July 21, 2025

Performance optimization means producing the same output while using less time, less memory, or both.

        But the golden rule is:
        ✅ **Measure first, optimize second.**  
        The part you guess is slow is often not the real bottleneck.
        
        ## What you will learn in this lesson
        
        - Why Python code becomes slow
        - How to find the slow lines (profiling)
        - High-impact optimizations that keep code readable
        - When you should use libraries like NumPy
        - Common performance mistakes beginners make
        
        ## The performance mindset (simple)
        
        Think of performance like this:
        
        1) **Algorithm matters most** (O(n) vs O(n²))  
        2) **Data structures matter** (set membership vs list membership)  
        3) **Micro-optimizations matter last** (local variables, tiny tweaks)
        
        ## Step 1: Profile before optimizing (most important)
        
        ### Time a function quickly
        ```python
        import time
        
        def work():
            return sum(i*i for i in range(1000000))
        
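        # perf_counter() is a monotonic, high-resolution clock intended for timing;
        # time.time() can jump if the system clock is adjusted.
        start = time.perf_counter()
        work()
        end = time.perf_counter()
        print("Seconds:", end - start)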
        ```
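
A single timed run can be noisy, because other programs on the machine affect it. Here is a minimal sketch using the standard-library `timeit` module, which repeats the call several times so you can take the best run. The repeat counts (10 calls, 5 trials) are arbitrary choices for illustration.

```python
import timeit

def work():
    return sum(i * i for i in range(100_000))

# Run work() 10 times per trial, for 5 trials.
# Each entry in `trials` is the total seconds for one trial of 10 calls.
trials = timeit.repeat(work, number=10, repeat=5)

# The fastest trial is the one least disturbed by other activity on the machine.
best_per_call = min(trials) / 10
print(f"Best time per call: {best_per_call:.6f} s")
```

Taking the minimum rather than the average is a common convention for this kind of quick benchmark.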
        
        ### Use cProfile for real profiling
        ```python
        import cProfile
        
        def my_function():
            result = sum(i ** 2 for i in range(10000))
            return result
        
        cProfile.run("my_function()")
        ```
        
        This tells you:
        - which function is slow
        - how many calls happened
        - how much time each part used
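
The default `cProfile.run` report can be long. One way to focus it, sketched below, is to collect the stats with `cProfile.Profile` and print only the most expensive entries with `pstats`. The helper `square_sum` and the choice of showing 5 rows are assumptions made for this example.

```python
import cProfile
import io
import pstats

def square_sum(n):
    return sum(i ** 2 for i in range(n))

def my_function():
    # Call the helper several times so it shows up clearly in the report.
    return [square_sum(10_000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()
my_function()
profiler.disable()

# Sort by cumulative time (time in a function plus everything it calls)
# and print only the 5 most expensive entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the report, `ncalls` is the call count, `tottime` is time spent in the function itself, and `cumtime` includes the functions it calls.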
        
        ## Step 2: Prefer built-in functions (they are optimized in C)
        
        ### Example: sum is faster than manual loop
        ```python
        # Slow
        total = 0
        for i in range(1000):
            total += i
        
        # Fast
        total = sum(range(1000))
        ```
        
        Reason: in CPython, built-ins like `sum` run their loop in compiled C, so they skip the per-iteration interpreter overhead of a Python `for` loop.
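
The same idea applies beyond `sum`. The sketch below pairs a few hand-written loops with a built-in that does the same job, using made-up sample data. Whether each one is actually faster depends on the data, so treat them as candidates to measure, not guaranteed wins.

```python
scores = [72, 95, 58, 88]            # sample data, made up for illustration
words = ["fast", "python", "code"]

# Largest value: manual loop vs max()
largest = scores[0]
for s in scores[1:]:
    if s > largest:
        largest = s
assert largest == max(scores)

# "Is any score below 60?": manual flag vs any()
found = False
for s in scores:
    if s < 60:
        found = True
        break
assert found == any(s < 60 for s in scores)

# Building a string: repeated += vs str.join()
joined = ""
for w in words:
    joined += w + " "
joined = joined.strip()
assert joined == " ".join(words)

print(max(scores), any(s < 60 for s in scores), " ".join(words))
```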
        
        ## Step 3: List comprehension vs loop (often faster and cleaner)
        
        ```python
        # Slower
        squares = []
        for i in range(1000):
            squares.append(i ** 2)
        
        # Faster (and readable)
        squares = [i ** 2 for i in range(1000)]
        ```
        
        ### But don’t overdo it
        If logic becomes confusing, prefer a normal loop.
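
To show where a comprehension stops helping, here is a small sketch. The `orders` data and its field names are hypothetical. Both versions produce the same list, but the loop gives each decision its own line.

```python
# Hypothetical order data, invented for this example.
orders = [
    {"id": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 0}]},
    {"id": 2, "items": [{"sku": "A", "qty": 1}]},
    {"id": 3, "items": []},
]

# Hard to read: two loops and two conditions squeezed into one expression.
dense = [
    (order["id"], item["sku"])
    for order in orders if order["items"]
    for item in order["items"] if item["qty"] > 0
]

# Same result as a plain loop; each step is easy to follow and to debug.
clear = []
for order in orders:
    for item in order["items"]:
        if item["qty"] <= 0:
            continue  # skip items with nothing ordered
        clear.append((order["id"], item["sku"]))

print(dense == clear)
```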
        
        ## Step 4: Use sets for membership checks
        
        This is a very common real-world optimization.
        
        ```python
        # Slow (searching takes O(n))
        items_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        print(5 in items_list)
        
        # Fast (searching takes ~O(1))
        items_set = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
        print(5 in items_set)
        ```
        
        When you run many “contains?” checks against the same collection, converting it to a set once can be a huge win.
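
A ten-element example is too small to show the difference, so here is a rough sketch with larger, made-up data. The names `allowed_ids` and `incoming` are hypothetical, and the sizes are picked only to make the gap visible.

```python
import timeit

allowed_ids = list(range(2_000))        # hypothetical allow-list
incoming = list(range(1_000, 3_000))    # hypothetical IDs to check

def filter_with_list():
    # Each `in` check scans the list from the start: O(n) per check.
    return [x for x in incoming if x in allowed_ids]

allowed_set = set(allowed_ids)          # one-time O(n) conversion

def filter_with_set():
    # Each `in` check is a hash lookup: about O(1) on average.
    return [x for x in incoming if x in allowed_set]

# Both approaches must agree before timing means anything.
assert filter_with_list() == filter_with_set()

print("list:", timeit.timeit(filter_with_list, number=1), "s")
print("set: ", timeit.timeit(filter_with_set, number=1), "s")
```

Building the set has its own cost, so the conversion pays off when the same set is checked many times.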
        
        ## Step 5: Avoid repeated lookups in tight loops
        
        ```python
        import math
        
        # Slow: attribute lookup each time
        def slow():
            for i in range(1000):
                result = math.sqrt(i)
        
        # Faster: local reference
        def fast():
            sqrt = math.sqrt
            for i in range(1000):
                result = sqrt(i)
        ```
        
        This is a **micro-optimization**; use it only in hot loops.
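
How much this trick helps depends on the Python version and on what else the loop does, so it is worth measuring before keeping it. A minimal sketch, with function names and loop sizes chosen only for this example:

```python
import math
import timeit

def with_attribute_lookup(n=100_000):
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)   # looks up `sqrt` on the module each iteration
    return total

def with_local_reference(n=100_000):
    sqrt = math.sqrt            # look the function up once, outside the loop
    total = 0.0
    for i in range(n):
        total += sqrt(i)
    return total

# Same arithmetic in the same order, so the results match exactly.
assert with_attribute_lookup() == with_local_reference()

for fn in (with_attribute_lookup, with_local_reference):
    best = min(timeit.repeat(fn, number=5, repeat=3))
    print(f"{fn.__name__}: {best:.4f} s")
```

If the two timings are close on your machine, the plain `math.sqrt(i)` version is the more readable choice.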
        
        ## Step 6: Choose the right algorithm (biggest wins)
        
        Examples:
        - Sorting once and then binary searching can be faster than repeated linear searches.
        - Looking up records in a dict keyed by ID is faster than scanning a list for every lookup.
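
Both ideas can be sketched with the standard library. The helper `contains_sorted` and the sample `users` records are hypothetical names invented for this example.

```python
from bisect import bisect_left

def contains_sorted(sorted_values, target):
    """Binary search: O(log n) per lookup on an already sorted list."""
    i = bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target

values = [42, 7, 19, 3, 88, 61]
sorted_values = sorted(values)               # pay O(n log n) once
print(contains_sorted(sorted_values, 19))    # True
print(contains_sorted(sorted_values, 20))    # False

# Hypothetical user records: build the dict once, then each lookup is about O(1).
users = [
    {"id": 101, "name": "Ana"},
    {"id": 102, "name": "Ben"},
    {"id": 103, "name": "Chen"},
]
users_by_id = {user["id"]: user for user in users}
print(users_by_id[102]["name"])              # Ben
```

Sorting or building the dict costs time up front, so these pay off when you do many lookups on the same data.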
        
        ## Step 7: Use NumPy for heavy numeric work
        
        Pure-Python loops are slow over large numeric arrays because every element goes through the interpreter.
        NumPy is much faster for this work because whole-array operations run in compiled native code.
        
        ```python
        import numpy as np
        
        arr = np.array([1, 2, 3, 4, 5])
        result = arr * 2
        print(result)
        ```
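
A five-element array does not show the benefit, so here is a rough sketch comparing an element-by-element loop with a vectorized expression on a larger array. The function names and the array size are assumptions for this example, and NumPy must be installed.

```python
import timeit

import numpy as np

values = np.arange(100_000, dtype=np.float64)

def loop_sum_of_squares(arr):
    # One interpreter iteration per element.
    total = 0.0
    for x in arr:
        total += x * x
    return total

def vectorized_sum_of_squares(arr):
    # The multiply and the sum both run in compiled code over the whole array.
    return float(np.sum(arr * arr))

# The two methods add in a different order, so compare with a tolerance.
assert np.isclose(loop_sum_of_squares(values), vectorized_sum_of_squares(values))

print("loop:      ", timeit.timeit(lambda: loop_sum_of_squares(values), number=3), "s")
print("vectorized:", timeit.timeit(lambda: vectorized_sum_of_squares(values), number=3), "s")
```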
        
        ## Graph: performance decision guide
        
        ```mermaid
        flowchart TD
          A[Code is slow] --> B[Profile first]
          B --> C{What is slow?}
          C -->|Algorithm/data structure| D[Change approach: set/dict, better algorithm]
          C -->|I/O waiting| E[Use async or batching]
          C -->|Numeric heavy| F[Use NumPy / vectorization]
          C -->|Small hotspot loop| G[Micro-optimizations]
        ```
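
The "I/O waiting" branch can be sketched with the standard-library `asyncio` module. Here `asyncio.sleep` stands in for a real network or disk wait, and `fake_request` is a hypothetical function invented for this example.

```python
import asyncio
import time

async def fake_request(name, delay=0.1):
    # asyncio.sleep simulates waiting on I/O (assumption: no real network call).
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_sequential(names):
    # Each wait finishes before the next one starts.
    return [await fake_request(n) for n in names]

async def run_concurrent(names):
    # gather starts all the waits together, so they overlap.
    return await asyncio.gather(*(fake_request(n) for n in names))

names = ["a", "b", "c", "d", "e"]

start = time.perf_counter()
sequential = asyncio.run(run_sequential(names))
print("sequential:", round(time.perf_counter() - start, 2), "s")

start = time.perf_counter()
concurrent = asyncio.run(run_concurrent(names))
print("concurrent:", round(time.perf_counter() - start, 2), "s")
```

This only helps when the program is waiting, not computing. CPU-heavy work belongs in the other branches of the chart.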
        
        ## Practical checklist (use in real projects)
        
        - Measure with cProfile or simple timers
        - Optimize algorithm first
        - Replace list membership with set when needed
        - Prefer built-ins and comprehensions
        - Avoid unnecessary work inside loops
        - Use NumPy for large numeric computations
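
As an illustration of "unnecessary work inside loops", the sketch below moves a value that never changes out of the loop. The function names are made up for this example.

```python
def normalize_slow(values):
    # max(values) rescans the whole list for every element: O(n^2) overall.
    return [v / max(values) for v in values]

def normalize_fast(values):
    peak = max(values)          # computed once: O(n) overall
    return [v / peak for v in values]

data = [1, 2, 4]
print(normalize_slow(data))
print(normalize_fast(data))
```

Both functions return the same list; only the amount of repeated work differs.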
        
        ## Remember
        
        - Profile before optimizing
        - Algorithm choices give the biggest wins
        - Built-ins and sets are common easy wins
        - Keep code readable and maintainable
        