Python - Python Bytecode and Disassembly

Python is usually described as an interpreted language, but the reference implementation, CPython, does not execute your source code directly. Instead, it first compiles your .py source into an intermediate form called bytecode, which is then executed by the Python Virtual Machine (PVM). Understanding bytecode and how it works can help in debugging, performance optimization, and gaining deeper insight into Python’s execution model.


1. What is Python Bytecode

Bytecode is a low-level, platform-independent representation of your Python program. When you run a Python script, the interpreter performs the following steps:

  1. Parses the source code

  2. Compiles it into bytecode

  3. Executes the bytecode using the Python Virtual Machine

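For imported modules, CPython caches this bytecode in .pyc files inside the __pycache__ directory; a script that you run directly is compiled in memory and is not cached. Bytecode is not machine code but a set of instructions designed for the Python interpreter.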
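
The steps above can be reproduced by hand with the built-in compile() and exec() functions. The following is a minimal sketch; the source string, the "<demo>" filename, and the variable names are illustrative assumptions rather than part of any particular program.

```python
# Compile a source string to a code object, then execute it.
source = "result = 6 * 7"

# Steps 1 and 2: parse the source and compile it into bytecode.
# "<demo>" is a placeholder filename that only appears in tracebacks.
code = compile(source, "<demo>", "exec")

print(type(code).__name__)   # the compiled form is a code object
print(type(code.co_code))    # the raw bytecode is a bytes object

# Step 3: run the bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)
print(namespace["result"])
```

Nothing is written to disk here: the code object lives only in memory, which is also what happens when a script is run directly.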


2. Role of the Python Virtual Machine (PVM)

The PVM is responsible for executing bytecode. It reads one bytecode instruction at a time and performs the corresponding operation. This extra layer of interpretation is one reason Python is generally slower than languages compiled ahead of time to machine code, such as C; dynamic typing and runtime name lookups add further overhead.
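
To make the idea of reading and dispatching one instruction at a time concrete, here is a toy interpreter loop. It is only a sketch: the instruction names LOAD, ADD, and RETURN and the run function are invented for illustration and are not CPython's actual instruction set or evaluation loop.

```python
# A toy stack machine. This is NOT CPython's real instruction set or loop;
# the opcode names below are hypothetical and chosen for readability.
def run(program, local_vars):
    stack = []
    for opname, arg in program:          # fetch one instruction at a time
        if opname == "LOAD":             # push a local variable's value
            stack.append(local_vars[arg])
        elif opname == "ADD":            # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "RETURN":         # hand back the top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown instruction: {opname}")
    raise RuntimeError("program ended without RETURN")

program = [("LOAD", "a"), ("LOAD", "b"), ("ADD", None), ("RETURN", None)]
print(run(program, {"a": 2, "b": 3}))
```

Every instruction passes through the dispatch chain before any work is done, which illustrates where the interpretation overhead comes from.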


3. Disassembly Using the dis Module

Python provides a built-in module called dis that allows you to inspect the bytecode generated from Python source code.

Example:

import dis

def add(a, b):
    return a + b

dis.dis(add)

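Output (simplified, as produced by CPython 3.10 and earlier; on 3.11 and later BINARY_ADD appears as BINARY_OP and extra instructions such as RESUME are shown):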

  2           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 RETURN_VALUE

4. Understanding Bytecode Instructions

Each line in the disassembled output represents a bytecode instruction:

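  • LOAD_FAST: Pushes the value of a local variable onto the stack

  • BINARY_ADD: Pops the top two stack values, adds them, and pushes the result (emitted as BINARY_OP with a + argument on CPython 3.11 and later)

  • RETURN_VALUE: Pops the top of the stack and returns it to the caller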

CPython uses a stack-based execution model: instructions take their operands from an evaluation stack and push their results back onto it, rather than naming registers.
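
The same instructions can also be read programmatically with dis.get_instructions, which yields one Instruction object per bytecode instruction. The sketch below redefines the add function from the earlier example so that it is self-contained; the exact instruction names it prints depend on the interpreter version.

```python
import dis

def add(a, b):
    return a + b

# Collect the name of each instruction in the order it appears.
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
```

Reading the list from left to right follows the stack discipline described above: operands are loaded first, the binary operation consumes them, and the return instruction consumes the result.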


5. Code Objects and Compilation

When Python compiles code, it creates a code object that contains:

  • Bytecode instructions

  • Constants used in the function

  • Variable names

  • Metadata such as filename and line numbers

You can inspect the raw bytecode, which is a bytes object, on the add function defined earlier:

print(add.__code__.co_code)

Other useful attributes:

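  • co_consts: Constants referenced by the bytecode

  • co_names: Names of globals and attributes used by the bytecode

  • co_varnames: Names of local variables, starting with the function's arguments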
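
As a short illustration of these attributes, the sketch below inspects a small function. The function scaled_len and its contents are hypothetical and exist only to give each attribute something to show.

```python
# scaled_len is a made-up function used only for inspection.
def scaled_len(items, factor):
    total = len(items) * factor
    return total + 1000

code = scaled_len.__code__

print(code.co_varnames)   # local variable names, arguments first
print(code.co_names)      # global and attribute names, here the built-in len
print(code.co_consts)     # constants referenced by the bytecode
print(code.co_argcount)   # number of positional arguments
print(code.co_filename, code.co_firstlineno)  # metadata for tracebacks
```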


6. Why Bytecode Matters

Understanding bytecode is useful for:

Performance Optimization
You can identify inefficient operations and understand how Python executes them internally.

Debugging
Bytecode helps you trace unexpected behavior at a lower level than the source code shows.

Security and Reverse Engineering
You can analyze .pyc files to understand compiled Python programs when the source is unavailable.

Understanding Language Behavior
Bytecode provides insight into how Python handles loops, function calls, and variable scope.


7. Bytecode Optimization

Python performs some optimizations during compilation, such as:

  • Constant folding (e.g., replacing 2 + 3 with 5)

  • Removing unreachable code

  • Efficient handling of variable access (e.g., function locals are read by index with LOAD_FAST instead of a dictionary lookup by name)

However, Python does not perform heavy optimization like a traditional compiler.
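
Constant folding can be observed directly by compiling a small expression and looking at the resulting code object. This is a sketch under assumptions: the variable name seconds_per_day and the "<demo>" filename are illustrative, and a larger product is used instead of 2 + 3 so that the folded value is easy to spot among the constants.

```python
import dis

# Compile a statement whose right-hand side is built only from constants.
code = compile("seconds_per_day = 60 * 60 * 24", "<demo>", "exec")

# The folded result is stored as a constant of the code object.
print(code.co_consts)

# No multiplication instruction remains in the compiled bytecode.
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)
```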


8. Differences Across Python Versions

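Bytecode is not stable across Python versions. Opcodes are added, removed, and renumbered between releases, so the same code can produce different bytecode instructions depending on the interpreter version. This is why .pyc files are version-specific: each one starts with a magic number identifying the bytecode version, and its filename carries the interpreter tag (for example, add.cpython-312.pyc).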
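
The version-specific details can be queried from the running interpreter. The sketch below assumes CPython, where a cache tag is defined; the filename add.py is hypothetical and does not need to exist.

```python
import sys
import importlib.util

# Version of the running interpreter.
print(sys.version_info[:3])

# Tag embedded in cached bytecode filenames for this interpreter.
print(sys.implementation.cache_tag)

# Magic number written at the start of every .pyc file for this version.
magic = importlib.util.MAGIC_NUMBER
print(magic)

# Where the cached bytecode for a hypothetical add.py would be stored.
cache_path = importlib.util.cache_from_source("add.py")
print(cache_path)
```

Running this under two different interpreter versions should show a different magic number and cache path for each.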


9. Limitations of Bytecode Analysis

  • It is low-level and harder to interpret compared to source code

  • Not all performance issues are visible at the bytecode level

  • Requires understanding of stack-based execution


10. Practical Use Case

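Consider comparing two implementations of a function to see which produces fewer or cheaper bytecode instructions. Instruction count is only a rough proxy, because individual instructions differ widely in cost, so confirm any difference with a timing measurement (for example, with the timeit module) before choosing a coding pattern for performance-critical sections.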
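
One possible way to run such a comparison is sketched below. The two functions, the instruction_count helper, and the input size are all illustrative choices; the counts and timings depend on the interpreter version and machine, so no particular result is implied.

```python
import dis
import timeit

def squares_loop(n):
    result = []
    for i in range(n):
        result.append(i * i)
    return result

def squares_comprehension(n):
    return [i * i for i in range(n)]

def instruction_count(func):
    # Counts only the function's own instructions; any nested code
    # objects (such as a separately compiled comprehension) are not included.
    return sum(1 for _ in dis.get_instructions(func))

for func in (squares_loop, squares_comprehension):
    # Time 200 calls with n=1000; both numbers are arbitrary.
    elapsed = timeit.timeit(lambda: func(1000), number=200)
    print(f"{func.__name__}: {instruction_count(func)} instructions, "
          f"{elapsed:.4f}s")
```

Reporting the count and the timing side by side makes it easier to see when a shorter instruction listing does or does not translate into faster execution.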


Conclusion

Python bytecode acts as the bridge between human-readable source code and machine execution within the Python Virtual Machine. By using the dis module and understanding how bytecode works, developers can gain deeper control over performance, debugging, and internal behavior of Python programs.