MySQL - Partitioning tables in MySQL.

1. What is Table Partitioning?

Partitioning is a way to split a large table into smaller, more manageable pieces called partitions.

  • Logically, the table still looks like one table.

  • Physically, the data is divided into multiple partitions.

  • Queries can be faster because MySQL can skip scanning partitions that are not relevant.


2. Types of Partitioning in MySQL

MySQL supports several partitioning strategies:

  1. RANGE Partitioning

    • Rows are assigned to partitions based on ranges of column values.

    CREATE TABLE sales (
        id INT,
        amount DECIMAL(10,2),
        sale_date DATE
    )
    PARTITION BY RANGE (YEAR(sale_date)) (
        PARTITION p0 VALUES LESS THAN (2020),
        PARTITION p1 VALUES LESS THAN (2021),
        PARTITION p2 VALUES LESS THAN (2022),
        PARTITION p3 VALUES LESS THAN MAXVALUE
    );
    
    • Example: All sales before 2020 go to partition p0, 2020 → p1, etc.


  1. LIST Partitioning

    • Similar to RANGE, but based on specific values.

    CREATE TABLE customers (
        id INT,
        name VARCHAR(100),
        region VARCHAR(20)
    )
    PARTITION BY LIST COLUMNS (region) (
        PARTITION east VALUES IN ('East'),
        PARTITION west VALUES IN ('West'),
        PARTITION north VALUES IN ('North'),
        PARTITION south VALUES IN ('South')
    );
    

  1. HASH Partitioning

    • Rows are distributed across partitions using a hash function.

    CREATE TABLE employees (
        id INT,
        name VARCHAR(100),
        dept_id INT
    )
    PARTITION BY HASH (dept_id)
    PARTITIONS 4;
    
    • Here, data is split into 4 partitions based on hash of dept_id.


  1. KEY Partitioning

    • Similar to HASH, but MySQL chooses its own hashing function (usually used with primary/unique keys).

    CREATE TABLE orders (
        order_id INT PRIMARY KEY,
        customer_id INT,
        amount DECIMAL(10,2)
    )
    PARTITION BY KEY (order_id)
    PARTITIONS 5;
    

3. Benefits of Partitioning

  • Performance: Large queries can be faster because only relevant partitions are scanned.

  • Manageability: You can easily drop or archive old partitions (e.g., old sales data).

  • Maintenance: You can rebuild or optimize partitions individually.


4. Limitations of Partitioning

  • Partition key must be part of all primary keys and unique keys.

  • No foreign keys allowed on partitioned tables.

  • Not ideal for small tables (overhead).

  • Too many partitions can reduce performance.


Real-world example:
An e-commerce store might partition its orders table by year:

  • 2020 orders in one partition,

  • 2021 orders in another,

  • 2022 orders in another.
    This way, reporting queries (e.g., "total sales in 2022") only scan one partition.