MySQL - Query Optimization in MySQL

1. What is Query Optimization?

Query optimization is the process of rewriting SQL queries and tuning the database so they run faster and consume fewer resources (CPU, memory, disk I/O).

  • Goal: Return results quickly, especially on large datasets.

  • Tools: EXPLAIN, indexes, caching, query rewriting, and schema design.


2. Common Techniques for Query Optimization

(a) Use Indexes Properly

  • Indexes make lookups, filtering, and joins faster.

  • Example:

-- Slow (no index, full table scan)
SELECT * FROM orders WHERE customer_id = 123;

-- Add index
CREATE INDEX idx_customer_id ON orders(customer_id);
  • After indexing, MySQL can directly jump to relevant rows instead of scanning the whole table.


(b) Avoid SELECT *

  • Fetch only the columns you need.

-- Bad
SELECT * FROM employees WHERE department = 'Sales';

-- Good
SELECT id, name FROM employees WHERE department = 'Sales';
  • This reduces data transfer and memory usage.


(c) Use EXPLAIN

  • Run queries with EXPLAIN to see execution plans.

  • If you see ALL in the type column → it’s a full table scan → consider adding indexes.


(d) Optimize Joins

  • Always join on indexed columns.

  • Put the smaller result set first in multi-joins if possible.

SELECT o.id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE c.region = 'East';

→ Index on customers.region and orders.customer_id makes this much faster.


(e) Use LIMIT for Large Results

-- Bad: returns millions of rows
SELECT * FROM transactions;

-- Good: paginate
SELECT * FROM transactions ORDER BY id LIMIT 50 OFFSET 0;

(f) Avoid Functions on Indexed Columns

-- Bad: prevents index usage
SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- Good: use range
SELECT * FROM users 
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

(g) Partition Large Tables

Split huge tables (e.g., orders by year). Queries only scan relevant partitions.


(h) Use Query Caching / Results Caching

  • MySQL query cache is deprecated, but you can use application-level caching (Redis, Memcached) for repeated queries.


(i) Avoid N+1 Queries (Application Side)

  • Instead of running 100 queries in a loop, use a JOIN or IN clause.


3. Real-World Example (E-commerce Orders)

Problem: Slow query

SELECT * FROM orders WHERE YEAR(order_date) = 2024 AND customer_id = 123;
  • EXPLAIN shows full table scan (ALL).

Optimization:

  1. Add an index:

CREATE INDEX idx_order_date_customer ON orders(order_date, customer_id);
  1. Rewrite query:

SELECT * FROM orders 
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31'
AND customer_id = 123;
  • Now MySQL can use the index → query runs much faster.

Summary:

  • Use indexes smartly.

  • Check queries with EXPLAIN.

  • Rewrite queries to avoid expensive operations.

  • Limit results, avoid SELECT *, and design schema for scalability.