MongoDB - Storage Engines in MongoDB (WiredTiger)

A storage engine is the internal component of a database that manages how data is stored on disk and how it is retrieved. In simple terms, it decides:

  • How data is written to disk

  • How memory (RAM) is used

  • How concurrency (multiple users accessing data at the same time) is handled

  • How crash recovery works

WiredTiger has been the default storage engine in MongoDB since version 3.2, and it is the engine almost every deployment runs on today.


Why Storage Engines Matter

When you build real-world applications (e-commerce, banking apps, student portals), the database must:

  • Handle many users simultaneously

  • Avoid data corruption during crashes

  • Maintain high performance

  • Use disk space efficiently

The storage engine directly affects all of these.


WiredTiger – Detailed Explanation

1. Document-Level Concurrency

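The older MMAPv1 engine locked at the database level and, from MongoDB 3.0, at the collection level. That means if one operation was writing data, other writes to the same database or collection had to wait.

WiredTiger uses document-level concurrency control for writes. It relies on optimistic concurrency control: if two operations try to modify the same document at the same moment, one of them hits a write conflict and MongoDB transparently retries it.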

This means:

  • Multiple users can modify different documents in the same collection at the same time.

  • Performance improves significantly under heavy load.

Example:
If 100 users update different profiles simultaneously, they won’t block each other.
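
The idea can be sketched as a toy model in Python. This is not WiredTiger's implementation: ToyCollection, WriteConflict and update_with_retry are hypothetical names invented for this illustration, and the sketch assumes a simple per-document version number to detect conflicting writes. It shows that writers touching different documents never conflict, while two writers touching the same document cause one commit to fail and be retried.

```python
class WriteConflict(Exception):
    """Raised when a document changed between a writer's read and its commit."""


class ToyCollection:
    """Hypothetical in-memory collection with a version number per document."""

    def __init__(self):
        self._docs = {}  # _id -> (version, document)

    def insert(self, _id, doc):
        self._docs[_id] = (0, dict(doc))

    def read(self, _id):
        version, doc = self._docs[_id]
        return version, dict(doc)  # return a private copy to the writer

    def commit(self, _id, expected_version, new_doc):
        current_version, _ = self._docs[_id]
        if current_version != expected_version:
            raise WriteConflict(_id)  # someone else committed first
        self._docs[_id] = (current_version + 1, dict(new_doc))


def update_with_retry(coll, _id, change, max_retries=5):
    """Apply `change` to a document, retrying on conflict. Returns the retry count."""
    for attempt in range(max_retries):
        version, doc = coll.read(_id)
        change(doc)
        try:
            coll.commit(_id, version, doc)
            return attempt
        except WriteConflict:
            continue
    raise WriteConflict(_id)


profiles = ToyCollection()
profiles.insert("alice", {"city": "Pune"})
profiles.insert("bob", {"city": "Delhi"})

# Two writers work on DIFFERENT documents: both commits succeed.
v_a, doc_a = profiles.read("alice")
v_b, doc_b = profiles.read("bob")
doc_a["city"] = "Mumbai"
doc_b["city"] = "Agra"
profiles.commit("alice", v_a, doc_a)
profiles.commit("bob", v_b, doc_b)

# Two writers work on the SAME document: the second commit conflicts.
v1, d1 = profiles.read("alice")
v2, d2 = profiles.read("alice")
d1["age"] = 30
profiles.commit("alice", v1, d1)
d2["age"] = 31
try:
    profiles.commit("alice", v2, d2)
    conflicted = False
except WriteConflict:
    conflicted = True

# A retry re-reads the latest version and then succeeds.
retries = update_with_retry(profiles, "alice", lambda d: d.update(age=31))
```

In the sketch the conflicting writer simply re-reads and tries again, which is why conflicts on the same document cost a little extra work but never block writers on other documents.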


2. Compression

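WiredTiger compresses data by default.

It compresses:

  • Data files (collections), using Snappy block compression by default; zlib and zstd can be selected instead

  • Indexes, using prefix compression

  • The journal, using Snappy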

Benefits:

  • Reduces disk usage

  • Reduces I/O (Input/Output) operations

  • Improves performance on disk-heavy systems

This matters in production environments where storage cost is significant. The trade-off is some extra CPU time spent compressing and decompressing data.
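
To get a feel for why documents with repeated field names and values compress well, here is a small Python sketch. It is only an approximation under stated assumptions: it uses zlib from the Python standard library (zlib is one of the compressors WiredTiger can be configured to use, but the default is Snappy), and it uses JSON text as a stand-in for MongoDB's BSON format. The sample documents are made up.

```python
import json
import zlib

# Made-up documents with the repetitive structure typical of a collection.
documents = [
    {"_id": i, "status": "active", "country": "IN", "plan": "free"}
    for i in range(1000)
]

# JSON stands in for BSON here; real on-disk sizes will differ.
raw = "\n".join(json.dumps(d) for d in documents).encode("utf-8")
compressed = zlib.compress(raw)

ratio = len(compressed) / len(raw)
print(f"raw={len(raw)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.2f}")

# Compression is lossless: decompressing gives back the exact original bytes.
restored = zlib.decompress(compressed)
```

The actual ratio depends heavily on the data, so treat the printed number as an illustration rather than a prediction for a real deployment.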


3. Journaling (Crash Recovery)

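Journaling protects recent writes if the system crashes between checkpoints (see section 4).

When a write operation happens:

  1. It is first recorded in the journal, a write-ahead log on disk.

  2. The change reaches the main data files later, at the next checkpoint.

If the server crashes:

  • MongoDB starts from the last checkpoint in the data files

  • Replays the journal entries written after that checkpoint

  • Restores data to a consistent state

This provides durability, which is part of ACID properties. By default the journal is flushed to disk every 100 milliseconds, so a write is only guaranteed to survive a crash once it has been journaled. Applications that need this guarantee before continuing can request it with the write concern option j: true.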
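
The write-ahead idea can be sketched in a few lines of Python. This is a toy model, not MongoDB's journal format: ToyJournaledStore, its methods and the file name journal.log are hypothetical, and the sketch assumes one JSON record per line. Each change is appended to the journal and forced to disk before it is applied, so a restarted process can rebuild its state by replaying the file.

```python
import json
import os
import tempfile


class ToyJournaledStore:
    """Hypothetical key-value store that journals every write before applying it."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.data = {}  # in-memory state; lost when the process "crashes"

    def put(self, key, value):
        # 1. Append the change to the journal and force it to disk.
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps({"key": key, "value": value}) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        # 2. Only then apply it to the working state.
        self.data[key] = value

    def recover(self):
        # Replay every journal record in order to rebuild the state.
        self.data = {}
        if os.path.exists(self.journal_path):
            with open(self.journal_path, encoding="utf-8") as journal:
                for line in journal:
                    record = json.loads(line)
                    self.data[record["key"]] = record["value"]
        return self.data


journal_file = os.path.join(tempfile.mkdtemp(), "journal.log")

store = ToyJournaledStore(journal_file)
store.put("order:1", "paid")
store.put("order:2", "shipped")

# Simulate a crash: a fresh object starts with empty memory and replays the journal.
restarted = ToyJournaledStore(journal_file)
recovered = restarted.recover()
```

Calling fsync on every write is the simplest way to make the toy durable; a real engine batches these flushes, which is where a periodic flush interval comes from.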


4. Checkpointing

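WiredTiger periodically creates checkpoints, by default every 60 seconds.

A checkpoint:

  • Writes a consistent snapshot of the data to the data files on disk.

  • Acts as the starting point for recovery, which reduces recovery time after crashes.

This means the system does not need to replay the entire history during recovery. Only the journal entries written since the last checkpoint are replayed.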
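
A rough Python sketch of why a checkpoint shortens recovery: the recover function and the dictionary layout below are hypothetical, and the journal is modelled as a list of (sequence number, key, value) tuples. Recovery starts from the checkpointed data and replays only the records with a higher sequence number than the checkpoint.

```python
def recover(checkpoint, journal):
    """Rebuild state from a checkpoint plus the journal records written after it."""
    state = dict(checkpoint["data"])
    replayed = 0
    for seq, key, value in journal:
        if seq > checkpoint["last_seq"]:  # skip records the checkpoint already covers
            state[key] = value
            replayed += 1
    return state, replayed


# 1000 made-up writes, numbered 1..1000.
journal = [(seq, f"k{seq}", seq * 10) for seq in range(1, 1001)]

# Without any checkpoint, every record has to be replayed.
no_checkpoint = {"last_seq": 0, "data": {}}
full_state, full_replay = recover(no_checkpoint, journal)

# With a checkpoint taken after write 990, only the last 10 records are replayed.
checkpoint = {"last_seq": 990, "data": {f"k{s}": s * 10 for s in range(1, 991)}}
fast_state, fast_replay = recover(checkpoint, journal)
```

Both recoveries end in the same state; the difference is only how much of the journal has to be read to get there.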


5. Memory Usage (Cache System)

WiredTiger uses an internal cache system.

  • Frequently accessed data is kept in RAM.

  • Rarely accessed data stays on disk.

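By default, the cache size is the larger of 50% of (RAM minus 1 GB) or 256 MB. For example, a server with 4 GB of RAM gets a 1.5 GB cache. The size can be changed with the storage.wiredTiger.engineConfig.cacheSizeGB setting.

Reads served from the cache avoid disk access, which improves read performance significantly.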
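
The documented default for the cache size is the larger of 50% of (RAM minus 1 GB) or 256 MB. As a small worked example, the rule can be written as a Python function. The function name default_cache_bytes is hypothetical, and the sketch assumes binary units (1 GB = 1024³ bytes) and that the whole machine's RAM is visible to the process.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_cache_bytes(total_ram_bytes):
    """Hypothetical helper: default WiredTiger cache size for a given amount of RAM."""
    half_of_ram_minus_1gib = (total_ram_bytes - GIB) // 2
    # The cache never defaults to less than 256 MiB.
    return max(half_of_ram_minus_1gib, 256 * MIB)


for ram_gib in (1, 4, 16):
    cache = default_cache_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {cache / GIB:.2f} GiB cache")
```

For a 4 GiB machine this gives 1.5 GiB, and for very small machines the 256 MiB floor applies. The remaining RAM is left for the operating system, its filesystem cache, and other processes.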


Other Storage Engines (Brief Mention)

Earlier versions of MongoDB used:

  • MMAPv1 (deprecated in version 4.0 and removed in version 4.2)

MMAPv1 had:

  • Collection-level locking (database-level before version 3.0)

  • Less efficient concurrency

  • No compression

That’s why WiredTiger replaced it.