Linux - RAID Concepts (Redundant Array of Independent Disks)
RAID is a technology that uses multiple physical disks to improve:
- Performance
- Reliability
- Fault tolerance
- Storage capacity (in some RAID levels)
RAID can be implemented in two ways:
- Software RAID → via Linux mdadm
- Hardware RAID → via RAID controller cards
Why RAID?
RAID helps solve key storage problems:
✔ Disk failure protection
✔ Better read/write performance
✔ Combining multiple disks for larger storage
✔ Continuous data availability
Types of RAID Levels
Below are the commonly used RAID levels and what they mean.
RAID 0 – Striping
Focus: High performance
Fault Tolerance: ❌ None
How it works
Data is split evenly into blocks across multiple disks.
Disk1: A1 A3 A5
Disk2: A2 A4 A6
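The round-robin placement shown above can be sketched in a few lines of Python. This only illustrates the idea and is not how the kernel's md driver is written; the function name `stripe_blocks` and the block labels are invented for the example.

```python
def stripe_blocks(blocks, num_disks):
    """Distribute logical blocks across disks in round-robin order (RAID 0 style)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least 2 disks")
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        # Block i lands on disk (i mod N).
        disks[index % num_disks].append(block)
    return disks


layout = stripe_blocks(["A1", "A2", "A3", "A4", "A5", "A6"], num_disks=2)
for number, contents in enumerate(layout, start=1):
    print(f"Disk{number}: {' '.join(contents)}")
```

Because consecutive blocks sit on different disks, a large sequential read or write can keep all disks busy at once, which is where the speed-up comes from.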
Benefits
- Fast read and write speeds
- Utilizes full capacity: Total size = Disk1 + Disk2 + Disk3 …
Drawbacks
- If one disk fails → everything is lost
- No redundancy
Best For
- Gaming
- High-speed workloads
- Cache disks
RAID 1 – Mirroring
Focus: Redundancy (high safety)
Fault Tolerance: ✔ Yes (can survive 1 disk failure)
How it works
Data is duplicated identically on two disks.
Disk1: A1 A2 A3
Disk2: A1 A2 A3
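As a rough illustration of mirroring, the toy class below sends every write to all healthy disks and serves reads from the first healthy one. The class name `Mirror` and its methods are hypothetical and exist only for this sketch; a real RAID 1 implementation also has to resynchronize a replaced disk.

```python
class Mirror:
    """Toy mirror: writes go to every healthy disk, reads use any healthy disk."""

    def __init__(self, num_disks=2):
        self.disks = [dict() for _ in range(num_disks)]
        self.failed = set()

    def write(self, block_id, data):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block_id] = data

    def fail(self, index):
        """Simulate a disk failure: mark it failed and drop its contents."""
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block_id):
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block_id]
        raise OSError("all mirrors have failed")


mirror = Mirror()
mirror.write("A1", b"hello")
mirror.fail(0)                 # lose the first disk
print(mirror.read("A1"))       # the copy on the second disk is still readable
```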
Benefits
- High reliability
- Fast reads (data can come from any disk)
- Simple setup
Drawbacks
- Storage is cut in half: Total size = size of one disk
- Write speed may be slightly slower
Best For
- Critical data
- Small servers
- Databases
RAID 5 – Striping + Parity (Most common)
Focus: Balance of performance and redundancy
Fault Tolerance: ✔ Can survive 1 disk failure
How it works
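Data and parity blocks are distributed across all disks, with the parity block rotating from stripe to stripe.
Disk1: A1 A3 P3
Disk2: A2 P2 A5
Disk3: P1 A4 A6
Each parity block (P1, P2, P3) covers the data blocks in its own stripe, so the contents of any one failed disk can be rebuilt from the remaining disks.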
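RAID 5 parity is based on XOR: the parity block is the XOR of the data blocks in a stripe, so XOR-ing the surviving blocks yields the missing one. The sketch below shows the principle on two tiny byte strings; the helper name `xor_blocks` and the sample data are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data_1 = b"RAID"
data_2 = b"demo"
parity = xor_blocks([data_1, data_2])

# Pretend the disk holding data_2 failed: XOR of the survivors restores it.
rebuilt = xor_blocks([data_1, parity])
print(rebuilt)
```

The same call works for any number of data blocks, which is why a single parity block per stripe is enough to recover from exactly one missing disk.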
Benefits
- Good performance
- Fault-tolerant
- More efficient than RAID 1
- Total size = (N−1) × size of one disk
Drawbacks
- Slow write performance due to parity
- If 2 disks fail → data lost
- Rebuild times can be long
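The write penalty listed above comes from having to keep parity in sync. For a small write, a typical approach is read-modify-write: read the old data and old parity, compute the new parity, then write both back. The sketch below shows only the arithmetic; `xor_bytes` and `update_parity` are illustrative names, not part of any real RAID tool.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_data, new_data, old_parity):
    """Return the parity after one data block in a stripe changes.

    new_parity = old_parity XOR old_data XOR new_data
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


block_a, block_b = b"AAAA", b"BBBB"
parity = xor_bytes(block_a, block_b)

new_a = b"ZZZZ"
parity = update_parity(block_a, new_a, parity)

# The shortcut gives the same parity as recomputing from the full stripe.
print(parity == xor_bytes(new_a, block_b))
```

One logical write therefore turns into two reads and two writes on disk, which is why small random writes are comparatively slow on parity RAID.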
Best For
- File servers
- Backup servers
- Read-heavy workloads
RAID 6 – Double Parity
Focus: High redundancy
Fault Tolerance: ✔ Can survive 2 disk failures
How it works
Like RAID 5, but each stripe carries two independent parity blocks, so any two disks can fail without data loss.
Benefits
- Very high safety
- Good for large disks (where rebuild takes long)
Drawbacks
- Slower writes than RAID 5
- More parity overhead
- Total size = (N−2) × size of one disk
Best For
- Large enterprise storage
- Archival systems
- High-availability servers
RAID 10 (RAID 1 + RAID 0) – Best Combined Level
Focus: Performance + Redundancy
Fault Tolerance: ✔ Can survive multiple disk failures, as long as both disks of the same mirror pair do not fail
How it works
First mirror, then stripe:
Disk1 ⇄ Disk2 → mirror
Disk3 ⇄ Disk4 → mirror
Then stripe across the two mirrors
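Whether a RAID 10 array survives depends on which disks fail, not only on how many. A minimal check, assuming two-way mirror pairs as in the layout above (the function name `raid10_survives` is invented for this sketch):

```python
def raid10_survives(mirror_pairs, failed_disks):
    """Return True if every mirror pair still has at least one working disk."""
    failed = set(failed_disks)
    return all(not set(pair) <= failed for pair in mirror_pairs)


pairs = [("Disk1", "Disk2"), ("Disk3", "Disk4")]
print(raid10_survives(pairs, ["Disk1", "Disk3"]))  # one disk lost in each pair
print(raid10_survives(pairs, ["Disk1", "Disk2"]))  # a whole pair lost
```

The first call reports survival and the second reports data loss, even though both scenarios involve two failed disks.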
Benefits
- Very fast
- High reliability
- Great for databases
Drawbacks
- Needs minimum 4 disks
- Storage = 50% of total
Best For
- Databases
- Heavy I/O workloads
- Virtual machines
RAID Comparison Table
| RAID Level | Min Disks | Fault Tolerance | Speed | Storage Efficiency |
|---|---|---|---|---|
| RAID 0 | 2 | ❌ None | ⭐⭐⭐⭐ | 100% |
| RAID 1 | 2 | ✔ One | ⭐⭐ (writes) ⭐⭐⭐ (reads) | 50% |
| RAID 5 | 3 | ✔ One | ⭐⭐⭐ | (N−1)/N |
| RAID 6 | 4 | ✔ Two | ⭐⭐ | (N−2)/N |
| RAID 10 | 4 | ✔ One per mirror pair | ⭐⭐⭐⭐ | 50% |
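The storage-efficiency column can be turned into a small calculator. This is a sketch that assumes all disks are the same size and that RAID 10 uses two-way mirrors; the function name `usable_capacity` is hypothetical.

```python
def usable_capacity(level, num_disks, disk_size):
    """Usable capacity for an array of equal-sized disks (same unit as disk_size)."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == 0:
        return num_disks * disk_size          # no redundancy
    if level == 1:
        return disk_size                      # every disk holds the same data
    if level == 5:
        return (num_disks - 1) * disk_size    # one disk's worth of parity
    if level == 6:
        return (num_disks - 2) * disk_size    # two disks' worth of parity
    # RAID 10 with two-way mirrors: half of the raw capacity.
    if num_disks % 2:
        raise ValueError("this sketch assumes an even number of disks for RAID 10")
    return num_disks * disk_size // 2


# Four 4 TB disks at each level that accepts four disks.
for level in (0, 1, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity(level, 4, 4)} TB")
```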
Software RAID (mdadm)
Linux commonly uses mdadm to create RAID arrays.
Example: Create RAID 5
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
Check RAID status:
cat /proc/mdstat
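For monitoring, the text of /proc/mdstat can be checked from a script. The sketch below parses an illustrative sample rather than a live system; the exact layout varies between kernels and array types, so treat the sample text, the regular expressions, and the name `parse_mdstat` as assumptions to adapt.

```python
import re

# Illustrative sample only; real output depends on the kernel and the array.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd[3] sdc[1] sdb[0]
      2095104 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array_name: {"state": ..., "degraded": ...}} from mdstat-style text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\S*) : (\S+)", line)
        if header:
            current = header.group(1)
            arrays[current] = {"state": header.group(2), "degraded": None}
            continue
        # The member map looks like [UUU]; "_" marks a missing or failed member.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            arrays[current]["degraded"] = "_" in status.group(1)
    return arrays


print(parse_mdstat(SAMPLE_MDSTAT))
```

On a Linux host with md arrays, the same function could be fed the contents of /proc/mdstat read with `open("/proc/mdstat").read()`.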
Key RAID Terms
Striping
Splitting data into chunks and writing across disks
→ High performance
Mirroring
Making copies of data
→ High redundancy
Parity
Extra information that allows reconstructing lost data
→ Used in RAID 5/6
Hot Spare
A standby disk that automatically replaces a failed disk
When NOT to Use RAID
RAID is not a backup.
It protects against disk failure but not against:
- accidental deletion
- ransomware
- corruption
- fire, flood, or theft
Always keep separate backups.