Software Testing - Stress Testing

1. What is Stress Testing?

Stress Testing is a type of performance testing used to determine how a system behaves under extreme load conditions, beyond its normal capacity.

It checks:

  • Breaking point (when the system fails)

  • How gracefully it fails

  • How quickly it recovers

In simple terms:

Stress Testing pushes the system harder than normal to see when it crashes and how well it recovers.


2. Goals of Stress Testing

✔ Identify the system’s breaking point

At what user/load level does the system stop responding?

✔ Understand system behavior under extreme conditions

Does it slow down, hang, crash, or throw errors?

✔ Evaluate recovery capability

How fast does it return to normal after overload?

✔ Identify bottlenecks

Find the weakest areas, such as:

  • Database

  • APIs

  • Server CPU

  • Memory (including leaks)

  • Network limits

  • Network limits

✔ Verify stability and error handling

The system should fail gracefully, not crash abruptly.
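
One common way to fail gracefully is to reject excess requests quickly instead of letting them pile up until the process crashes. The sketch below is only an illustration of that idea: the `LoadShedder` class, its `capacity` parameter, and the 503-style return values are assumptions made for this example, not part of any particular framework.

```python
import threading


class LoadShedder:
    """Admit at most `capacity` requests at once; reject the rest immediately."""

    def __init__(self, capacity):
        self._slots = threading.BoundedSemaphore(capacity)

    def handle(self, work):
        # Non-blocking acquire: when no slot is free, shed the request
        # instead of queueing it behind the others.
        if not self._slots.acquire(blocking=False):
            return 503, "overloaded, retry later"
        try:
            return 200, work()
        finally:
            self._slots.release()


shedder = LoadShedder(capacity=2)
print(shedder.handle(lambda: "order placed"))  # prints (200, 'order placed')
```

During a stress test, a system built this way should show a rising share of fast rejections rather than timeouts or a crash.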


3. When to Perform Stress Testing

  • Before launching an application

  • Before festival sales or marketing events

  • When expecting sudden traffic peaks

  • After major infrastructure upgrades

  • When performance issues are suspected

  • When adding new hardware or load balancers


4. Types of Stress Testing

1) Application Stress Testing

Find defects in application logic under extreme load
(e.g., API timeouts, crashes).

2) System Stress Testing

Test different components together under stress
(e.g., DB + server + network under heavy load).

3) Spike Testing

Sudden load increase within seconds
(e.g., 100 users → 10,000 users almost instantly).

4) Distributed Stress Testing

Multiple remote machines generate load together.

5) Exploratory Stress Testing

Random extreme loads with no predefined targets, applied to surface unexpected behavior.
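
As a rough illustration of the spike shape described under Spike Testing, the sketch below builds a per-tick schedule of target user counts. The function name `spike_profile` and its parameters are assumptions made for this example; real tools express load shapes in their own configuration formats.

```python
def spike_profile(baseline_users, spike_users, ticks_before, ticks_during, ticks_after):
    """Return per-tick user counts: baseline, an instant jump to spike_users, then baseline again."""
    return (
        [baseline_users] * ticks_before
        + [spike_users] * ticks_during
        + [baseline_users] * ticks_after
    )


# 100 users for 2 ticks, 10,000 users for 3 ticks, then back to 100 for 2 ticks.
print(spike_profile(100, 10_000, 2, 3, 2))
# prints [100, 100, 10000, 10000, 10000, 100, 100]
```

Keeping the baseline ticks after the spike matters: they are where recovery behavior becomes visible.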


Comparison with Load Testing

Feature    | Load Testing                  | Stress Testing
Purpose    | Test under expected load      | Test beyond normal load
Load level | Normal / planned              | Extreme / overload
Outcome    | Validate performance          | Find breaking point & recovery
Goal       | Stability during peak traffic | Behavior during and after failure

5. Stress Testing Metrics

Important metrics include:

Performance

  • Peak load supported

  • Response time under stress

  • Throughput

  • Latency

Reliability

  • Error rate

  • Timeout rate

  • Crash point

Resource Usage

  • Peak CPU usage

  • Memory consumption

  • Disk I/O

  • DB query spikes

Recovery

  • Time to recover after overload

  • Auto-scaling behavior (in cloud setups)
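
Several of the metrics above can be derived from raw per-request results. The following is a minimal sketch under stated assumptions: the `Sample` record, the `summarize` function, and the choice of a nearest-rank 95th percentile are illustrative, not a standard reporting format.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    latency_s: float          # time taken by one request, in seconds
    ok: bool                  # True if the request succeeded
    timed_out: bool = False   # True if the request hit the client timeout


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples, duration_s):
    """Compute throughput, error rate, timeout rate, and p95 latency for one test stage."""
    if not samples:
        raise ValueError("no samples")
    total = len(samples)
    errors = sum(1 for s in samples if not s.ok)
    timeouts = sum(1 for s in samples if s.timed_out)
    return {
        "throughput_rps": total / duration_s,
        "error_rate": errors / total,
        "timeout_rate": timeouts / total,
        "p95_latency_s": percentile([s.latency_s for s in samples], 95),
    }
```

A percentile is used here instead of the average because a few very slow requests under stress can hide behind a healthy-looking mean.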


6. Stress Testing Process (Step-by-Step)

Step 1 — Define breaking-point goals

  • Example: “Find out at which user count the API fails.”

Step 2 — Identify critical scenarios

Examples:

  • Login

  • Search

  • Add to cart

  • Checkout

  • File upload

Step 3 — Create extreme load profiles

Example:

  • Expected users: 2,000

  • Stress test: push up to 10,000
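
Using the figures above (2,000 expected users, 10,000 as the stress target), a stepped profile could be generated as in the sketch below. The function name `stress_levels` and the choice of five evenly spaced steps are assumptions for illustration.

```python
def stress_levels(expected_users, target_users, steps):
    """Return evenly spaced load levels from the expected load up to the stress target."""
    if steps < 2 or target_users <= expected_users:
        raise ValueError("need at least 2 steps and a target above the expected load")
    increment = (target_users - expected_users) / (steps - 1)
    return [round(expected_users + i * increment) for i in range(steps)]


levels = stress_levels(expected_users=2_000, target_users=10_000, steps=5)
for users in levels:
    # Express each level as a multiple of the expected load.
    print(f"{users:>6} users  ({users / 2_000:.1f}x expected)")
```

Starting at the expected load gives a baseline measurement to compare the overloaded stages against.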

Step 4 — Prepare test environment

The environment must be as close to production as possible.

Step 5 — Execute test

Increase load gradually or apply sudden spikes.
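
To make the idea of a gradual ramp concrete, here is a minimal sketch of a load driver built only on the Python standard library. It is not a substitute for a real tool: `fake_request` is a hypothetical stand-in for a call to the system under test, and `run_stage` and `run_ramp` are names invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Stand-in for one call to the system under test (hypothetical; swap in a real client call)."""
    time.sleep(0.01)
    return True


def timed_call(request_fn):
    """Run one request and return (latency_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        ok = bool(request_fn())
    except Exception:
        ok = False  # count any exception as a failed request
    return time.perf_counter() - start, ok


def run_stage(request_fn, concurrent_users, requests_per_user):
    """Issue requests from `concurrent_users` worker threads and collect the results."""
    total_requests = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(timed_call, request_fn) for _ in range(total_requests)]
        return [f.result() for f in futures]


def run_ramp(request_fn, user_levels, requests_per_user):
    """Run one stage per load level, in the order given, and return results keyed by level."""
    return {users: run_stage(request_fn, users, requests_per_user) for users in user_levels}


if __name__ == "__main__":
    for users, results in run_ramp(fake_request, [5, 10, 20], requests_per_user=3).items():
        failures = sum(1 for _, ok in results if not ok)
        print(f"{users} users: {len(results)} requests, {failures} failed")
```

A single machine running threads like this will saturate long before a production system does, which is one reason real stress tests usually rely on dedicated tools or distributed load generators.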

Step 6 — Monitor system behavior

Track:

  • Response time

  • CPU/memory

  • DB queries

  • Error logs

Step 7 — Identify breaking point

Find the level at which:

  • System slows down drastically

  • Errors increase

  • Server crashes
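
One way to turn these signals into a single number is to scan the stage results for the lowest load level that breaches an agreed threshold. The sketch below assumes per-stage summaries are already available; the function name, the tuple layout, the threshold values, and the sample data are all hypothetical.

```python
def find_breaking_point(stages, max_p95_s=2.0, max_error_rate=0.05):
    """Return the lowest load level that breaches a threshold, or None if none does.

    `stages` is a list of (users, p95_latency_s, error_rate) tuples.
    The thresholds are illustrative defaults, not recommendations.
    """
    for users, p95_s, error_rate in sorted(stages):
        if p95_s > max_p95_s or error_rate > max_error_rate:
            return users
    return None


# Hypothetical stage results: (users, p95 latency in seconds, error rate)
stages = [
    (2_000, 0.4, 0.00),
    (4_000, 0.9, 0.01),
    (6_000, 1.8, 0.03),
    (8_000, 4.5, 0.12),
    (10_000, 9.0, 0.40),
]
print(find_breaking_point(stages))  # prints 8000
```

The true breaking point lies somewhere between the last passing level and the first failing one, so a follow-up run with smaller steps in that range narrows it down.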

Step 8 — Analyze results

Which component failed first?

  • DB?

  • Server CPU?

  • Network?

  • API latency?
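
If monitoring readings are collected with timestamps, the question "which component failed first?" can be answered by finding the earliest reading that crosses its limit. This is a sketch under assumptions: the component names, limits, and readings below are invented for illustration.

```python
def first_to_saturate(samples, limits):
    """Return (component, elapsed_s) for the earliest reading at or over its limit, or None.

    `samples` is a list of (elapsed_s, component, value) monitoring readings.
    `limits` maps each component name to the value treated as saturated.
    """
    for elapsed_s, component, value in sorted(samples):
        if component in limits and value >= limits[component]:
            return component, elapsed_s
    return None


# Hypothetical readings taken 60, 120, and 180 seconds into the test.
samples = [
    (60, "db_cpu_pct", 55), (60, "app_cpu_pct", 40), (60, "api_p95_s", 0.6),
    (120, "db_cpu_pct", 97), (120, "app_cpu_pct", 62), (120, "api_p95_s", 1.4),
    (180, "db_cpu_pct", 100), (180, "app_cpu_pct", 91), (180, "api_p95_s", 6.2),
]
limits = {"db_cpu_pct": 95, "app_cpu_pct": 90, "api_p95_s": 2.0}
print(first_to_saturate(samples, limits))  # prints ('db_cpu_pct', 120)
```

In this made-up data the database saturates first, and the API latency and application CPU problems that appear later are likely downstream effects rather than independent faults.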

Step 9 — Fix bottlenecks + Retest


7. Stress Testing Example

Scenario: E-commerce site during Big Sale

Normal expected traffic:

3,000 users

Stress test

Push step-by-step:

  • 3,000 → 5,000 → 8,000 → 10,000 → 12,000 → 15,000 users

What happens:

  • At 8,000 users: response time reaches 7 seconds

  • At 10,000 users: checkout API begins returning errors

  • At 12,000 users: database CPU spikes to 100%

  • At about 13,500 users, during the ramp from 12,000 to 15,000: system crashes

Recovery after the crash:

  • Server takes 3 minutes to recover

  • Auto-scaling adds 2 servers → system stable again

Outcome:
The major bottleneck is the checkout API, and the database needs better indexing.
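
A recovery figure such as the 3 minutes in this example can be measured from periodic health checks taken after the load is removed. The sketch below is one possible definition, assuming recovery means a run of consecutive healthy checks; the function name, the streak length, and the check data are hypothetical.

```python
def recovery_time_s(health_checks, load_removed_at_s, healthy_streak=3):
    """Seconds from load removal until the first of `healthy_streak` consecutive healthy checks.

    `health_checks` is a list of (elapsed_s, healthy) tuples in time order.
    Returns None if the system never stays healthy for the required streak.
    """
    streak_start = None
    streak = 0
    for elapsed_s, healthy in health_checks:
        if elapsed_s < load_removed_at_s:
            continue  # ignore checks taken while the load was still applied
        if healthy:
            if streak == 0:
                streak_start = elapsed_s
            streak += 1
            if streak >= healthy_streak:
                return streak_start - load_removed_at_s
        else:
            streak = 0  # a failed check resets the streak
    return None


# Hypothetical checks every 30 seconds; load is removed 600 seconds into the test.
checks = [
    (600, False), (630, False), (660, True), (690, False), (720, False),
    (750, False), (780, True), (810, True), (840, True),
]
print(recovery_time_s(checks, load_removed_at_s=600))  # prints 180
```

Requiring a streak avoids counting a single lucky response (the check at 660 seconds here) as a recovery.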


8. Popular Stress Testing Tools

  • JMeter

  • Gatling

  • LoadRunner

  • Locust

  • k6

  • BlazeMeter

  • NeoLoad

Cloud platforms:

  • Distributed Load Testing on AWS

  • Azure Load Testing


9. Common Mistakes in Stress Testing

  • Testing in a weak test environment

  • No monitoring setup

  • Not testing recovery mechanisms

  • Using unrealistic user data

  • No clear breaking-point goals

  • Checking only the application, not the infrastructure (DB, server, cache)

10. Best Practices

  • Use realistic traffic models

  • Focus stress tests on core user flows

  • Monitor server + database + app logs together

  • Combine stress test with spike tests

  • Always test system recovery and failover

  • Document exact failure point

  • Involve developers + DevOps for root-cause analysis