MongoDb - MongoDB Aggregation
1. What is MongoDB Aggregation?
MongoDB aggregation is a powerful framework used to process data and return computed results. It works like SQL’s GROUP BY
, JOIN
, and aggregate functions combined, but more flexible.
You use the aggregate()
method with stages that process documents step by step.
2. Aggregation Pipeline Structure
The aggregation pipeline is an array of stages. Each stage transforms the documents and passes them to the next stage.
db.collection.aggregate([
{ stage1 },
{ stage2 },
...
])
Some commonly used stages:
-
$match → Filters documents (like
WHERE
in SQL) -
$group → Groups documents and calculates aggregates
-
$project → Includes, excludes, or reshapes fields
-
$sort → Sorts documents
-
$limit / $skip → Pagination
-
$lookup → Joins with another collection
-
$unwind → Breaks arrays into separate documents
3. Basic Example
Problem: Find the total number of orders per customer.
db.orders.aggregate([
{
$group: {
_id: "$customerId", // Group by customerId
totalOrders: { $sum: 1 } // Count orders
}
}
])
Output Example:
[
{ "_id": 101, "totalOrders": 5 },
{ "_id": 102, "totalOrders": 3 }
]
4. Using $match
+ $group
Problem: Find the total sales per customer for orders above $100.
db.orders.aggregate([
{
$match: { amount: { $gt: 100 } } // Filter orders > $100
},
{
$group: {
_id: "$customerId",
totalSales: { $sum: "$amount" }
}
}
])
5. $project
Stage
Problem: Show only name
and total
fields, excluding _id
.
db.orders.aggregate([
{
$project: {
_id: 0,
name: 1,
total: 1
}
}
])
6. $sort
, $skip
, $limit
Problem: Get top 5 customers by sales.
db.orders.aggregate([
{
$group: {
_id: "$customerId",
totalSales: { $sum: "$amount" }
}
},
{ $sort: { totalSales: -1 } }, // Descending order
{ $limit: 5 } // Top 5
])
7. $lookup
(Joins in MongoDB)
Problem: Join orders
with customers
to get customer details.
db.orders.aggregate([
{
$lookup: {
from: "customers", // Other collection
localField: "customerId",
foreignField: "customerId",
as: "customerDetails"
}
}
])
Output Example:
[
{
"_id": 1,
"customerId": 101,
"amount": 250,
"customerDetails": [
{ "customerId": 101, "name": "John Doe", "email": "[email protected]" }
]
}
]
8. $unwind
for Arrays
If customerDetails
is an array, we can flatten it:
db.orders.aggregate([
{ $unwind: "$customerDetails" }
])
9. Real-Life Complex Example
Problem: Get the top 3 customers by total order amount, including their names from the customers
collection.
db.orders.aggregate([
{
$group: {
_id: "$customerId",
totalSpent: { $sum: "$amount" }
}
},
{ $sort: { totalSpent: -1 } },
{ $limit: 3 },
{
$lookup: {
from: "customers",
localField: "_id",
foreignField: "customerId",
as: "customerInfo"
}
},
{ $unwind: "$customerInfo" },
{
$project: {
_id: 0,
customerId: "$_id",
name: "$customerInfo.name",
email: "$customerInfo.email",
totalSpent: 1
}
}
])
10. Key Tips
-
Always start with
$match
if possible → improves performance. -
Use
$project
at the end to return only necessary fields. -
Indexes are used in
$match
,$sort
, and$lookup
if fields are indexed.