MongoDb - MongoDB Aggregation
1. What is MongoDB Aggregation?
MongoDB aggregation is a powerful framework used to process data and return computed results. It works like SQL’s GROUP BY, JOIN, and aggregate functions combined, but more flexible.
You use the aggregate() method with stages that process documents step by step.
2. Aggregation Pipeline Structure
The aggregation pipeline is an array of stages. Each stage transforms the documents and passes them to the next stage.
db.collection.aggregate([
{ stage1 },
{ stage2 },
...
])
Some commonly used stages:
-
$match → Filters documents (like
WHEREin SQL) -
$group → Groups documents and calculates aggregates
-
$project → Includes, excludes, or reshapes fields
-
$sort → Sorts documents
-
$limit / $skip → Pagination
-
$lookup → Joins with another collection
-
$unwind → Breaks arrays into separate documents
3. Basic Example
Problem: Find the total number of orders per customer.
db.orders.aggregate([
{
$group: {
_id: "$customerId", // Group by customerId
totalOrders: { $sum: 1 } // Count orders
}
}
])
Output Example:
[
{ "_id": 101, "totalOrders": 5 },
{ "_id": 102, "totalOrders": 3 }
]
4. Using $match + $group
Problem: Find the total sales per customer for orders above $100.
db.orders.aggregate([
{
$match: { amount: { $gt: 100 } } // Filter orders > $100
},
{
$group: {
_id: "$customerId",
totalSales: { $sum: "$amount" }
}
}
])
5. $project Stage
Problem: Show only name and total fields, excluding _id.
db.orders.aggregate([
{
$project: {
_id: 0,
name: 1,
total: 1
}
}
])
6. $sort, $skip, $limit
Problem: Get top 5 customers by sales.
db.orders.aggregate([
{
$group: {
_id: "$customerId",
totalSales: { $sum: "$amount" }
}
},
{ $sort: { totalSales: -1 } }, // Descending order
{ $limit: 5 } // Top 5
])
7. $lookup (Joins in MongoDB)
Problem: Join orders with customers to get customer details.
db.orders.aggregate([
{
$lookup: {
from: "customers", // Other collection
localField: "customerId",
foreignField: "customerId",
as: "customerDetails"
}
}
])
Output Example:
[
{
"_id": 1,
"customerId": 101,
"amount": 250,
"customerDetails": [
{ "customerId": 101, "name": "John Doe", "email": "[email protected]" }
]
}
]
8. $unwind for Arrays
If customerDetails is an array, we can flatten it:
db.orders.aggregate([
{ $unwind: "$customerDetails" }
])
9. Real-Life Complex Example
Problem: Get the top 3 customers by total order amount, including their names from the customers collection.
db.orders.aggregate([
{
$group: {
_id: "$customerId",
totalSpent: { $sum: "$amount" }
}
},
{ $sort: { totalSpent: -1 } },
{ $limit: 3 },
{
$lookup: {
from: "customers",
localField: "_id",
foreignField: "customerId",
as: "customerInfo"
}
},
{ $unwind: "$customerInfo" },
{
$project: {
_id: 0,
customerId: "$_id",
name: "$customerInfo.name",
email: "$customerInfo.email",
totalSpent: 1
}
}
])
10. Key Tips
-
Always start with
$matchif possible → improves performance. -
Use
$projectat the end to return only necessary fields. -
Indexes are used in
$match,$sort, and$lookupif fields are indexed.