MongoDb - MongoDB Aggregation

1. What is MongoDB Aggregation?

MongoDB aggregation is a powerful framework used to process data and return computed results. It works like SQL’s GROUP BY, JOIN, and aggregate functions combined, but more flexible.

You use the aggregate() method with stages that process documents step by step.


2. Aggregation Pipeline Structure

The aggregation pipeline is an array of stages. Each stage transforms the documents and passes them to the next stage.

db.collection.aggregate([
  { stage1 },
  { stage2 },
  ...
])

Some commonly used stages:

  • $match → Filters documents (like WHERE in SQL)

  • $group → Groups documents and calculates aggregates

  • $project → Includes, excludes, or reshapes fields

  • $sort → Sorts documents

  • $limit / $skip → Pagination

  • $lookup → Joins with another collection

  • $unwind → Breaks arrays into separate documents


3. Basic Example

Problem: Find the total number of orders per customer.

db.orders.aggregate([
  {
    $group: {
      _id: "$customerId",       // Group by customerId
      totalOrders: { $sum: 1 }  // Count orders
    }
  }
])

Output Example:

[
  { "_id": 101, "totalOrders": 5 },
  { "_id": 102, "totalOrders": 3 }
]

4. Using $match + $group

Problem: Find the total sales per customer for orders above $100.

db.orders.aggregate([
  { 
    $match: { amount: { $gt: 100 } }   // Filter orders > $100
  },
  {
    $group: {
      _id: "$customerId",
      totalSales: { $sum: "$amount" }
    }
  }
])

5. $project Stage

Problem: Show only name and total fields, excluding _id.

db.orders.aggregate([
  {
    $project: {
      _id: 0,
      name: 1,
      total: 1
    }
  }
])

6. $sort, $skip, $limit

Problem: Get top 5 customers by sales.

db.orders.aggregate([
  {
    $group: {
      _id: "$customerId",
      totalSales: { $sum: "$amount" }
    }
  },
  { $sort: { totalSales: -1 } },  // Descending order
  { $limit: 5 }                   // Top 5
])

7. $lookup (Joins in MongoDB)

Problem: Join orders with customers to get customer details.

db.orders.aggregate([
  {
    $lookup: {
      from: "customers",       // Other collection
      localField: "customerId",
      foreignField: "customerId",
      as: "customerDetails"
    }
  }
])

Output Example:

[
  {
    "_id": 1,
    "customerId": 101,
    "amount": 250,
    "customerDetails": [
      { "customerId": 101, "name": "John Doe", "email": "[email protected]" }
    ]
  }
]

8. $unwind for Arrays

If customerDetails is an array, we can flatten it:

db.orders.aggregate([
  { $unwind: "$customerDetails" }
])

9. Real-Life Complex Example

Problem: Get the top 3 customers by total order amount, including their names from the customers collection.

db.orders.aggregate([
  {
    $group: {
      _id: "$customerId",
      totalSpent: { $sum: "$amount" }
    }
  },
  { $sort: { totalSpent: -1 } },
  { $limit: 3 },
  {
    $lookup: {
      from: "customers",
      localField: "_id",
      foreignField: "customerId",
      as: "customerInfo"
    }
  },
  { $unwind: "$customerInfo" },
  {
    $project: {
      _id: 0,
      customerId: "$_id",
      name: "$customerInfo.name",
      email: "$customerInfo.email",
      totalSpent: 1
    }
  }
])

10. Key Tips

  • Always start with $match if possible → improves performance.

  • Use $project at the end to return only necessary fields.

  • Indexes are used in $match, $sort, and $lookup if fields are indexed.