Aggregation Framework In MongoDB

MongoDB is an open-source, document-oriented database. Databases of this type are highly flexible: documents in the same collection can vary in structure, and a document does not have to contain every field. Among MongoDB's many features, the Aggregation Framework is one of the most powerful.

In MongoDB, the Aggregation Framework is used to process documents and return computed results. With it we can group documents and perform various operations on them, join collections within the same database, merge collections, and much more.

Let’s begin. Aggregation is expressed as a pipeline. An aggregation pipeline consists of stages; documents pass through the stages in order, and the documents output by one stage become the input of the next, until the last stage produces the final result.

To run an aggregation, use the following mongo shell command:

db.getCollection(collectionName).aggregate(pipeline)

where collectionName is the name of the collection on which the aggregation is performed, and pipeline is an array of stages, known as the aggregation pipeline.
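As a minimal sketch, a pipeline is just an array of stage documents. The collection name books and the price threshold below are illustrative assumptions based on the sample data used later in this article; the guard lets the snippet load outside the mongo shell, where db is not defined.

```javascript
// A pipeline is a plain array; each element is one stage document.
const pipeline = [
    { $match: { price: { $gte: 1000 } } }, // keep books priced at 1000 or more
    { $limit: 5 }                          // pass at most 5 documents onward
];

// Run it only when a mongo shell `db` object is available.
if (typeof db !== 'undefined') {
    db.getCollection('books').aggregate(pipeline).forEach(doc => printjson(doc));
}
```

Building the array first and passing it to aggregate() afterwards makes it easy to add, remove, or reorder stages while experimenting.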

An aggregation pipeline can contain a single stage or multiple stages.

Some of the most commonly used pipeline stage operators are:

$match : Filters documents, reducing the number of documents passed as input to the next stage.
$unwind : Deconstructs an array field of the input documents and outputs one document for each element.
$group : Groups input documents by a specified _id expression (for example, a document field).
$project : Selects only specific fields from the input documents.
$sort : Reorders the input documents.
$skip : Skips over a specified number of documents and passes the remaining documents to the next stage.
$limit : Limits the number of documents passed to the next stage.
$out : Writes the documents returned by the pipeline to a specified collection.
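To show how several of these stages can be chained, here is a hedged sketch of a hypothetical report: the second page of three books priced at 500 or more, most expensive first. The collection name books, the threshold, and the page size are assumptions chosen for illustration only.

```javascript
// Hypothetical report over the sample `books` collection.
const topBooksPipeline = [
    { $match: { price: { $gte: 500 } } },         // filter early to shrink the input
    { $project: { _id: 0, title: 1, price: 1 } }, // keep only title and price
    { $sort: { price: -1 } },                     // highest price first
    { $skip: 3 },                                 // skip the first page of three
    { $limit: 3 }                                 // return the next three documents
];

// Run it only when a mongo shell `db` object is available.
if (typeof db !== 'undefined') {
    db.getCollection('books').aggregate(topBooksPipeline).forEach(doc => printjson(doc));
}
```

Placing $match first means the later stages handle fewer documents, and $sort has to come before $skip and $limit for the paging to be stable.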

Let’s explore an aggregation pipeline with the help of an example.

Sample collections :

  
authors

    {
        ...
        _id: ObjectId("61a7759ef8fa5fefde2aec3b"),
        createdAt: 1622928483,
        books: [
            {
                bookId: ObjectId("61a7759ef8fa5fefde2aec1b"),
            },
            {
                bookId: ObjectId("61a7759ef8fa5fefde2aec2b"),
            },
        ]
        ...
    }

books

    {
        ...
        _id: ObjectId("61a7759ef8fa5fefde2aec1b"),
        title: 'A Time to Kill by John Grisham',
        description: 'All systems pro.',
        price: 1200,
        ...
    }

Lookup on an array of objects that hold an ObjectId as a foreign key

[
    {
        $unwind: { path: '$books' }
    },
    {
        $lookup: {
            from: 'books',
            localField: 'books.bookId',
            foreignField: '_id',
            as: 'books.book'
        }
    },
    {
        $unwind: { path: '$books.book' }
    },
    {
        $group: {
            _id: '$_id',
            books: {
                $push: '$books'
            }
        }
    },
    {
        $lookup: {
            from: 'authors',
            localField: '_id',
            foreignField: '_id',
            as: 'authorAndBookDetails'
        }
    },
    {
        $unwind: { path: '$authorAndBookDetails' }
    },
    {
        $addFields: { 'authorAndBookDetails.books': '$books' }
    },
    {
        $replaceRoot: { newRoot: '$authorAndBookDetails' }
    }
]

Let’s break down our query and discuss each stage in some detail.

Stage 1 :


   {
       $unwind: {
           path: '$books'
       }
   }

With the $unwind operator, we deconstruct the books array and create one output document for each of its elements.
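The effect can be imitated in plain JavaScript. The sketch below is only an in-memory illustration of what $unwind does to the sample author document, not how MongoDB implements it; plain strings such as 'author-1' are stand-ins for the ObjectId values.

```javascript
// In-memory imitation of { $unwind: { path: '$books' } } (illustration only).
// Plain strings stand in for the ObjectId values of the sample document.
const author = {
    _id: 'author-1',
    createdAt: 1622928483,
    books: [{ bookId: 'book-1' }, { bookId: 'book-2' }]
};

// One output document per array element; every other field is copied as-is.
const unwindBooks = doc => doc.books.map(book => ({ ...doc, books: book }));

const unwound = unwindBooks(author);
console.log(JSON.stringify(unwound, null, 2));
```

After this step, books is no longer an array: each of the two output documents carries a single { bookId } object alongside the unchanged author fields.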

Stage 2 :

   
   {
       $lookup: {
           from: 'books',
           localField: 'books.bookId',
           foreignField: '_id',
           as: 'books.book'
       }
   }

In this stage we use the $lookup operator to pull in data from another collection, books. $lookup performs a left outer join with a collection in the same database and stores the matching documents as an array in the field named by as (here, books.book).
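The left outer join can be sketched in memory as follows. This is an illustration under assumptions, not MongoDB's implementation: strings stand in for ObjectIds, and the stand-in books array deliberately lacks the second book to show what happens when nothing matches.

```javascript
// In-memory imitation of the $lookup stage (illustration only).
const booksCollection = [
    { _id: 'book-1', title: 'A Time to Kill by John Grisham', price: 1200 }
];

// Documents as they look after Stage 1 (strings stand in for ObjectIds).
const unwoundAuthors = [
    { _id: 'author-1', books: { bookId: 'book-1' } },
    { _id: 'author-1', books: { bookId: 'book-2' } } // no match in this stand-in data
];

// Left outer join: every input document is kept, and the matches
// (possibly none) are stored as an array under books.book.
const lookupBooks = docs => docs.map(doc => ({
    ...doc,
    books: {
        ...doc.books,
        book: booksCollection.filter(b => b._id === doc.books.bookId)
    }
}));

const joined = lookupBooks(unwoundAuthors);
console.log(JSON.stringify(joined, null, 2));
```

Both input documents survive the join; the one without a matching book simply gets an empty books.book array.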

Stage 3 :

  
   {
     $unwind: {
         path: '$books.book'
     }
   }

In this stage we use $unwind again, this time on a different field: $books.book.

At this point each document holds one author together with the details of one of that author's books.
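One consequence of this second $unwind is worth seeing concretely: an element whose lookup found nothing has an empty books.book array and therefore produces no output document. The in-memory sketch below illustrates that behaviour with stand-in string ids and abbreviated book data; it is not MongoDB's implementation.

```javascript
// In-memory imitation of { $unwind: { path: '$books.book' } } (illustration only).
const afterLookup = [
    { _id: 'author-1', books: { bookId: 'book-1', book: [{ _id: 'book-1', price: 1200 }] } },
    { _id: 'author-1', books: { bookId: 'book-2', book: [] } } // lookup found nothing
];

// Each array element becomes its own document; an empty array yields none.
const unwindBook = docs => docs.flatMap(doc =>
    doc.books.book.map(book => ({ ...doc, books: { ...doc.books, book } }))
);

const flattened = unwindBook(afterLookup);
console.log(JSON.stringify(flattened, null, 2));
```

If such documents should be kept, $unwind accepts the option preserveNullAndEmptyArrays: true next to path.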

Stage 4 :

  
   {
       $group: {
           _id: '$_id',
           books: {
               $push: '$books'
           }
       }
   }

Here we regroup the unwound documents with $group, using $_id as the group key and pushing each $books element into a books array. In other words, it collects all the books from the previous stage into one books array per author.
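The grouping step can be imitated in memory like this. It is a sketch with stand-in string ids and shortened book titles, meant only to show the shape of the output.

```javascript
// In-memory imitation of { $group: { _id: '$_id', books: { $push: '$books' } } }.
const unwoundDocs = [
    { _id: 'author-1', createdAt: 1622928483, books: { bookId: 'book-1', book: { title: 'A' } } },
    { _id: 'author-1', createdAt: 1622928483, books: { bookId: 'book-2', book: { title: 'B' } } }
];

const groupById = docs => {
    const groups = new Map();
    for (const doc of docs) {
        // Start a new group the first time a key is seen.
        if (!groups.has(doc._id)) {
            groups.set(doc._id, { _id: doc._id, books: [] });
        }
        // $push: append this document's books value to the group's array.
        groups.get(doc._id).books.push(doc.books);
    }
    return [...groups.values()];
};

const grouped = groupById(unwoundDocs);
console.log(JSON.stringify(grouped, null, 2));
```

Note that the grouped document contains only _id and books. Fields such as createdAt are not carried over, which is why the next stage reads the author document again.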

Stage 5 :

  
   {
       $lookup: {
           from: 'authors',
           localField: '_id',
           foreignField: '_id',
           as: 'authorAndBookDetails'
       }
   },
   {
       $unwind: {
           path: '$authorAndBookDetails'
       }
   }

This stage is straightforward: because $group kept only _id and books, we fetch the author document again from the authors collection and unwind the resulting single-element array.

Stage 6 :

  
   {
       $addFields: {
           'authorAndBookDetails.books': '$books'
       }
   }

In this stage, $addFields copies the books array from the root level into authorAndBookDetails.

Stage 7 :

  
   {
       $replaceRoot: {
           newRoot: '$authorAndBookDetails'
       }
   }

In this stage, we promote authorAndBookDetails to the root document, replacing all other existing fields.
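The last two stages can be imitated in memory as shown below. The input shape is an assumption about what a document looks like after Stage 5, with strings standing in for ObjectIds; the helper names addBooks and replaceRoot are hypothetical.

```javascript
// Assumed shape of a document after Stage 5 (strings stand in for ObjectIds).
const afterStage5 = {
    _id: 'author-1',
    books: [{ bookId: 'book-1', book: { title: 'A Time to Kill by John Grisham' } }],
    authorAndBookDetails: { _id: 'author-1', createdAt: 1622928483, books: [{ bookId: 'book-1' }] }
};

// $addFields: overwrite authorAndBookDetails.books with the enriched array.
const addBooks = doc => ({
    ...doc,
    authorAndBookDetails: { ...doc.authorAndBookDetails, books: doc.books }
});

// $replaceRoot: the nested document becomes the whole output document.
const replaceRoot = doc => doc.authorAndBookDetails;

const result = replaceRoot(addBooks(afterStage5));
console.log(JSON.stringify(result, null, 2));
```

The result has the original author fields at the top level, with each entry of books now carrying its joined book details.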

And here is our desired result:

  
  {
    ...
    _id: ObjectId("61a7759ef8fa5fefde2aec3b"),
    createdAt: 1622928483,
    books: [
        {
            bookId: ObjectId("61a7759ef8fa5fefde2aec1b"),
            book: {
                _id:  ObjectId("61a7759ef8fa5fefde2aec1b"),
                title: 'A Time to Kill by John Grisham',
                description: 'All systems pro.',
                price: 1200,
            }
        },
        {
            bookId: ObjectId("61a7759ef8fa5fefde2aec2b"),
            book: {
                _id:  ObjectId("61a7759ef8fa5fefde2aec2b"),
                ...
            }
        },
        
    ]
    ...
 }

Conclusion

In this article, we have seen how several stage operators can be combined in an aggregation pipeline to retrieve the desired result. Beyond these, aggregation offers many more operators that make it even more flexible and powerful, letting you reshape documents, unpack nested structures, and regroup them as needed.

Please feel free to leave any feedback in the comment section below.

References : https://docs.mongodb.com/manual/aggregation/
