Chapter 1: Basic Aggregation - $match and $project
Shaping documents with $project
The correct answers are the following:
- Once we specify a field to retain or perform some computation in a $project stage, we must specify all fields we wish to retain. The only exception to this is the _id field.
$project implicitly removes all other fields once we have retained, reshaped, or computed a new field. The exception to this is the _id field, which we must explicitly remove.
- Beyond simply removing and retaining fields, $project lets us add new fields.
We can add new fields and reassign the values of existing ones, shaping the documents into different datastructures and computing values using expressions.
The remaining options are incorrect.
$project can only be used once within an Aggregation pipeline.
$project cannot be used to assign new values to existing fields.
Chapter 1: Basic Aggregation - $match and $project
Lab - Computing Fields
Problem:
Our movies dataset has a lot of different documents, some with more convoluted titles than others. If we'd like to analyze our collection to find movie titles that are composed of only one word, we could fetch all the movies in the dataset and do some processing in a client application, but the Aggregation Framework allows us to do this on the server!
Using the Aggregation Framework, find a count of the number of movies that have a title composed of one word. To clarify, "Cinderella" and "3-25" should count, where as "Cast Away" would not.
Make sure you look into the $split String expression and the $size Array expression
To get the count, you can append itcount() to the end of your pipeline
db.movies.aggregate([...]).itcount()
db.movies.aggregate([
{
$match: {
title: {
$type: "string"
}
}
},
{
$project: {
title: { $split: ["$title", " "] },
_id: 0
}
},
{
$match: {
title: { $size: 1 }
}
}
]).itcount()
We begin with a $match stage, ensuring that we only allow movies where the title is a string
db.movies.aggregate([
{
$match: {
title: {
$type: "string"
}
}
},
Next is our $project stage, splitting the title on spaces. This creates an array of strings
{
$project: {
title: { $split: ["$title", " "] },
_id: 0
}
},
We use another $match stage to filter down to documents that only have one element in the newly computed title field, and use itcount() to get a count
{
$match: {
title: { $size: 1 }
}
}
]).itcount()