It's often the case that you'll use a MongoDB aggregation pipeline to group related documents. Fortunately, you can do that easily in Spring Boot with MongoTemplate.

In this guide, I'll show you a few ways that you can use the group aggregation operation. When you're finished here, you should have enough info to use it for your own requirements.

Keep in mind: if you're a TL;DR kind of person, you can just grab the code on GitHub.

The Business Requirements

Your boss Smithers walks into your office. He appears a little more mellow than usual.

"Yo," he says softly. "I need you to put in some new service requests on the CRM app."

You're not clear why Smithers seems so chilled today.

"First, we need to be able to count all contacts by source," he says almost inaudibly. "Then, we need a request that returns all distinct source values. Finally, we need to group contacts by source."

He maintains a blank look on his face for about 10 seconds. Then he walks out of your office without saying anything more.

Building on What You've Already Got

You've already created an aggregation pipeline within your Spring Boot application using MongoTemplate. I won't rehash the basics of how to make that work here.

Now it's time to add some new methods to that ContactService class you created. But first, you need a support class.

Why? Because with a couple of the groupings you'll create here, you won't return fields that are found within the Contact class. 

Take the first requirement Smithers mentioned: you need to return the number of contacts per source.

Well, there's no "number of contacts per source" field or anything like it in the Contact class. So you need a new class that handles counts.

To keep things simple here, just create an inner class within ContactService. Make it look like this:

	public static class ContactInfo {
		private Source source;
		private Long count;
		private List<String> contacts;
		
		public Source getSource() {
			return source;
		}
		public void setSource(Source source) {
			this.source = source;
		}
		public Long getCount() {
			return count;
		}
		public void setCount(Long count) {
			this.count = count;
		}
		public List<String> getContacts() {
			return contacts;
		}
		public void setContacts(List<String> contacts) {
			this.contacts = contacts;
		}		
	}

Nothing too complicated there: just a plain old Java object (POJO) with three fields: source, count, and contacts.

For the purposes of that first requirement that Smithers mentioned, the count field will hold the count per source.

Counting on You

Now that you've got that new inner class in place, create a new method:

	public List<ContactInfo> countContactsBySource() {
		AggregationOperation group = Aggregation.group("source").count().as("count");
		AggregationOperation project = Aggregation.project("count").and("source").previousOperation();
		
		Aggregation aggregation = Aggregation.newAggregation(group, project);

		List<ContactInfo> contactInfo = mongoTemplate.aggregate(aggregation, mongoTemplate.getCollectionName(Contact.class), ContactInfo.class).getMappedResults();
		
		return contactInfo;
	}

That's an aggregation pipeline with two stages. The first stage handles the grouping. The second stage handles projection.

Now, I'll take it apart bit by bit.

For the first stage, the Aggregation.group() method groups the documents by the specified field. In this case, the specified field is "source."

But what do you do with that grouping? You have options.

For this requirement, Smithers wants you to count the number of contacts specific to each source.

Fortunately, Aggregation.group() returns a GroupOperation object that lets you perform a variety of actions. One of those actions is the ability to count the documents in the group.

Unsurprisingly, the method that handles that action is called count().

But you still aren't done. Once you've counted the documents you still need to return the results. You do that by setting a specific field to the value of the count.

Remember, aggregations in MongoDB return results that look like JSON. They're technically BSON but that's a subject for a different article.

So you need that "count" property in the BSON. But how do you get it there?

You add it on the fly with that as() method you see above. In this case, .as("count") means: take the count of all documents per source and store it in a field called "count."

So now when you translate the returned BSON into a Java object, you need an object that has a property called "count."

Fortunately, you just created that. See above.

But now you've got to tell MongoDB which fields to return. That's why that project stage is necessary.

And pay attention here: the order of the fields listed in that line is important!

If you switch around the order, you'll get runtime errors.

By the way, I've already covered the project aggregation operation. I won't rehash it all here.

And yes, you need that .previousOperation() method in there. It points "source" at the _id value produced by the previous (group) operation, which is where the grouping key lives. If you leave it out, you won't see the "source" field.
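If the renaming that happens in the project stage feels abstract, here's a minimal sketch of it in plain Java. It's only an illustration: ProjectSketch and its project() method are names I made up, and a plain Map stands in for the document that comes out of the group stage.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of ContactService or Spring Data.
class ProjectSketch {

    // Mimics the project stage: keep "count", drop "_id",
    // and expose the old "_id" value under the name "source".
    static Map<String, Object> project(Map<String, Object> groupResult) {
        Map<String, Object> projected = new LinkedHashMap<>();
        projected.put("count", groupResult.get("count"));
        projected.put("source", groupResult.get("_id"));
        return projected;
    }

    public static void main(String[] args) {
        // What one document looks like after the group stage
        Map<String, Object> groupResult = new LinkedHashMap<>();
        groupResult.put("_id", "EMAIL");
        groupResult.put("count", 4L);

        System.out.println(project(groupResult)); // prints {count=4, source=EMAIL}
    }
}
```

The real work happens inside MongoDB, of course. The sketch just shows why the grouping key ends up in a field called "source" instead of "_id."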

By the way, if you want to know what this aggregation pipeline looks like on the MongoDB side of the house, here you go:

db.contacts.aggregate(
[
   {
      "$group":{
         "_id":"$source",
         "count":{
            "$sum":1
         }
      }
   },
   {
      "$project":{
         "count":1,
         "_id":0,
         "source":"$_id"
      }
   }
]
)

Run that in your MongoDB client and you'll get the same results you'll soon see here. 
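If you'd like to see what that group-and-count logic boils down to without a database in the picture, here's a minimal in-memory sketch that uses Java streams instead of MongoTemplate. Treat it as an analogy, not as the way Spring Data does it: SourceCountSketch and countBySource() are hypothetical names, and the sources are plain strings rather than the Source enum.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the CRM app: mimics $group with $sum:1 in memory.
class SourceCountSketch {

    // Groups the given source values and counts how many times each one appears.
    static Map<String, Long> countBySource(List<String> sources) {
        return sources.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),      // the grouping key, like "_id": "$source"
                        TreeMap::new,             // sorted keys, only to keep the printout predictable
                        Collectors.counting()));  // like "count": { "$sum": 1 }
    }

    public static void main(String[] args) {
        List<String> sources = List.of("EMAIL", "WALKIN", "EMAIL");
        System.out.println(countBySource(sources)); // prints {EMAIL=2, WALKIN=1}
    }
}
```

The TreeMap is just a convenience for this sketch. MongoDB itself makes no promise about the order of the documents that come out of a group stage.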

Now pay particularly close attention to this line:

List<ContactInfo> contactInfo = mongoTemplate.aggregate(aggregation, mongoTemplate.getCollectionName(Contact.class), ContactInfo.class).getMappedResults();

You see that third parameter? It's ContactInfo.class, not Contact.class.

Why? Because you aren't returning a list of contacts this time. Instead, you're returning info about the contacts. That's why you had to create the new class in the first place.

Everything else stays the same.

Your First Test

Okay, now it's time to test it out. As is usually the case around here, you can do that easily with some initialization code.

But first, let's assume that this is the data you currently have in your contacts database:

[ {
  "id" : "5fde12d60ab013769b67cf02",
  "firstName" : "Chew",
  "lastName" : "Bacca",
  "email" : "chewie@xmail.com",
  "source" : "WEBSITE_FORM",
  "status" : "ACTIVE",
  "statusChange" : 0,
  "linesOfBusiness" : [ "ANGULAR" ],
  "company" : "Working for Han",
  "title" : "Wookie",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fde1028792009283c603929",
  "firstName" : "JarJar",
  "lastName" : "Binks",
  "email" : "jarjar@xmail.com",
  "addresses" : [ {
    "street1" : "1400 Plum Way",
    "city" : "Onisius",
    "state" : "NM",
    "zip" : "80909",
    "addressType" : "HOME"
  } ],
  "source" : "WALKIN",
  "status" : "CONTACTED",
  "statusChange" : 0,
  "linesOfBusiness" : [ "FULL_STACK" ],
  "company" : "None",
  "title" : "Comic Relief",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fde117edd79e20e3ff6528c",
  "firstName" : "Lando",
  "lastName" : "Calrissian",
  "email" : "lando@xmail.com",
  "phones" : [ {
    "phone" : "(555) 555-5555",
    "phoneType" : "WORK",
    "countryCode" : "us"
  } ],
  "source" : "INBOUND_SALES_CALL",
  "status" : "CONTACTED",
  "statusChange" : 0,
  "linesOfBusiness" : [ "ANGULAR", "DEV_OPS" ],
  "company" : "Cloud City",
  "title" : "Friend",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fde1ac084dad94dbb7f82ae",
  "firstName" : "R2D2",
  "lastName" : "Droid",
  "email" : "r2d2@xmail.com",
  "source" : "EMAIL",
  "status" : "ACTIVE",
  "statusChange" : 0,
  "linesOfBusiness" : [ "JAVA_ENTERPRISE" ],
  "company" : "For Luke",
  "title" : "Droid",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fdd0af34e9d6806f369abf0",
  "firstName" : "Boba",
  "lastName" : "Fett",
  "email" : "boba@xmail.com",
  "phones" : [ {
    "phone" : "(555) 555-5555",
    "phoneType" : "HOME",
    "countryCode" : "us"
  } ],
  "addresses" : [ {
    "street1" : "1222 Galaxy Way",
    "city" : "Alterion",
    "state" : "AR",
    "zip" : "22222",
    "country" : "US",
    "addressType" : "HOME"
  } ],
  "source" : "INBOUND_SALES_CALL",
  "status" : "CONTACTED",
  "statusChange" : 0,
  "linesOfBusiness" : [ "DEV_OPS" ],
  "company" : "Empire",
  "title" : "Bounty Hunter",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fdd0cedaac5f75d62564ee7",
  "firstName" : "Jabba",
  "lastName" : "Hutt",
  "email" : "jabba@xmail.com",
  "source" : "EMAIL",
  "status" : "NEW",
  "statusChange" : 0,
  "linesOfBusiness" : [ "ANGULAR" ],
  "company" : "Sandz",
  "title" : "Large",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fdd0e7c870ef4713e179384",
  "firstName" : "Princess",
  "lastName" : "Leia",
  "email" : "leia@xmail.com",
  "phones" : [ {
    "phone" : "(555) 555-5555",
    "phoneType" : "WORK",
    "countryCode" : "us"
  } ],
  "source" : "WALKIN",
  "status" : "INTERESTED",
  "statusChange" : 0,
  "linesOfBusiness" : [ "JAVA_ENTERPRISE", "ANGULAR" ],
  "company" : "Republic",
  "title" : "Princess",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fd5f37bde602d3bacef69db",
  "firstName" : "Luke",
  "lastName" : "Skywalker",
  "email" : "luke@tat2.com",
  "source" : "EMAIL",
  "sourceDetails" : "He emailed me",
  "status" : "NEW",
  "statusChange" : 0,
  "linesOfBusiness" : [ "ANGULAR" ],
  "company" : "International Business",
  "title" : "President",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
}, {
  "id" : "5fdd0f2aea599836ca3ddbf1",
  "firstName" : "Han",
  "lastName" : "Solo",
  "email" : "han@xmail.com",
  "addresses" : [ {
    "street1" : "111 Millennium Way",
    "city" : "Nessy",
    "state" : "CO",
    "addressType" : "HOME"
  } ],
  "source" : "EMAIL",
  "status" : "ACTIVE",
  "statusChange" : 0,
  "title" : "Pirate",
  "authority" : false,
  "salesOwner" : {
    "id" : "5f78d8fbc1d3246ab4303f2b",
    "firstName" : "Darth",
    "lastName" : "Vader",
    "email" : "darth@xmail.com",
    "username" : "darth",
    "phoneNumber" : "474-555-1212"
  }
} ]

Now you want to get a count of each of those contacts by source. 

Create an initialization class that looks like this:

@Component
public class ApplicationListenerInitialize implements ApplicationListener<ApplicationReadyEvent>  {
	
	@Autowired
	private ContactService contactService;
	
    public void onApplicationEvent(ApplicationReadyEvent event) {
    	List<ContactInfo> contactInfo = contactService.countContactsBySource();
    	
    	try {
	    	ObjectMapper objectMapper = new ObjectMapper();
	    	// NON_EMPTY also leaves out nulls, and a second call would just replace the first setting
	    	objectMapper.setSerializationInclusion(JsonInclude.Include.NON_EMPTY);
	    	System.err.println(objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(contactInfo));
    	} catch (Exception e) {
    		e.printStackTrace();
    	}
    }
}

Now launch your Spring Boot application. Once everything is loaded, you should see this output in a nice shade of red:

[ {
  "source" : "WEBSITE_FORM",
  "count" : 1
}, {
  "source" : "WALKIN",
  "count" : 2
}, {
  "source" : "INBOUND_SALES_CALL",
  "count" : 2
}, {
  "source" : "EMAIL",
  "count" : 4
} ]

Do the math from the previous output and you'll find it adds up perfectly.

Getting a Unique List

But wait! There's more!

You can do other stuff with that group operation as well. One thing you can do is return a unique list.

Let's say you just want a unique list of all sources. Use group to make that happen.

Here's a method that will do just that:

	public List<Contact> findDistinctSourceValues() {
		AggregationOperation group = Aggregation.group("source");
		
		Aggregation aggregation = Aggregation.newAggregation(group);

		List<Contact> contacts = mongoTemplate.aggregate(aggregation, mongoTemplate.getCollectionName(Contact.class), Contact.class).getMappedResults();
		
		return contacts;
	}

Not much of a pipeline there, eh? Just one stage.

And that's all you need.

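Also, you can use Contact.class instead of ContactInfo.class here. A group stage returns the grouping key in the _id field, and that maps to the id field of the Contact class. That's why you'll see the source values listed under "id" in the output below.

You could also use ContactInfo.class, but then you'd want a project stage like the one in countContactsBySource() so the _id value lands in the source field.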

Now update the initialization code to use this method:

@Component
public class ApplicationListenerInitialize implements ApplicationListener<ApplicationReadyEvent>  {
	
	@Autowired
	private ContactService contactService;
	
    public void onApplicationEvent(ApplicationReadyEvent event) {
    	List<Contact> contacts = contactService.findDistinctSourceValues();
    	
    	try {
	    	ObjectMapper objectMapper = new ObjectMapper();
	    	// NON_EMPTY also leaves out nulls, and a second call would just replace the first setting
	    	objectMapper.setSerializationInclusion(JsonInclude.Include.NON_EMPTY);
	    	System.err.println(objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(contacts));
    	} catch (Exception e) {
    		e.printStackTrace();
    	}
    }
}

Run your Spring Boot application again and you should see this result:

[ {
  "id" : "WEBSITE_FORM"
}, {
  "id" : "WALKIN"
}, {
  "id" : "INBOUND_SALES_CALL"
}, {
  "id" : "EMAIL"
} ]

Boom. That's exactly what you'd expect.

By the way, if you'd like to run that same aggregation on your MongoDB client, here's what it would look like:

 db.contacts.aggregate([{ "$group" : { "_id" : "$source"}}]);

Once again, it's really simple.
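For comparison, here's the same idea as an in-memory sketch with Java streams. Grouping on a field with no accumulators leaves you with one result per distinct value, which is roughly what distinct() does to a stream. The DistinctSourceSketch class and distinctSources() method are hypothetical names used only for this illustration. They aren't part of ContactService.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: an in-memory analogy for grouping on "source" with no accumulators.
class DistinctSourceSketch {

    // Returns each source value once, in the order it was first seen.
    // (MongoDB's group stage makes no such ordering promise.)
    static List<String> distinctSources(List<String> sources) {
        return sources.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sources = List.of("EMAIL", "WALKIN", "EMAIL", "WEBSITE_FORM");
        System.out.println(distinctSources(sources)); // prints [EMAIL, WALKIN, WEBSITE_FORM]
    }
}
```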

How About an Actual Grouping?

If you're anything like me, you're probably thinking to yourself: "Hey, how about a grouping that actually groups stuff?"

Yeah. You can do that too.

Let's say, for example, that you'd like to group all contacts by source. But you don't need all their info. Just their names.

Get thee back into ContactService and add this method:

	public List<ContactInfo> groupContactsBySource() {
		AggregationOperation sort = Aggregation.sort(Direction.ASC, "lastName"); 
		AggregationOperation fullName = Aggregation.project("source").and("firstName").concat(" ", Aggregation.fields("lastName")).as("fullName");
		AggregationOperation group = Aggregation.group("source").push("fullName").as("contacts");
		AggregationOperation project = Aggregation.project("contacts").and("source").previousOperation();
		
		Aggregation aggregation = Aggregation.newAggregation(sort, fullName, group, project);

		List<ContactInfo> contactInfo = mongoTemplate.aggregate(aggregation, mongoTemplate.getCollectionName(Contact.class), ContactInfo.class).getMappedResults();
		
		return contactInfo;
	}

And now I must explain stuff.

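The first stage (sort) sorts all contacts in the database by last name. Because the sort happens before the grouping, the names within each group come out in last-name order. That makes it easy to find the ones you're looking for.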

The second stage (fullName) uses project to grab the source for each contact. Because that's the field you'll use for grouping.

But that stage doesn't stop there, does it? Nope.

That stage also creates a new field called "fullName." It's a concatenation of the contact's first name and last name with a space in between.

The "fullName" field gets passed to the next stage.

That next stage (group) is where the grouping happens. For each source, the aggregation will create an array of elements associated with that source.

What's in the array? It's a list of strings, each one representing a contact's full name. That's the "fullName" field created in the previous stage.

You see the method push() there because that's what the stage is doing: it's literally pushing a full name string onto the array. It does that for each name that's associated with the source.

And then it saves the array as a field called "contacts." That's why the as() method is there.

Now that you've got the grouping, you just need to capture it so you can spit it out to the calling client. That's what the last stage (project) does.

If you run that same aggregation pipeline on your MongoDB client, it will look like this:

db.contacts.aggregate(
[
   {
      "$sort":{
         "lastName":1
      }
   },
   {
      "$project":{
         "source":1,
         "fullName":{
            "$concat":[
               "$firstName",
               " ",
               "$lastName"
            ]
         }
      }
   },
   {
      "$group":{
         "_id":"$source",
         "contacts":{
            "$push":"$fullName"
         }
      }
   },
   {
      "$project":{
         "contacts":1,
         "_id":0,
         "source":"$_id"
      }
   }
]
)
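Here's a rough in-memory equivalent of those four stages, written with Java streams. It's a sketch for building intuition and not what MongoTemplate does under the hood. GroupNamesSketch, Person, and namesBySource() are all hypothetical names, and Person carries only the three fields the pipeline touches.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration: sort, concat, group, and push done in memory.
class GroupNamesSketch {

    // Stand-in for the Contact document, with sources as plain strings.
    static class Person {
        private final String firstName;
        private final String lastName;
        private final String source;

        Person(String firstName, String lastName, String source) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.source = source;
        }

        String getLastName() { return lastName; }
        String getSource() { return source; }

        // Plays the role of the $concat expression
        String fullName() { return firstName + " " + lastName; }
    }

    static Map<String, List<String>> namesBySource(List<Person> people) {
        return people.stream()
                .sorted(Comparator.comparing(Person::getLastName)) // like the $sort stage
                .collect(Collectors.groupingBy(
                        Person::getSource,                         // the grouping key
                        TreeMap::new,                              // sorted keys for a predictable printout
                        Collectors.mapping(Person::fullName,       // build the full name...
                                Collectors.toList())));            // ...and "push" it onto the list
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person("Han", "Solo", "EMAIL"),
                new Person("Lando", "Calrissian", "INBOUND_SALES_CALL"),
                new Person("Jabba", "Hutt", "EMAIL"));

        // prints {EMAIL=[Jabba Hutt, Han Solo], INBOUND_SALES_CALL=[Lando Calrissian]}
        System.out.println(namesBySource(people));
    }
}
```

Notice that Jabba lands ahead of Han in the EMAIL list. That's the sort by last name doing its job before the grouping kicks in.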

Now save what you got and add this initialization code:

@Component
public class ApplicationListenerInitialize implements ApplicationListener<ApplicationReadyEvent>  {
	
	@Autowired
	private ContactService contactService;
	
    public void onApplicationEvent(ApplicationReadyEvent event) {
    	List<ContactInfo> contactInfo = contactService.groupContactsBySource();
    	
    	try {
	    	ObjectMapper objectMapper = new ObjectMapper();
	    	// NON_EMPTY also leaves out nulls, and a second call would just replace the first setting
	    	objectMapper.setSerializationInclusion(JsonInclude.Include.NON_EMPTY);
	    	System.err.println(objectMapper.writerWithDefaultPrettyPrinter().writeValueAsString(contactInfo));
    	} catch (Exception e) {
    		e.printStackTrace();
    	}
    }
}

Save that and restart Spring Boot. After everything loads, you should see this output:

[ {
  "source" : "EMAIL",
  "contacts" : [ "R2D2 Droid", "Jabba Hutt", "Luke Skywalker", "Han Solo" ]
}, {
  "source" : "INBOUND_SALES_CALL",
  "contacts" : [ "Lando Calrissian", "Boba Fett" ]
}, {
  "source" : "WALKIN",
  "contacts" : [ "JarJar Binks", "Princess Leia" ]
}, {
  "source" : "WEBSITE_FORM",
  "contacts" : [ "Chew Bacca" ]
} ]

Beautiful! It works!

Wrapping It Up

Those are just some of the things you can do with the group aggregation operation. Why don't you try a few more things?

Also, think about ways you can produce more complex groupings than the one you just created. For example, add some additional info about the contact.

Keep in mind: you can always just grab the source code from GitHub.

Have fun!
