Multi value fields and views aggregation: getting rid of unwanted duplication

Let's say that you are building out a "Person" content type that will contain several hundred people. Each person can optionally be tagged as an expert in one of several categories. Your "Person" content type fields might look something like this:

  • Name (required)
  • Bio
  • Headshot
  • Expert Categories

The view

The site that you're building requires that you show just the experts out of the entire set of People nodes. Easy, right?

Sure it is. Set up your view so that there is a filter to make sure that the Expert Categories field is filled out. I chose to test that the field is NOT NULL. Therefore, if the user entering Person nodes filled out any of the Expert Categories, that Person was an expert in that field, and they should appear on the experts list.

The problem

This method works fine with single value fields. But it completely fails with multi value fields. Why? For every Expert Category that Views finds that isn't NULL, it prints a row. Therefore, if a person is an expert in three fields, then they will show up three times.

The solution

Views aggregation. Yes, there are query settings for "Distinct" and there options in the filter to reduce duplicates. But those don't always work. The problem I ran into today involved creating a teaser view of three random experts. Once I added the Global:Random sort option, I got duplicates, even with those other options checked.

So, how can we set up Views aggregation to reduce duplicates? First, add the field "Content:Nid" to the list of fields and rearrange it so that it is on top. We are going to aggregate on this field since each of the duplicates will have the same node ID.

Next, turn on aggregation in the "Advanced" section under "Other." Check the aggregation settings for the node ID to make sure that it is set to "Group results together." Then check the aggregation settings for the other fields that you have added to make sure that "Aggregation Type" is set to "Group results together" and "Group column" is set to "Entity ID."

Read more at bojanz' helpful comment here: http://drupal.org/node/1200536#comment-4957410