Aggregations Modeled as Data Types

Modeling aggregations as data types in their own right is a practice that is rarely discussed. Let us examine it more closely.

Association

Association is the cardinal concept in programming that describes the relationship between two independent data types. It implies that two classes are related, but neither class owns the other. The lifecycles of the associated objects are independent. One object can exist without the other. Association represents a general “uses-a” or “knows-a” relationship. Examples: A student enrolls in a course. A teacher teaches a student. A customer places an order.

In OOP, associations are typically expressed through method arguments. For instance, the method Professor.teach(Student) expresses a relationship between a Professor and a Student. When we strip away the syntactic sugar of OOP, this association can be understood more clearly in functional terms: teach :: Professor -> Student -> Result, where Result stands for whatever the method returns. In this form, the association is effectively modeled as a function.
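The two views of the same association can be placed side by side. This is a minimal Python sketch, not a prescribed implementation: the name fields and the string return value are assumptions made only so that the example runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    name: str  # hypothetical field, for illustration only


@dataclass(frozen=True)
class Professor:
    name: str  # hypothetical field, for illustration only

    # OOP style: the association appears only as a method argument.
    def teach(self, student: Student) -> str:
        return f"{self.name} teaches {student.name}"


# Functional style: the same association as a free function over both types.
# Neither type holds a reference to the other; their lifecycles stay independent.
def teach(professor: Professor, student: Student) -> str:
    return f"{professor.name} teaches {student.name}"
```

Both forms take a Professor and a Student and produce the same result. The method form merely hides the first argument behind the receiver.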

Aggregation

Aggregation is a special form of association that represents a “has-a” or “is-part-of” relationship. It implies a whole-part relationship where the “part” can exist independently of the “whole.” The “whole” (aggregate) object has a reference to the “part” (aggregated) object, but the “part” object’s lifecycle is not strictly tied to the “whole.” If the “whole” is destroyed, the “part” can still exist. The “part” can also be shared among multiple “wholes.” Examples: A professor belongs to a department. A playlist has a song. A team contains a player.

A common approach to implementing aggregations is through collections stored within a container object. For example, a Team class typically contains a collection of Player instances. This pattern is widely adopted and reinforced in many ORM tutorials, where relationships are often modeled in this way. At first glance, it seems intuitive: a team has a collection of players—doesn’t it?

Composition

Composition is a stronger form of association where one object contains another object, and the contained object cannot exist independently of the container. It is also a “has-a” or “is-part-of” relationship, but with strong ownership. The “whole” (composite) object owns the “part” (component) object, and the lifecycle of the “part” is tied to the “whole.” If the “whole” is destroyed, the “part” is also destroyed. A “part” cannot be shared among multiple “wholes.” Examples: A house has rooms. A book has chapters. A human has a heart.

Composition is implemented as a field of a data type: the House class contains fields of type Room. I like to refer to this relationship as an “identifying” one, as the “parts” form the identity of the “whole.”
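One way to read the “identifying” relationship is sketched below in Python. The field names (kitchen, bedroom, area_m2) and the build helper are hypothetical. The Rooms are created inside the House and never handed out separately, and dataclass equality compares a House by its Rooms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Room:
    area_m2: float  # hypothetical field, for illustration only


@dataclass(frozen=True)
class House:
    # The parts are fields of the whole: they make up its value.
    kitchen: Room
    bedroom: Room

    @classmethod
    def build(cls, kitchen_area: float, bedroom_area: float) -> "House":
        # Rooms are constructed here, so they exist only as parts of this House.
        return cls(Room(kitchen_area), Room(bedroom_area))
```

Two houses built from equal rooms compare as equal, and changing a room yields a different house. That is the sense in which the parts identify the whole.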

Critique of Aggregation Implementation

Let’s take a closer look at the Team-Player aggregation implementation.

When the Team class has a collection of Player objects, we assume an aggregation relationship between the Team and Player domain types. However, at the same time, this implies a strong, composite relationship between the team and a specific collection of players (Team-Player[]).

Using a collection in this way reflects a form of primitive obsession: a common code smell. Every time we use a collection, we implicitly omit the specification of what that collection represents. It could be all players, active players, the top five players, etc. Each of these is different, yet they are all modeled the same way: as a collection, distinguished only by the field or variable name.

From the Team’s perspective, there’s no difference between containing a Player[] or a single Player: both establish a strong, composite relationship. In the Team-Player[] model, while Team may act as an aggregate for Player, it becomes a composite with respect to the Player[]. This implicitly strengthens the relationship between these two domain types.

To summarize: using a collection to implement an aggregation between a “whole” and its “parts” effectively creates a composite relationship between the “whole” and the “part-collection.” And that’s problematic.

Aggregation Is a Composite of Whole and Parts

The correct way to model aggregations is through a data type that contains both the “whole” and the collection of “parts.”

Rather than representing aggregations as raw collections, define a data type—e.g., ActivePlayers(Team, Player[]). The team no longer directly holds a collection of players. More important, the aggregate type now clearly communicates the nature of the relationship.
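A minimal Python sketch of this idea follows. ActivePlayers comes from the text above; TopFivePlayers and its size check are hypothetical additions that show how each aggregate type can carry its own meaning and rules. The tuple representation and the name fields are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    name: str  # hypothetical field, for illustration only


@dataclass(frozen=True)
class Team:
    name: str  # note: Team has no players field at all


@dataclass(frozen=True)
class ActivePlayers:
    # The aggregation itself is a data type composed of the whole and its parts.
    team: Team
    players: tuple[Player, ...]


@dataclass(frozen=True)
class TopFivePlayers:
    # A different aggregation over the same two domain types (hypothetical).
    team: Team
    players: tuple[Player, ...]

    def __post_init__(self) -> None:
        # The type name states the intent, so the type can also enforce it.
        if len(self.players) > 5:
            raise ValueError("TopFivePlayers holds at most five players")
```

Team and Player remain independent of each other, and the composite relationship now sits where it belongs: between the aggregate type and its two fields.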
