Welcome to the fourth article about Domain Driven Design. In the previous article, we learned about one of the primary concepts in Domain Driven Design – Entities. This time, we will take a look at another interesting and really powerful concept – Aggregates. They are often misunderstood, which results in incorrect implementations.
1. What is an Aggregate?
An aggregate is a cluster of associated objects (entities and value objects) that we treat as a unit for the purpose of data changes. In other words, the aggregate is the entire collection of objects that are connected together.
If you look at them just as clusters, you might start asking questions such as:
- Is an Aggregate really just a cluster of related objects and entities?
- What happens if an aggregate references another aggregate?
- Doesn’t this just lead to a massive Aggregate which becomes really hard to test and manage over time?
- How can an aggregate affect loading speed and memory consumption?
I will try to answer those questions along the way to make things clear(er).
Last time we started modelling our simple Task management app and ended up with a model of our Task Entity.
To spice things up let’s add some additional business requirements.
We are asked to implement the following additional functionality:
- An image can be attached to a task
- Every task can hold multiple images
- A task can be assigned to an assignee
- Multiple assignees can be assigned to a single task
2. Implementing additional functionality as a large cluster
First we need to define the two objects that will represent an image attachment and an assignee. The example below shows one possible solution. Besides the properties needed to meet the requirements listed above, I have also added a unique identifier to both objects. Not sure when and why to use IDs? I suggest reading my previous article about Entities to learn more about them.
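    import Foundation

    // Hashable is needed because both types are stored in a Set later on.
    public struct ImageAttachment: Hashable {
        public let id: UUID
        public let image: Data
    }

    public struct User: Hashable {
        public let id: UUID
        public let name: String
        public let lastName: String
        public let avatar: Data
    }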
Following the specification, the most straightforward solution would be to simply add a set of assignees and a set of attachments to TaskEntity.
    /*
     To make it easier to follow, this is a simplified version of TaskEntity
     from the previous article, containing only the code related to this topic.
     */
    public class TaskEntity {
        public let id: UUID
        public private(set) var description: String
        public private(set) var attachments: Set<ImageAttachment> = []
        public private(set) var assignees: Set<User> = []

        // Initializer added so that the simplified snippet compiles on its own.
        public init(id: UUID = UUID(), description: String) {
            self.id = id
            self.description = description
        }

        public func append(attachment: ImageAttachment) {
            attachments.insert(attachment)
        }

        public func remove(attachment: ImageAttachment) {
            attachments.remove(attachment)
        }

        public func assign(to assignee: User) {
            assignees.insert(assignee)
        }

        public func remove(assignee: User) {
            assignees.remove(assignee)
        }
    }
I have chosen a set over an array because the order of elements is not important at this stage. On top of that, we get all the benefits of sets for free; the guarantee that there are no duplicates is one of them. Even though this was not in the specification, we need to take care of such cases as well.
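The duplicate guarantee can be seen in a minimal sketch. The struct below is a cut-down stand-in for ImageAttachment, and the sample bytes are made up for illustration.

```swift
import Foundation

// Cut-down stand-in for the article's ImageAttachment.
// Hashable is required for any element stored in a Set.
struct ImageAttachment: Hashable {
    let id: UUID
    let image: Data
}

let attachment = ImageAttachment(id: UUID(), image: Data([0x01, 0x02]))

var attachments: Set<ImageAttachment> = []
let first = attachments.insert(attachment)
let second = attachments.insert(attachment) // the same value again

print(first.inserted)    // true
print(second.inserted)   // false, the set already contains it
print(attachments.count) // 1
```

Keep in mind that the synthesized Hashable conformance compares every stored property, so two attachments with the same id but different image data would count as two different elements. If identity should be decided by the id alone, a custom == and hash(into:) would be needed.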
Almost every iOS application uses some sort of list for the visual representation of data, especially when showing larger amounts of it. To be able to show data in a list, we first need some way to receive it. Usually we load it from a database, from disk, straight from the API, or from a combination of these when using repositories.
You might be wondering what any of this has to do with the model that we just created, but things will make more sense soon.
For the scope of this article I will focus only on the case of loading data from disk.
2.1 Testing environment
To put our model to the test, let’s say that we have a list of tasks in which each cell shows the description of the task and the image of one attachment. We are not interested in the content of the other attachments, nor in the assignees or their avatars.
2.2 Weaknesses of this approach
So we decided that we are happy with our first implementation and ready to push the code into production. But weeks or months later, with the app in production, users start to complain about its performance. The list feels slower and less responsive. It also takes more time to initially load all tasks.
But why? Our model looks really simple, doesn’t it?
When implementing TaskEntity in the first iteration we didn’t pay attention to scaling. If we look closely, we allow TaskEntity to accept an unlimited number of both attachments and assignees. From the specification’s point of view this is completely acceptable, even desirable, but it has unwanted side effects in our application.
Let’s say that, over time, an average user of the app has approximately 100 tasks, and every task has two assignees and three attachments. This means that when we load tasks for the list, we effectively load 200 avatars and 300 images. That’s 500(!) images that we need to load, and possibly keep in memory, only to populate the list. And we don’t even need most of them in the first place.
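The arithmetic is simple enough to write down. The counts come from the paragraph above; the average image size is purely an assumed figure, picked only to give a feeling for the order of magnitude.

```swift
// Numbers taken from the scenario described in the text.
let taskCount = 100
let assigneesPerTask = 2
let attachmentsPerTask = 3

let avatarCount = taskCount * assigneesPerTask           // 200
let attachmentImageCount = taskCount * attachmentsPerTask // 300
let totalImages = avatarCount + attachmentImageCount      // 500

// ASSUMPTION: 500 KB per image. This value is not from the article.
let assumedBytesPerImage = 500_000
let estimatedMegabytes = Double(totalImages * assumedBytesPerImage) / 1_000_000

print("Images loaded upfront: \(totalImages)")
print("Rough memory estimate: \(estimatedMegabytes) MB")
```

With a different average size the estimate changes accordingly, but the count of 500 upfront loads stays the same.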
Now imagine a real-world app where both User and ImageAttachment have additional properties, or maybe even other memory-intensive dependencies. We would need to configure and load all of those as well, even though the UI would remain exactly the same.
See where this is going? This can go out of control pretty quickly.
As we can see, the core of the issue is that TaskEntity references other entities directly: it holds concrete models of User and ImageAttachment.
3. Referencing by identity
Remember when we added a unique identifier to both User and ImageAttachment? This is where they come in really handy.
To avoid coupling other entities with TaskEntity, we can reference them by their identity instead.
    public class TaskEntity {
        public let id: UUID
        public private(set) var description: String
        public private(set) var attachments: Set<UUID> = []
        public private(set) var assignees: Set<UUID> = []

        // Initializer added so that the simplified snippet compiles on its own.
        public init(id: UUID = UUID(), description: String) {
            self.id = id
            self.description = description
        }

        public func append(attachment: UUID) {
            attachments.insert(attachment)
        }

        public func remove(attachment: UUID) {
            attachments.remove(attachment)
        }

        public func assign(to assignee: UUID) {
            assignees.insert(assignee)
        }

        public func remove(assignee: UUID) {
            assignees.remove(assignee)
        }
    }
We could simply replace the entity references in TaskEntity with UUIDs in both sets, as shown above. Notice, however, that the UUID of any entity can now be passed to the append, assign and remove methods. We can easily pass an attachment ID as an assignee, and we definitely don’t want that, since it would lead to strange behaviour in the app.
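To make the risk concrete, here is a small sketch using a cut-down version of the UUID-based TaskEntity (assignees only). The compiler has no way to tell the two kinds of ID apart, so the mistake goes unnoticed.

```swift
import Foundation

// Cut-down version of the UUID-based TaskEntity, assignees only.
final class TaskEntity {
    private(set) var assignees: Set<UUID> = []

    func assign(to assignee: UUID) {
        assignees.insert(assignee)
    }
}

let attachmentID = UUID() // meant to identify an image attachment
let task = TaskEntity()

// Compiles without any warning, although an attachment is not an assignee.
task.assign(to: attachmentID)

print(task.assignees.contains(attachmentID)) // true
```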
Since we are using Swift, I recommend using protocols to make our TaskEntity a bit less error-prone by wrapping the IDs inside of them.
    public protocol TaskAssignee {
        var id: UUID { get }
    }

    public protocol TaskAttachment {
        var id: UUID { get }
    }

    public class TaskEntity {
        public let id: UUID
        public private(set) var description: String
        public private(set) var attachments: Set<UUID> = []
        public private(set) var assignees: Set<UUID> = []

        // Initializer added so that the simplified snippet compiles on its own.
        public init(id: UUID = UUID(), description: String) {
            self.id = id
            self.description = description
        }

        // Only the ID is stored; the attachment itself is not retained.
        public func append(attachment: TaskAttachment) {
            attachments.insert(attachment.id)
        }

        public func remove(attachment: TaskAttachment) {
            attachments.remove(attachment.id)
        }

        public func assign(to assignee: TaskAssignee) {
            assignees.insert(assignee.id)
        }

        public func remove(assignee: TaskAssignee) {
            assignees.remove(assignee.id)
        }
    }
Now all we need to do is conform to both protocols so that we can keep using the same entities in combination with TaskEntity.
    public struct User: TaskAssignee {
        public let id: UUID
        public let name: String
        public let lastName: String
        public let avatar: Data
    }

    public struct ImageAttachment: TaskAttachment {
        public let id: UUID
        public let image: Data
    }
If we look at TaskEntity now, we can see that it no longer holds direct references to other entities. Instead, it holds only their IDs.
By inverting the dependencies we have made it so that ZERO(!) images (or any other objects associated with either User or ImageAttachment) need to be loaded when loading tasks. All we have are their IDs, for the cases when we need to load them later on. TaskEntity still acts as an Aggregate since it still represents the same group of information as before, just in a different form.
3.1 Bonus
Because TaskAttachment is a protocol, we can now introduce new kinds of attachments to the system and attach them to a task without the task knowing that they are different! As long as they conform to the protocol, TaskEntity can accept them. In the first example we would have needed multiple sets for the different attachment types.
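As a sketch of that idea, the DocumentAttachment type below is hypothetical and not part of the original model; it only shows that a second attachment kind goes into the same set of IDs without any change to TaskEntity.

```swift
import Foundation

protocol TaskAttachment { var id: UUID { get } }

struct ImageAttachment: TaskAttachment {
    let id: UUID
    let image: Data
}

// Hypothetical new attachment kind, introduced only for this example.
struct DocumentAttachment: TaskAttachment {
    let id: UUID
    let fileName: String
}

// Cut-down TaskEntity, attachments only.
final class TaskEntity {
    private(set) var attachments: Set<UUID> = []

    func append(attachment: TaskAttachment) {
        attachments.insert(attachment.id)
    }
}

let task = TaskEntity()
task.append(attachment: ImageAttachment(id: UUID(), image: Data()))
task.append(attachment: DocumentAttachment(id: UUID(), fileName: "spec.pdf"))

print(task.attachments.count) // 2
```

The trade-off is that the task no longer knows which kind of attachment an ID belongs to, so whoever resolves the IDs later has to know where to look them up.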
4. Analysis and results
Now that we have seen how we can model our TaskEntity as an Aggregate, it is time to put both versions to the test and see how they compare when used in an application.
For the analysis I have created a simple app which simulates loading tasks from disk by decoding a predefined JSON file (it represents one Task). For each attachment that the task has, it reads an image from disk. The same is done for each assignee.
    {
        "id": "58e0a7d7-eebc-11d8-9669-0800200c9a60",
        "title": "Lorem ipsum",
        "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua",
        "attachments": [
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a62"
            },
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a63"
            },
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a64"
            },
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a65"
            }
        ],
        "assignees": [
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a66",
                "name": "Assignee 1"
            },
            {
                "id": "58e0a7d7-eebc-11d8-9669-0800200c9a67",
                "name": "Assignee 2"
            }
        ]
    }
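A file of this shape could be decoded with Codable roughly as follows. This is only a sketch: the record type names are my own for this example, the JSON is shortened, and the loading component used in the test app may well look different.

```swift
import Foundation

// Hypothetical decoding types that mirror the JSON structure above.
struct AttachmentRecord: Decodable {
    let id: UUID
}

struct AssigneeRecord: Decodable {
    let id: UUID
    let name: String
}

struct TaskRecord: Decodable {
    let id: UUID
    let title: String
    let description: String
    let attachments: [AttachmentRecord]
    let assignees: [AssigneeRecord]
}

// Shortened version of the JSON file, inlined to keep the example self-contained.
let json = """
{
    "id": "58e0a7d7-eebc-11d8-9669-0800200c9a60",
    "title": "Lorem ipsum",
    "description": "Lorem ipsum dolor sit amet",
    "attachments": [
        { "id": "58e0a7d7-eebc-11d8-9669-0800200c9a62" },
        { "id": "58e0a7d7-eebc-11d8-9669-0800200c9a63" }
    ],
    "assignees": [
        { "id": "58e0a7d7-eebc-11d8-9669-0800200c9a66", "name": "Assignee 1" }
    ]
}
"""

let record = try JSONDecoder().decode(TaskRecord.self, from: Data(json.utf8))

// With referencing by identity, only the IDs are kept at this point.
// No image is read from disk yet.
let attachmentIDs = Set(record.attachments.map(\.id))
let assigneeIDs = Set(record.assignees.map(\.id))

print(record.title)        // Lorem ipsum
print(attachmentIDs.count) // 2
print(assigneeIDs.count)   // 1
```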
For loading itself I have created a separate component which handles all the loading – decoding and reading images from disk. Now here comes the fun part. After adding performance tests, just for loading (so no UI yet), here are some interesting results.
4.1 Loading speed performance
4.1.1 Loading 25 tasks
4.1.2 Loading 100 tasks
4.2 Memory consumption
The next thing I was interested in during the analysis was how this approach behaves when showing Tasks in a list. As mentioned earlier, we need to show each Task’s description and one attachment.
When loading models from the first iteration, we load all attachments upfront. When referencing by ID, we need to load them on demand, which is what I did in the example app: for each cell that is about to be displayed, the image is prefetched from disk, so that the UX stays the same and the user doesn’t see an empty cell.
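On-demand loading with prefetching can be sketched as a small loader that resolves an attachment ID to image data and keeps the result in memory until it is evicted. The AttachmentImageLoader type, its read closure and the in-memory "disk" are all assumptions made for this example; they are not the code of the example app.

```swift
import Foundation

// Hypothetical loader that resolves attachment IDs to image data on demand.
final class AttachmentImageLoader {
    private var cache: [UUID: Data] = [:]
    private let read: (UUID) -> Data?
    private(set) var diskReads = 0

    // `read` stands in for whatever actually reads the image from disk.
    init(read: @escaping (UUID) -> Data?) {
        self.read = read
    }

    // Returns the cached image, or reads it and caches the result.
    func image(for id: UUID) -> Data? {
        if let cached = cache[id] { return cached }
        diskReads += 1
        guard let data = read(id) else { return nil }
        cache[id] = data
        return data
    }

    // Warms the cache for cells that are about to be displayed.
    func prefetch(_ ids: [UUID]) {
        ids.forEach { _ = image(for: $0) }
    }

    // Releases an image that is no longer needed.
    func evict(_ id: UUID) {
        cache[id] = nil
    }
}

let firstID = UUID()
let secondID = UUID()
let fakeDisk: [UUID: Data] = [firstID: Data([1]), secondID: Data([2])]

let loader = AttachmentImageLoader(read: { fakeDisk[$0] })
loader.prefetch([firstID])                     // cell is about to appear
let firstImage = loader.image(for: firstID)    // served from the cache
let secondImage = loader.image(for: secondID)  // read on demand

print(loader.diskReads) // 2
loader.evict(firstID)   // cell scrolled far away, release the data
```

In a UIKit app, calls like prefetch could be driven by UITableViewDataSourcePrefetching, and a production loader would also have to deal with threading and memory warnings, which this sketch ignores.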
4.2.1 Listing 25 Tasks
4.2.2 Listing 100 Tasks
As you can see, Aggregates, especially those that reference other entities by ID, have a positive impact on both the speed and the memory consumption of the app. The reason is that we don’t load and process upfront a lot of objects that are not needed after all. The magic is that we load data only when it is needed and release it when it is not. When everything is clustered into a single entity, this is not possible.
And this is it for the first article about Aggregates. Today you have learned how to recognise Aggregates in your system and how they can affect overall app performance. I hope you enjoyed the article.
Thank you for reading!
In case of any questions or comments, feel free to contact me or leave them in the comments section below.