Rebuild queries from domain events with multiple aggregates
I am using a DDD / CQRS / ES approach and I have some questions about modeling my aggregates and queries. Consider the following scenario as an example:
A user can create a WorkItem, change its title, and associate other users with it. The WorkItem has participants (associated users), and a participant can add Actions to the WorkItem. Participants can execute actions.
Let's assume that users have already been created and I only need their userIds.
I have the following WorkItem commands:
- CreateWorkItem
- ChangeTitle
- AddParticipant
- AddAction
- ExecuteAction
These commands must be idempotent, so I cannot add the same user or action twice.
And the following query:
- WorkItemDetails (all information for a work item)
Queries are updated by handlers that process the domain events raised by the WorkItem aggregate(s), after those events are stored in the EventStore. All of these events contain the WorkItemId. I would like to be able to rebuild queries on the fly, when needed, by loading all the relevant events and processing them in sequence. The reason is that my users usually do not access WorkItems that were created a year ago, so I do not need to keep those queries up to date. When a query is requested that does not exist yet, I could rebuild it and store it in a key / value store with a TTL.
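A minimal sketch of this rebuild-on-miss idea, assuming an in-process cache in place of a real key / value store. The event type names (WorkItemCreated, TitleChanged, ParticipantAdded, ActionAdded), the `Event` shape, and the `load_events` callback are hypothetical and not taken from any particular event store:

```python
import time
from dataclasses import dataclass


@dataclass
class Event:
    aggregate_id: str
    sequence_id: int
    type: str   # hypothetical event names, derived from the commands above
    data: dict


def project_work_item_details(events):
    """Fold events, in sequence order, into a WorkItemDetails dict."""
    details = {"title": None, "participants": [], "actions": []}
    for e in sorted(events, key=lambda e: e.sequence_id):
        if e.type in ("WorkItemCreated", "TitleChanged"):
            details["title"] = e.data["title"]
        elif e.type == "ParticipantAdded":
            details["participants"].append(e.data["userId"])
        elif e.type == "ActionAdded":
            details["actions"].append(e.data["actionId"])
    return details


class TtlCache:
    """Stand-in for a key / value store with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[0] <= self._clock():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return entry[1]

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)


def get_work_item_details(work_item_id, cache, load_events):
    """Return the cached query, rebuilding it from events on a miss."""
    details = cache.get(work_item_id)
    if details is None:
        details = project_work_item_details(load_events(work_item_id))
        cache.put(work_item_id, details)
    return details
```

Here `load_events` is whatever returns all events relevant to one WorkItem. With a single aggregate that is one stream read; with several aggregates it is the hard part the rest of this question is about.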
Domain events have an aggregateId (used as the event stream ID and shard key) and a sequenceId (used as the event ID within an event stream).
So my first attempt was to create one large aggregate called WorkItem that holds a collection of participants and a collection of actions. Participant and Action are entities that only live inside a WorkItem. A participant references a userId and an action references a participantId. They could carry more information, but that is not relevant for this exercise. With this solution, my large WorkItem aggregate can ensure that the commands are idempotent, because I can check that I am not adding duplicate participants or actions, and if I want to rebuild the WorkItemDetails query, I simply load and handle all events for the given WorkItemId.
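As an illustration of how one large aggregate can keep the commands idempotent, here is a simplified sketch. The class layout, the event tuples, and the rule that only an already-added user may add an action are assumptions drawn from the scenario above; a real event-sourced aggregate would also rebuild its state by replaying its events:

```python
class WorkItem:
    """One large aggregate: participants and actions live inside it."""

    def __init__(self, work_item_id, title):
        self.id = work_item_id
        self.title = title
        self.participants = set()   # userIds
        self.actions = {}           # actionId -> userId of the participant
        self.uncommitted = []       # events raised but not yet stored

    def _raise(self, event_type, **data):
        self.uncommitted.append((event_type, {"workItemId": self.id, **data}))

    def change_title(self, title):
        if title == self.title:
            return  # same title: nothing changes, no event
        self.title = title
        self._raise("TitleChanged", title=title)

    def add_participant(self, user_id):
        if user_id in self.participants:
            return  # duplicate command: no second event
        self.participants.add(user_id)
        self._raise("ParticipantAdded", userId=user_id)

    def add_action(self, action_id, user_id):
        if user_id not in self.participants:
            raise ValueError("only participants can add actions")
        if action_id in self.actions:
            return  # duplicate command: no second event
        self.actions[action_id] = user_id
        self._raise("ActionAdded", actionId=action_id, userId=user_id)
```

The duplicate checks are cheap only because all participants and actions are already in memory, which is exactly the cost that a command like ChangeTitle pays for nothing.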
This works great: since I only have one aggregate, the WorkItemId can be the aggregateId, so when I rebuild the query I just load all events for the given WorkItemId. However, this solution has the performance issues of a large aggregate (why load all participants and actions just to handle the ChangeTitle command?).
So my next attempt was to have separate aggregates, all with the WorkItemId as a property, but with only the WorkItem aggregate using it as its aggregateId. This fixes the performance issues, and I can still update the query because all events contain a WorkItemId. My problem now is that I cannot rebuild the query from scratch, because I don't know the aggregateIds of the other aggregates, so I cannot load their event streams and process them. They have a WorkItemId property, but it is not their real aggregateId. I also cannot guarantee that I am handling events in sequence, because each aggregate has its own event stream, but I am not sure whether that is a real problem.
Another solution I can think of is to create a dedicated event stream that consolidates all WorkItem events raised by the multiple aggregates. I could have event handlers that simply append the events raised by Participant and Action to an event stream with an ID like "{workItemId}:allevents". This stream would only be used to rebuild the WorkItemDetails query. It sounds like a hack: basically I am creating a "constellation", something that looks like an aggregate but has no business operations.
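The consolidating handler could look roughly like the sketch below. `InMemoryEventStore` is a toy stand-in rather than the API of a real event store, and the event dictionary keys and the `sourceStream` / `sourceSequenceId` fields are hypothetical. The duplicate check makes the handler safe to run again if an event is redelivered:

```python
from collections import defaultdict


class InMemoryEventStore:
    """Toy event store: one append-only list per stream ID."""

    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream_id, event):
        stream = self._streams[stream_id]
        # sequenceId is the position of the event within this stream
        stored = dict(event, sequenceId=len(stream) + 1)
        stream.append(stored)
        return stored

    def read(self, stream_id):
        return list(self._streams[stream_id])


def consolidate(store, event):
    """Copy an event from any aggregate's stream into the per-WorkItem stream."""
    target = f"{event['workItemId']}:allevents"
    source = (event["aggregateId"], event["sequenceId"])
    # Linear scan for illustration only; skip events already copied.
    for existing in store.read(target):
        if (existing["sourceStream"], existing["sourceSequenceId"]) == source:
            return
    store.append(target, {
        "type": event["type"],
        "workItemId": event["workItemId"],
        "sourceStream": event["aggregateId"],
        "sourceSequenceId": event["sequenceId"],
        "data": event["data"],
    })
```

Rebuilding WorkItemDetails then means reading the single "{workItemId}:allevents" stream in order, at the price of storing every event twice.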
What other solutions do I have? Is it unusual to rebuild queries on the fly? Can it be done when events from multiple aggregates (multiple event streams) are used to build the same query? I searched for this scenario and didn't find anything useful. I feel like I'm missing something that should be very obvious, but I can't figure out what.
Any help on this is greatly appreciated.
Thanks
I don't think you should design your aggregates with query concerns in mind. That is what the read side is for.
On the domain side, focus on consistency (how small can the aggregate be while the domain still remains consistent within a single transaction?), concurrency (how big can it be without running into concurrency issues / race conditions?) and performance (will we load thousands of objects into memory just to execute a simple command? That is exactly what you asked).
I don't see anything wrong with on-demand read models. It is basically the same as reading from a live stream, except that you recreate the stream when you need it. However, this can be quite a lot of work for little benefit, because most of the time views are queried right after they are changed. If "on demand" ends up meaning "basically every time the entity changes", you might as well subscribe to live changes. As for "old" views, the definition of "old" is that they are no longer modified, so they don't need to be recalculated anyway, whether you have an on-demand system or a continuous one.
If you go with several small aggregate roots and your read model needs information from multiple sources to update itself, you have several options:
- Enrich the emitted events with additional data.
- Read from multiple event streams and consolidate their data to build the read model. There is no magic here: the read side needs to know which aggregates are involved in a particular projection. You can also query other read models if you know they are up to date and will give you exactly the data you need.
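The second option could be sketched as follows, assuming the read side keeps its own small lookup of which aggregate streams belong to a WorkItem. `StreamIndex`, `rebuild_details`, the `read_stream` callback, and the event names and keys are all hypothetical. Streams are replayed one after another, each in sequenceId order, which is only safe here because no event type is shared between streams and set insertion does not depend on order:

```python
from collections import defaultdict


class StreamIndex:
    """Read-side lookup: which aggregate streams belong to a WorkItem."""

    def __init__(self):
        self._by_work_item = defaultdict(set)

    def on_event(self, event):
        # Fed by the live subscription; every event carries its workItemId.
        self._by_work_item[event["workItemId"]].add(event["aggregateId"])

    def streams_for(self, work_item_id):
        return sorted(self._by_work_item.get(work_item_id, ()))


def rebuild_details(work_item_id, index, read_stream):
    """Rebuild WorkItemDetails by replaying every involved stream."""
    details = {"workItemId": work_item_id, "title": None,
               "participants": set(), "actions": set()}
    for aggregate_id in index.streams_for(work_item_id):
        events = sorted(read_stream(aggregate_id), key=lambda e: e["sequenceId"])
        for e in events:
            if e["type"] in ("WorkItemCreated", "TitleChanged"):
                details["title"] = e["data"]["title"]
            elif e["type"] == "ParticipantAdded":
                details["participants"].add(e["data"]["userId"])
            elif e["type"] == "ActionAdded":
                details["actions"].add(e["data"]["actionId"])
    return details
```

The index is itself a tiny read model, so it has to be kept permanently even when the larger WorkItemDetails views are allowed to expire.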
See CQRS Events Do Not Provide Information Needed to Update the Reading Model