Event Design Patterns
It's incredible how much Thoughtworks contributes to our industry as a whole. Since Martin Fowler is the chief scientist at TW, people get the mistaken idea that he's the only one making progress there.
Last night I learnt about some work by Fowler and Ian Cartwright on Event Design Patterns. What's that? Well, they established some patterns based on problems they've run into in very real scenarios. The design patterns they talk about are:
- Event Sourcing
- Event Collaboration
- Parallel Model
- Retroactive Event
There are both benefits and worries added here. A more traditional Request/Response approach has its own benefits and worries to care for (tradeoffs, anyone?), so I'll just add my POV on the subject, as someone who has been discussing how I want my NMVP Plugins to behave. I deeply believe that what I had in mind for the Plugin collaboration was captured very accurately by Fowler and Cartwright (cool calling them like that, right? Makes them sound like friends, lol). As a sidenote, I hope Cartwright keeps on blogging, so we can see more insightful stuff.
I'll follow the same order they do in the video, which BTW I expect you've watched before reading this. I won't discuss the Parallel Model or Retroactive Event patterns, only the first two. I'll probably discuss the other two in another post.
Event Sourcing
Well, I really like the idea behind event sourcing: domain entities get hooked up to anyone who cares about their state, and then broadcast events on state changes. This sounds very hard to achieve, but if you checked the video you can see that it's not. The other two advantages implied in this pattern, though, are a whole different game.
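To make the "hook up and broadcast" part concrete, here's a minimal sketch in Python. It is not code from the video: the `CurrencyPair` entity, the `RateChanged` event and the `subscribe` method are all names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RateChanged:
    """Hypothetical event: the rate of a currency pair changed."""
    pair: str
    old_rate: float
    new_rate: float


class CurrencyPair:
    """Hypothetical domain entity that announces its own state changes."""

    def __init__(self, pair: str, rate: float) -> None:
        self.pair = pair
        self.rate = rate
        self._listeners: List[Callable[[RateChanged], None]] = []

    def subscribe(self, listener: Callable[[RateChanged], None]) -> None:
        # Anyone who cares about this entity's state hooks up here.
        self._listeners.append(listener)

    def change_rate(self, new_rate: float) -> None:
        # Record the change as an event, apply it, then broadcast it.
        event = RateChanged(self.pair, self.rate, new_rate)
        self.rate = new_rate
        for listener in list(self._listeners):
            listener(event)


log: List[RateChanged] = []
usd_brl = CurrencyPair("USD/BRL", 2.10)
usd_brl.subscribe(log.append)  # here the listener simply keeps every event
usd_brl.change_rate(2.05)
```

Because the only listener here keeps every event it receives, `log` ends up as the ordered history of the entity, which is the raw material for the rebuilding and reversing discussed next.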
Rebuilding Domain Entities and Reversing Events will hardly be easy to implement. I'll tackle them one at a time.
Domain Entities are more often than not interdependent and dependent on external factors. Take an exchange application in which I want to buy some currency because the rate is low and I'd like to take advantage of that. My application changes the state of some objects that are hooked up to my currency-buying service (call it what you like), so when the application broadcasts the CurrencyBought event (illustrative), the Service makes a financial transaction that will hardly be "undoable". Cartwright tackles this issue, but I don't think he tackles it enough. Being able to undo stuff has a REALLY high price for most enterprise applications. I do believe there are ways to mitigate this cost and I am VERY interested in conducting experiments in this field, but as of now I don't buy this as a real-scale advantage of the Event Sourcing Pattern.
As he points out at around 3:18, there's the VERY REAL performance gain brought by the Event Sourcing pattern, since everything is now done on demand. This is really inspiring, but I still have some concerns about it:
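A tiny sketch of why the undo is the expensive part. Everything here is hypothetical (the `wallet` dictionary, the `SettlementService` class, the `apply`/`reverse` functions); the point is only that reversing the local state is easy, while the external side effect stays where it is.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CurrencyBought:
    """Illustrative event; whole units keep the arithmetic exact."""
    usd_bought: int
    brl_paid: int


def apply(wallet: Dict[str, int], event: CurrencyBought) -> None:
    wallet["brl"] -= event.brl_paid
    wallet["usd"] += event.usd_bought


def reverse(wallet: Dict[str, int], event: CurrencyBought) -> None:
    # The inverse of apply(): trivial for state we own.
    wallet["brl"] += event.brl_paid
    wallet["usd"] -= event.usd_bought


class SettlementService:
    """Hypothetical stand-in for the external party that moves real money."""

    def __init__(self) -> None:
        self.settled: List[CurrencyBought] = []

    def on_currency_bought(self, event: CurrencyBought) -> None:
        self.settled.append(event)  # imagine a real bank transfer here


wallet = {"brl": 1000, "usd": 0}
settlement = SettlementService()
purchase = CurrencyBought(usd_bought=100, brl_paid=205)

apply(wallet, purchase)
settlement.on_currency_bought(purchase)
reverse(wallet, purchase)  # local state is back to where it started...
# ...but settlement.settled still holds the transaction.
```

To really undo the purchase you'd need some compensating action on the external side (a refund, a counter-trade), and that is where the cost comes from.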
Persistence
Both the snapshot and the event sequence must be persisted in case of hardware failure or something like that, so you can redo the steps by replaying some sequence of Events. Well, replaying itself has some issues, like the Time Validity issue, but let's save that for later.
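The snapshot-plus-replay idea can be sketched like this. The event shape (sequence number, account, delta) and the `rebuild` function are assumptions of mine, just to show the mechanics.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (sequence number, account, balance delta)
Event = Tuple[int, str, int]


def apply(state: Dict[str, int], event: Event) -> Dict[str, int]:
    """Return a new state with one event applied."""
    _, account, delta = event
    new_state = dict(state)
    new_state[account] = new_state.get(account, 0) + delta
    return new_state


def rebuild(snapshot_state: Dict[str, int], snapshot_seq: int,
            events: List[Event]) -> Dict[str, int]:
    """Start from a snapshot and replay only the events recorded after it."""
    state = dict(snapshot_state)
    for event in events:
        if event[0] > snapshot_seq:
            state = apply(state, event)
    return state


events = [(1, "alice", 100), (2, "bob", 50), (3, "alice", -30), (4, "bob", 20)]
snapshot = ({"alice": 100, "bob": 50}, 2)  # state as of event 2

rebuilt = rebuild(snapshot[0], snapshot[1], events)
from_scratch = rebuild({}, 0, events)  # same result, just more replaying
```

The snapshot is only an optimization: replaying the whole log from an empty state gives the same answer, so the event sequence is the part you really can't afford to lose.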
I find it very difficult to "patternize" persistence of this information, even though I guess people with more experience in this field would be able to. So, I won't try to come up with some way to do it. Rather I'll just discuss the main benefit and drawback of it IMHO, if you can get it right.
Well, you just bought yourself the best auditing trail available out there. It's the Ferrari of Auditing Trails. You'll always know how the system behaved at any point in time, but this comes at a very real cost. Since one of the main goals of this approach is performance, you'd have to have some really clever way of persisting this information, since as we all know, persistence is expensive. If you rely on plain old DB or ORM persistence, you're screwed, since the associated cost of persisting the events would pretty much render the whole thing pointless. So you'd have to come up with some amazing way of persisting stuff to a data store (pretty much what modern DBs do, only persisting data to disk at regular intervals instead of at the moment you ask them to, so they get to do it in batches). This is far from trivial anyway.
Another advantage that Cartwright points out is the ability to replay, in a development or acceptance environment, some event sequence that happened in production. That might work for some scenarios, but for most enterprise applications it probably won't, since you'd need an environment VERY similar to the one in production, with the same data store, the same application state and the same code (which is hard per se, but easily manageable with a good SCM strategy). In the company I work for we have a customer with a data store holding about 60% of all the telecom lawsuits in Brazil. So you can imagine that it's a lot of data, right? That wouldn't be easy to replicate, and a lot of the time, if you can't replicate it, the event sequence probably won't make sense anyway. I'd really like to hear more about the How and not the Why on this one. I find this a VERY nice feature to add to my applications, but I also find it a rather hard one to achieve.
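Just to illustrate the batching idea (and not to claim this is how a real database does it), here's a rough sketch. `BatchingEventLog` and its `write_batch` callback are hypothetical; the callback stands in for whatever actually hits the disk.

```python
import json
from typing import Callable, List


class BatchingEventLog:
    """Buffers serialized events and hands them to storage in batches."""

    def __init__(self, write_batch: Callable[[List[str]], None],
                 batch_size: int = 3) -> None:
        self._write_batch = write_batch  # assumed to do one expensive write
        self._batch_size = batch_size
        self._buffer: List[str] = []

    def append(self, event: dict) -> None:
        self._buffer.append(json.dumps(event))
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._write_batch(self._buffer)
            self._buffer = []  # start a fresh buffer, the old list is handed off


batches: List[List[str]] = []  # in-memory stand-in for the real store
event_log = BatchingEventLog(batches.append, batch_size=3)

for seq in range(1, 8):
    event_log.append({"seq": seq, "type": "CurrencyBought"})
event_log.flush()  # push out whatever is left over
```

Seven events turn into three writes instead of seven. The catch is the obvious one: anything still sitting in the buffer when the machine dies is gone, which is exactly the kind of exception that makes this non-trivial.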
Complexity
My major concern here is that Event Sourcing introduces some level of complexity, since all the hooking up has gotta be done. The main issue with this is that it's indirect complexity: "Who's listening to my PurchaseOrderRequired event? Oh there's these 2 guys, well I'll change it and notify them."
Well, if you get that right you probably won't have an issue, but if the hooking network starts to grow and become more intricate, that spells trouble right there: you built a very decoupled application and you'll still have a hard time changing it.
Of course there are ways to mitigate this, but I'd like to see more of how to avoid it within the pattern, since this would be my best guess at the biggest pitfall in using this pattern.
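One cheap mitigation, sketched under my own assumptions (the `EventBus` class and its `listeners` method are made up, not part of the pattern as presented): route every subscription through one place that can answer the "who's listening?" question.

```python
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical bus that keeps subscriptions inspectable."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, event_name: str,
                  handler: Callable[[dict], None]) -> None:
        self._handlers.setdefault(event_name, []).append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        for handler in list(self._handlers.get(event_name, [])):
            handler(payload)

    def listeners(self, event_name: str) -> List[str]:
        # Answers "who's listening to this event?" before you change it.
        return [h.__name__ for h in self._handlers.get(event_name, [])]


def reserve_stock(payload: dict) -> None:
    payload.setdefault("handled_by", []).append("reserve_stock")


def notify_supplier(payload: dict) -> None:
    payload.setdefault("handled_by", []).append("notify_supplier")


bus = EventBus()
bus.subscribe("PurchaseOrderRequired", reserve_stock)
bus.subscribe("PurchaseOrderRequired", notify_supplier)

order = {"sku": "A-1"}
bus.publish("PurchaseOrderRequired", order)
who = bus.listeners("PurchaseOrderRequired")
```

This only helps inside one process, and only if nobody wires listeners behind the bus's back, so take it as a starting point rather than a cure.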
I completely agree with him on the complexity introduced by the Restart of the application in the case of an unexpected failure. How would the application know it's a restart? When the failure happened what events were already completed? Can you just re-execute them?
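One possible (and partial) answer to those questions is to keep a durable checkpoint of the last event that was fully handled. The sketch below is my assumption of how that could look; `CheckpointedProcessor` is a made-up name and the `checkpoint` dictionary stands in for durable storage.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (sequence number, payload)
Event = Tuple[int, str]


class CheckpointedProcessor:
    """Skips events that were already completed before a restart."""

    def __init__(self, checkpoint: Dict[str, int],
                 handler: Callable[[str], None]) -> None:
        self._checkpoint = checkpoint  # stands in for durable storage
        self._handler = handler

    def process(self, events: List[Event]) -> None:
        last_done = self._checkpoint.get("last_seq", 0)
        for seq, payload in events:
            if seq <= last_done:
                continue  # completed before the failure, don't re-execute
            self._handler(payload)
            self._checkpoint["last_seq"] = seq
            last_done = seq


all_events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
durable: Dict[str, int] = {}
handled: List[str] = []

# First run gets through two events, then the process "dies".
CheckpointedProcessor(durable, handled.append).process(all_events[:2])

# After the restart a new instance sees the full log again.
CheckpointedProcessor(durable, handled.append).process(all_events)
```

Note the gap this leaves open: if the failure happens after the handler ran but before the checkpoint was written, that event gets executed twice on restart, so the handlers would have to tolerate that.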
Conclusion
I really like the Event Sourcing pattern, but I'd still have to dig into its implementation strategies and the complexity introduced by it to be able to give a more detailed opinion on it.
Event Collaboration
I did some work on Event Collaboration in my last couple of posts. I don't actually know if it qualifies, since the pattern is mostly aimed at Domain Objects and mine is aimed at plug-ins for the Presenters of the Interface Layer, but anyway.
As you can see in my last post, the plug-in infrastructure is all based on events, and each plugin just hooks into the events of other Plugins that tell it about some state change it's interested in. This model is far superior to the Request/Reply one, which would introduce a lot of complexity.
If you'd like to discuss the variation I want to use in NMVP, please reach out to me, but here I'll stick to the design pattern in question, which concerns domain entities.
Well, I really love the idea of events replacing commands, since it's a much cleaner solution, but it has a cost: commands are explicit and events are implicit. Who's listening to the CreateOrder event? So if I change it, what's the impact? It can be daunting to figure that out in distributed applications (assuming the event model would extend itself to distributed applications). I guess, as always, you'd have to weigh the pros and cons of this approach.
Now, Queries disappearing is both my best dream and worst nightmare. I do understand that if I followed the event lifecycle correctly, by now I have the most up-to-date instance of the domain entity. That's cool and all, but how about the stored version? How often do I update it? The update process can have a high cost on performance, so as I said before, you'd have to devise some sort of batching mechanism like modern databases have, and if you've worked on one of those you'd know it's not really trivial to build (lots of exceptions).
This non-query scenario becomes even more complicated when you consider distributed apps. Who must do the above? How can I get an update on something that is not happening in my app domain? I'd love to discuss these issues further.
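Here's what "no queries" looks like in the small, as a sketch with made-up names (`ShippingService`, the `AddressChanged` event as a plain dictionary). The service never asks anyone for a customer's address; it keeps its own copy current from the events it receives.

```python
from typing import Dict


class ShippingService:
    """Hypothetical collaborator that keeps a local copy instead of querying."""

    def __init__(self) -> None:
        self._addresses: Dict[str, str] = {}

    def on_address_changed(self, event: dict) -> None:
        # Every AddressChanged event refreshes the local copy.
        self._addresses[event["customer_id"]] = event["address"]

    def label_for(self, customer_id: str) -> str:
        # Answered from local state, no request to the customer component.
        return self._addresses[customer_id]


shipping = ShippingService()
shipping.on_address_changed({"customer_id": "c-1", "address": "Rua A, 10"})
shipping.on_address_changed({"customer_id": "c-1", "address": "Rua B, 25"})
label = shipping.label_for("c-1")
```

The sketch dodges exactly the questions raised above: the local copy lives in memory only, and it's only as fresh as the last event that actually arrived.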
The Time of Value issue is exactly why I don't really buy the idea of taking some sequence of events out of their context (the production environment at a given point in time) and replaying them in a completely different context, with a few exceptions.
Time of Value states that some event lifecycle probably only has meaningful value during a given period of time, so it becomes pointless to repeat it after it has expired. I couldn't agree more with that notion.
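A small sketch of how that could be made explicit, assuming each event carries its own validity window. The `QuoteIssued` event, its fields and the dates are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class QuoteIssued:
    """Hypothetical event that is only meaningful for a limited time."""
    pair: str
    issued_at: datetime
    valid_for: timedelta

    def is_valid_at(self, moment: datetime) -> bool:
        return self.issued_at <= moment < self.issued_at + self.valid_for


def still_meaningful(events: List[QuoteIssued],
                     replay_time: datetime) -> List[QuoteIssued]:
    """Keep only the events whose validity window covers the replay time."""
    return [event for event in events if event.is_valid_at(replay_time)]


issued = datetime(2008, 1, 1, 10, 0)
events = [
    QuoteIssued("USD/BRL", issued, timedelta(minutes=15)),
    QuoteIssued("EUR/BRL", issued, timedelta(days=30)),
]

# Replaying a day later: the 15-minute quote has expired.
replayable = still_meaningful(events, datetime(2008, 1, 2, 10, 0))
```

A filter like this at least makes the replay honest about which events it had to drop, even if it can't bring their context back.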
Just as a closing issue for me, and one very well covered by Cartwright: the cascading of events. When you start to have an increasingly complex event lifecycle it's very easy to get yourself into an event loop, and it's VERY hard to spot one, even more so in distributed apps. Since Mr. Cartwright covered this so nicely I'll just skip this. Check out some of his advice on avoiding this.
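This is not Cartwright's advice, just one naive guard I can think of: let every event carry the chain of events that caused it, and refuse to publish an event type that is already in its own chain. `TracingBus` and `CascadeLoopError` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

Emit = Callable[[str], None]


class CascadeLoopError(RuntimeError):
    """Raised when an event would (indirectly) trigger itself."""


class TracingBus:
    """Hypothetical in-process bus that tracks the causation chain."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Emit], None]]] = {}

    def subscribe(self, name: str, handler: Callable[[Emit], None]) -> None:
        self._handlers.setdefault(name, []).append(handler)

    def publish(self, name: str, chain: Tuple[str, ...] = ()) -> None:
        if name in chain:
            raise CascadeLoopError(" -> ".join(chain + (name,)))
        chain = chain + (name,)
        for handler in list(self._handlers.get(name, [])):
            # Handlers publish follow-up events through emit(), which
            # remembers the chain that led to them.
            handler(lambda next_name, c=chain: self.publish(next_name, c))


bus = TracingBus()
bus.subscribe("OrderCreated", lambda emit: emit("StockReserved"))
bus.subscribe("StockReserved", lambda emit: emit("OrderCreated"))

try:
    bus.publish("OrderCreated")
    loop = None
except CascadeLoopError as error:
    loop = str(error)  # the chain that closed the loop
```

It's crude: it also rejects chains where the same event type legitimately shows up twice, and in a distributed app the chain would have to travel with the event itself. Still, it turns a silent loop into a loud error.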