The journeylism of @yreynhout

On CQRS, DDD(D), ES, …

Trench Talk – Authorization

For a long time now, I’ve been dealing with the subject of authorization and its bigger brother, authentication. Not that dealing with these makes me an expert, mind you. But it did arouse my interest in the areas of RBAC (role based access control) and family (hierarchical RBAC, administrative RBAC, attribute based RBAC, …), RAD (resource access decision) from OMG fame, XACML (eXtensible Access Control Markup Language) from OASIS, and a few others I’ve lost track of over the years. Why do we embed authorization into software, though? I can understand the need in military and nuclear facilities, or when company secrets are involved. But beyond that it’s mainly a game of fear, risk, trust and workflow. In general, in end-user facing systems, there are two forces at play: one is the top-down, enterprisy, we-want-control-over-who-sees-or-does-what and the other is the this-is-my-stuff-and-i-say-who-gets-to-see-it-and-manipulate-it. It’s not uncommon to find both in one software system you’re working on. The reason is quite simple: administration overhead. Imagine controlling each and every employee’s role(s), exceptions to such role(s), role transitions over time, etc … and you’ve got 90,000 of ‘em. While some organizations are – believe it or not – willing to pay money to their IT department (in a low-cost country, no doubt) to take care of this task, others have flattened their structure or adopted controlled delegation to overcome this. I guess, a lot depends on “the culture within”. Bottom line, to me, is that fine-grained, top-down control doesn’t scale very well – administratively speaking – and is entangled with topics of trust, risk, and – for those at the bottom – frustration. This is usually the point where we tend to see the “owner-centric” approach to authorization, where “owners” impose who gets to see or manipulate a particular resource or set of resources they own or co-own. If we’re lucky, that is.

Authorization models are fascinating. There have been many attempts at abstracting this domain and treating it as a “separate” problem. All in all we’ve been pretty successful at doing that. Think of all the products out there that incorporate the notion of storing and enforcing authorization “policies”. I’ve always been in environments where I couldn’t use off-the-shelf products for this particular problem. Mostly because it would have meant coupling to a particular technology and the inflexibility/inadequacy they brought with them. So, yes, in case you’re wondering, I’ve reinvented this wheel a couple of times. Why? Because, amongst other reasons, the data to base the access decision on was tightly coupled to the domain at hand. Depending on the required “freshness” of said data with regard to securing a particular resource you will want to design it one way or another.

What I like about Domain Driven Design is that it embraces the idea of entertaining multiple models to solve a particular problem. Of course, with choice comes responsibility, trying things out, failing, heck, even stupidity, but above all learning from those. Being focused and doing small, controlled experiments – verbally at the whiteboard or textual as in a code probe – may mitigate the risk of going down the wrong path. Such endeavors depend on the skill and experience of those involved, IMO. Yet by no means is success guaranteed. Such is life … how depressing. Still, let’s look at the bright side … of life, that is.

Make the implicit explicit

Let’s look at a small example … Imagine a piece of software that allowed one or more people to collaborate on an art design. Below I’ve coded up the use case of inviting somebody to collaborate with somebody else on a particular design.

public class InviteArtDesignCollaboratorHandler : Handles<InviteArtDesignCollaborator>
{
  //invite somebody to collaborate with you on a certain art design.
  public void Handle(InviteArtDesignCollaborator message)
  {
    var inviter = this.personRepository.Get(new PersonId(message.InviterId));
    var invitee = this.personRepository.Get(new PersonId(message.InviteeId));
    var artDesign = this.artDesignRepository.Get(new ArtDesignId(message.ArtDesignId));
    var invitation = artDesign.InviteCollaborator(inviter, invitee);
    this.invitationRepository.Add(invitation);
  }
}

If we zoom in on the authorization aspect of this use case, what do we see in the above code? Nothing, that’s right. Who can invite collaborators to an art design? The original creator of such a design IF and only IF he is still a participant of the collection the art design is part of. Who can be invited to an art design? Somebody who already participates in the collection the art design is part of. Obviously, I’m making this up as I go. Yet, these aren’t uncommon requirements to come across. Let’s see how this manifests itself intermingled with the above code.

public class InviteArtDesignCollaboratorHandler : Handles<InviteArtDesignCollaborator>
{
  //invite somebody to collaborate with you on a certain art design.
  public void Handle(InviteArtDesignCollaborator message)
  {
    var inviter = this.personRepository.Get(new PersonId(message.InviterId));
    var invitee = this.personRepository.Get(new PersonId(message.InviteeId));
    var artDesign = this.artDesignRepository.Get(new ArtDesignId(message.ArtDesignId));
    var collection = this.collectionRepository.Get(artDesign.CollectionId);
    if (!artDesign.WasOriginallyCreatedBy(inviter))
      throw new NotAuthorizedException("The inviter is not the original creator of the art design.");
    if (!collection.HasAsParticipant(inviter))
      throw new NotAuthorizedException("The inviter is not a participant of the collection the art design is part of.");
    if (!collection.HasAsParticipant(invitee))
      throw new NotAuthorizedException("The invitee is not a participant of the collection the art design is part of.");
    var invitation = artDesign.InviteCollaborator(inviter, invitee);
    this.invitationRepository.Add(invitation);
  }
}

What if we were to separate the authorization responsibility from the actual use case? Why would we do that? Does changing the rules about who can be invited and who can invite fundamentally change this use case? Aren't the axes of change different for these two? If yes, then the next piece of code might make more sense. Mind you, it's a matter of preference at this point. Requirements might push it in one direction or the other as to which code "feels natural".

public class InviteArtDesignCollaboratorAuthorizer : Authorizes<InviteArtDesignCollaborator>
{
  //invite somebody to collaborate with you on a certain art design.
  public void Authorize(InviteArtDesignCollaborator message)
  {
    var inviter = this.personRepository.Get(new PersonId(message.InviterId));
    var invitee = this.personRepository.Get(new PersonId(message.InviteeId));
    var artDesign = this.artDesignRepository.Get(new ArtDesignId(message.ArtDesignId));
    var collection = this.collectionRepository.Get(artDesign.CollectionId);
    if (!artDesign.WasOriginallyCreatedBy(inviter))
      throw new NotAuthorizedException("The inviter is not the original creator of the art design.");
    if (!collection.HasAsParticipant(inviter))
      throw new NotAuthorizedException("The inviter is not a participant of the collection the art design is part of.");
    if (!collection.HasAsParticipant(invitee))
      throw new NotAuthorizedException("The invitee is not a participant of the collection the art design is part of.");
  }
}

public class InviteArtDesignCollaboratorHandler : Handles<InviteArtDesignCollaborator>
{
  //invite somebody to collaborate with you on a certain art design.
  public void Handle(InviteArtDesignCollaborator message)
  {
    var inviter = this.personRepository.Get(new PersonId(message.InviterId));
    var invitee = this.personRepository.Get(new PersonId(message.InviteeId));
    var artDesign = this.artDesignRepository.Get(new ArtDesignId(message.ArtDesignId));
    var invitation = artDesign.InviteCollaborator(inviter, invitee);
    this.invitationRepository.Add(invitation);
  }
}

We can now separate the authorization test specifications – who can perform the use case under what circumstances – from the actual use case specifications – what is the side effect of the use case, when can it happen/can’t it happen depending on current state. Is doing all this desirable? I guess it depends on what you are getting out of it.
Fundamentally, those query methods (“WasOriginallyCreatedBy”, “HasAsParticipant”) are using projected and provided state (input) to determine what their boolean return value should be. If we could live with relaxed consistency in this regard, we could even project these values “at another point in time” and use those instead of these query methods. Heck, we’d be able to ditch these query methods from our model altogether. But again, it largely depends on what kind of consistency you are expecting vis-à-vis the aggregate you are affecting. It’s also debatable whether we are dealing with a different model or not. After distillation, would these query methods still be there? Why? Are they essential or circumstantial? These aren’t questions a blog post can answer, I’m afraid. They are highly contextual.
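To make the relaxed-consistency variant a bit more tangible, here is a minimal sketch. Only the message, identifier and exception types come from the example above; the artDesignAuthorizationData read model and its properties are made up for illustration.

public class InviteArtDesignCollaboratorAuthorizer : Authorizes<InviteArtDesignCollaborator>
{
  //authorize against an eventually consistent projection instead of querying the aggregates themselves
  public void Authorize(InviteArtDesignCollaborator message)
  {
    var data = this.artDesignAuthorizationData.Get(new ArtDesignId(message.ArtDesignId));
    if (data.OriginalCreatorId != message.InviterId)
      throw new NotAuthorizedException("The inviter is not the original creator of the art design.");
    if (!data.CollectionParticipantIds.Contains(message.InviterId))
      throw new NotAuthorizedException("The inviter is not a participant of the collection the art design is part of.");
    if (!data.CollectionParticipantIds.Contains(message.InviteeId))
      throw new NotAuthorizedException("The invitee is not a participant of the collection the art design is part of.");
  }
}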

Keep it simple when appropriate

In scenarios with simpler requirements we could choose an approach reminiscent of aspect oriented programming or pipes and filters. Below are a couple of examples. The assumption is that some piece of code will inspect the handler's Handle method, look for the Authorize attribute and instantiate either a decorating handler or a pipeline component to represent the authorization based on the attribute's properties.

public class StartSeasonPortfolioHandler : Handles<StartSeasonPortfolio>
{
  //start a new season portfolio for a subsidiary.
  [Authorize(Role="SubsidiaryAdministrators")]
  public void Handle(StartSeasonPortfolio message)
  {
    var subsidiaryId = new SubsidiaryId(message.SubsidiaryId);
    var subsidiary = this.subsidiaryRepository.Get(subsidiaryId);
    var season = Season.From(subsidiary, message.Season);
    if (this.portfolioRepository.HasPortfolioForSeason(subsidiaryId, season))
      throw new SeasonPortfolioAlreadyStartedException("A portfolio was already started for the specified season of the subsidiary.");
    var portfolio = subsidiary.StartPortfolio(new PortfolioId(message.PortfolioId), new PortfolioName(message.Name), season);
    this.portfolioRepository.Add(portfolio);
  }
}

This example codifies that the caller must be a member of the role 'SubsidiaryAdministrators'. Interestingly enough, you can't really tell who "the caller" is from the above code, now can you? This was less obvious in the previous art design example. The assumption there was that the "inviter" was "the caller". How do you know that for sure, though? It's just data I could have easily spoofed. Alas, authentication – proving the caller's identity – and message integrity – did somebody tamper with this message – are beyond the scope of what I want to discuss here (*). For now, let's assume we know who the ambient caller is.

public class StartSeasonPortfolioHandler : Handles<StartSeasonPortfolio>
{
  //start a new season portfolio for a subsidiary.
  [Authorize(Permission="CanStartSeasonPortfolio")]
  public void Handle(StartSeasonPortfolio message)
  {
    //omitted for brevity - same as above
  }
}

When dealing with fixed or hard-coded roles becomes cumbersome – administratively speaking – the next thing you'll see is a shift in focus towards the permission to perform a certain operation. It's obviously easier, since I could be a member of multiple, dynamic roles that have the CanStartSeasonPortfolio access decision set to allowed. Yet, what if multiple roles had conflicting access decisions? You'd need a strategy to resolve such issues. In finance, accounting, health care and military environments you may even encounter separation of duty.
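One possible strategy – sketched below with made-up types, not code from the examples in this post – is deny-overrides: if any of the caller's roles explicitly denies the permission, the answer is no; otherwise at least one role has to allow it.

public enum AccessDecision { NotSet, Allowed, Denied }

public static class AccessDecisionResolver
{
  //deny-overrides: an explicit deny from any role wins, otherwise at least one explicit allow is required
  public static bool Resolve(IEnumerable<AccessDecision> decisionsPerRole)
  {
    var decisions = decisionsPerRole.ToList();
    if (decisions.Contains(AccessDecision.Denied))
      return false;
    return decisions.Contains(AccessDecision.Allowed);
  }
}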

public class StartSeasonPortfolioHandler : Handles<StartSeasonPortfolio>
{
  //start a new season portfolio for a subsidiary.
  [Authorize]
  public void Handle(StartSeasonPortfolio message)
  {
    //omitted for brevity - same as above
  }
}

Command messages could prove to be natural authorization boundaries. In such a case, the name of the message translates to a particular permission, which is a very conventional way of modeling authorization. If only it were always this simple :-)

One thing that is very different from the art design example is that the notion of authorization wasn’t very coupled to the model we were dealing with. It felt somewhat more natural to tuck it away behind an attribute. There was no urge to consider it part of the model that dealt with the use case itself. Why is that? I think because no decisions were made based on state that was part of the model.

(*) If you've been infatuated with OAuth, OpenID, Kerberos, PKI, etc. in the past, I doubt I could bring anything new to the table.
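For completeness, here is a minimal sketch of the kind of infrastructure the attribute-based examples above assume: a decorating handler that enforces role membership before delegating to the real handler. The shape of the AuthorizeAttribute and the use of IPrincipal are my assumptions, not code from this post.

public class AuthorizeAttribute : Attribute
{
  public string Role { get; set; }
  public string Permission { get; set; }
}

public class AuthorizingHandler<TMessage> : Handles<TMessage>
{
  readonly Handles<TMessage> inner;
  readonly AuthorizeAttribute attribute;
  readonly IPrincipal caller; //System.Security.Principal

  public AuthorizingHandler(Handles<TMessage> inner, AuthorizeAttribute attribute, IPrincipal caller)
  {
    this.inner = inner;
    this.attribute = attribute;
    this.caller = caller;
  }

  public void Handle(TMessage message)
  {
    //enforce role membership (when specified) before the actual use case runs
    if (!string.IsNullOrEmpty(this.attribute.Role) && !this.caller.IsInRole(this.attribute.Role))
      throw new NotAuthorizedException("The caller is not a member of the required role.");
    //a permission based check would consult the caller's permissions in a similar fashion
    this.inner.Handle(message);
  }
}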

Not just for writing

Up until now we've been very focused on preventing callers from invoking behavior they are not allowed to. However formidable that may be, I often find that's only half of the story. When dealing with users or automated services, we want to guide them towards the happy path, no? We want to tell them: to the best of my knowledge, at the point in time you're asking me, these are the state transitions you are allowed to perform. That's when it hit me. The same authorization decisions you're performing on the write side are also required on the read side, albeit in a form that could be very different. Obviously, we're dealing with more latency here. As soon as you've computed or queried the access decision, it could be modified behind your back. It may have to travel across a network, it may have to manifest itself in a user interface (e.g. enable or disable a button/link) or await a user's decision, it may manifest itself in an automated service choosing an alternate path through its logic. It's a hint/clue, at best. Yet, often end users are quite content with these decisions.
That said, to me the key differentiator is the fact that permissions are not limited to state changing behavior. Whether or not you are allowed to view a particular piece of information is something that usually only manifests itself on the read side. Also note that the read side is often less harsh. It doesn’t throw an exception at you, rather it filters or hides information from you. Often I find that you’ll have a mixture of view and behavior permissions, especially on the read side.
Below you’ll find a very technology specific example of how that might manifest itself in a web api.

[RoutePrefix("portfolios")]
public class PortfolioResourceController : ApiController
{
  [Route("{id}")]
  public IHttpActionResult Get(string id)
  {
    var identity = new ClaimsIdentity(RequestContext.Principal.Identity);
    return this.portfolioQueries.
      ById(id).
      Select(_ => (IHttpActionResult)Ok(_.CompleteAuthorization(identity, this.Url))).
      DefaultIfEmpty(NotFound()).
      Single();
  }
}

public class PortfolioResource 
{
  public PortfolioResource CompleteAuthorization(ClaimsIdentity identity, UrlHelper helper)
  {
    var links = new List<Link>();
    //Note: CanEdit|Delete|ViewPortfolioItems are extension methods on ClaimsIdentity that, internally, 
    //      use claims to answer the authorization requests in a similar way as how the
    //      write side would ask them.
    if(identity.CanEditPortfolio(Id)) 
      links.Add(new Link { Rel = "edit", Href = helper.GetLink<PortfolioResourceController>(_ => _.Put(Id)) });
    if(identity.CanDeletePortfolio(Id)) 
      links.Add(new Link { Rel = "delete", Href = helper.GetLink<PortfolioResourceController>(_ => _.Delete(Id)) });
    if(identity.CanViewPortfolioItems(Id)) 
      links.Add(new Link { Rel = "items", Href = helper.GetLink<PortfolioResourceController>(_ => _.GetItems(Id)) });

    return new PortfolioResource(Id, Name, Season, SubsidiaryId, links.ToArray());
  }
}

What's important to realize is that we'll probably start looking for similarity and deduplication (dare I say DRY) of the same logic that happens to be invoked on the write and read side, and trying to find a common home for it. Over the past months I saw an entire model emerge to deal with this and something as difficult as the art design example. Again, I'm pretty sure this isn't the only way I could've modeled it, but for now there were strong indicators this was a viable option. Below is another, more convoluted – it's the last, I promise – example of what I've come up with.

public class PortfolioAccessPolicy
{
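  //Note: RolePermissionSet and StarterId are presumably state this policy was loaded or projected
  //      with (e.g. via the policyRepository used on the write side below); they're not shown here.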
  public bool CanEdit(Subject subject)
  {
    return 
      subject.CanEditPortfolio(RolePermissionSet) && 
      subject.IsStarterOfPortfolio(StarterId);
  }

  public bool CanDelete(Subject subject)
  {
    return 
      subject.CanDeletePortfolio(RolePermissionSet) && 
      subject.IsStarterOfPortfolio(StarterId);
  }

  public bool CanViewItems(Subject subject)
  {
    return 
      subject.CanViewPortfolioItems(RolePermissionSet) ||
      subject.IsPortfolioSupervisor();
  }
}

//Usage on the read side
public class PortfolioResource 
{
  public PortfolioResource CompleteAuthorization(Subject subject, PortfolioAccessPolicy policy, UrlHelper helper)
  {
    var links = new List<Link>();
    if(policy.CanEdit(subject)) 
      links.Add(new Link { Rel = "edit", Href = helper.GetLink<PortfolioResourceController>(_ => _.Put(Id)) });
    if(policy.CanDelete(subject)) 
      links.Add(new Link { Rel = "delete", Href = helper.GetLink<PortfolioResourceController>(_ => _.Delete(Id)) });
    if(policy.CanViewItems(subject)) 
      links.Add(new Link { Rel = "items", Href = helper.GetLink<PortfolioResourceController>(_ => _.GetItems(Id)) });

    return new PortfolioResource(Id, Name, Season, SubsidiaryId, links.ToArray());
  }
}

//Usage on the write side
public class EditPortfolioAuthorizer : Authorizes<EditPortfolio>
{
  public void Authorize(Subject subject, EditPortfolio message)
  {
    var policy = this.policyRepository.Get(new PortfolioId(message.PortfolioId));
    if(!policy.CanEdit(subject))
      throw new NotAuthorizedException("The caller is not authorized to edit this portfolio.");
  }
}

Conclusion

My only hope in writing this down is that it will make you think about how you’re modeling authorization and the realization that there is no one way of modeling it. Whether you’re rolling your own or dealing with something off the shelf, know that your requirements will dictate a lot. If you’re in an environment where reads and writes are separated but crave some commonality, know there are ways to share. I do not pretend to have all the answers, merely sharing what I’ve learned so far and what worked for me.

Suppository

Today I read Rob Conery's piece entitled "Repositories On Top UnitOfWork Are Not a Good Idea". While I respect Rob's opinion and can even relate to the problems he touches upon, it made me somewhat sad as well. Why? Because it's really telling what kind of problems people are solving using this and, presumably, other Domain Driven Design (DDD) tactical patterns. They are going through the motions, but they're not getting much benefit from that investment. That's probably why I've read so many blog posts stabbing the repository to death, with most of them seeking salvation in something closer to the metal they cherish. It's safe to say I have a different view and opinion – what did you expect – about the benefits of the repository pattern.

If I had to summarize Domain Driven Design using 3 nouns, it’d be domain, model and language. Obviously, that doesn’t do justice to what DDD is all about, but it’s a large portion of its meat. Sadly, those 3 simple words are perceived as too abstract by many (I know this for a fact). No, I’m not going to explain them to you. There are enough Eric Evans videos going around for that purpose.
I do not tend to talk about them explicitly. But there are times, when discussing things as a team, that I can point them out to you. They are, along with many other things, fundamental to doing DDD. The reason I bring them up is that many people have forgotten what the role and place of the repository is. A repository has its place in the toolbox used to carve out a model in code. It lets you deal with many issues, yet its shape won't always be perfect or fit an actual textbook definition. Most often it's just a seam giving you collection-like semantics. Sure, it's handy when you can replace that seam for testing purposes, but that's not its main motivation for being there. The needs of and the language in the model are what drive its api. Sometimes it's used to help insulate the model from pesky technical details, horrendous legacy code or some form of elaborate translation of language we don't want to pollute our model with. These aren't enjoyable situations to be in, but a repository might be good enough to get the project moving on the model side and the integration side.
Collection oriented repositories aren't the only kind. You can craft persistence oriented repositories as well, but you'll have to be a good engineer so as not to shoot yourself in the foot using them. Rob rightfully points out the transaction woes you can get into with those if you don't watch out.
Another misconception is that repositories are somehow about views. Once you start using them for that purpose, you're embracing a world of pain. Ultimately, that's one of the reasons you will want to combine CQ(R)S with DDD. Again, there are enough Greg Young videos going around for the finer details on that. I guess this is the point where I actually agree with Rob's observations. So don't do that. Views are hardly about solving a complex problem for which you'll want to cultivate a language, flesh out a model over many iterations, have lots of communication and exploration.

The reason I position repositories this way is that I'm a model first guy (and no, not the Entity Framework kind). I care deeply about the things I develop from the perspective of the domain they serve. I care deeply about technical constraints, usability, performance, stability, and scalability of the software I build. It's the journey of finding that balance between functional and non-functional requirements that is challenging and will make me bend the rules if need be. Decisions in this realm will influence what role a repository has to play, if any, in a piece of software and how it gets implemented. And yes, there are alternate – non DDD – ways of building software, which I'm perfectly okay with. It's just not the de facto way I roll.

Summary?

Put repositories inside the model, give them an api that reflects the language of the model, use them to insulate your model from the bad and the ugly, and above all, don’t use them to feed your views.

As an aside: My friend Jef Claes already gave his point of view a few months ago, which relates to mine.

Trench Talk: Visual DSLs – the need for communication

As I’m slowly unwinding from a 5 year period of working on “something” in the area of electronic scheduling, I figured I’d share some of the experiences in modeling.

A word about domain experts

Domain experts, you know, the people that work in a particular domain, come in all shapes and sizes. We may not want to admit it, but it is rare to find a good domain expert. Maybe you do interact with a domain expert directly. If so, good for you. Maybe a Business Analyst (BA) is the closest you as a developer get to a domain expert. Does that mean we can't practice domain driven design in that situation? Perhaps … to me this is mainly about interaction with people and how well that BA plays the role of "enabler" for us developers to get an accurate "enough" understanding of the domain, instead of "proxy with loss in translation". The worst kind of domain expert proxy is probably a former techie. They have a tendency to bring the technical stuff into the conversation and don't mind telling you how to do your job. You can imagine how much that rubs me the wrong way. But often that's just the situation you are in. Are we still able to practice domain driven design at that point? I'm inclined to say no. It's a Don Quixote fight … deeper insight and breakthroughs are but a distant dream. That doesn't mean there is zero value to be reaped in that situation, but still, don't go shouting you're doing domain driven design. There are circumstances in which a domain expert is somebody very knowledgeable about the domain but not necessarily working in the domain. Especially if you're building a product for a large number of customers, this person is probably in a better position than each one of those customers individually to see the bigger picture that aligns with the product "vision". Simply put, they have most, if not all, of the angles on why a feature is required.
This brings me to the most important question you need to ask domain experts: "Why?". Why reveals either the reason or purpose something is the way it is, or it illustrates the relationship between two events. It's a fundamental building block in knowledge crunching and the distillation process. Don't ask me why, but for me to build the right thing it's of the utmost importance that I understand things first. Why is my shovel when digging for axioms. And like anything in business, those axioms might shift and mutate over time.

Communication

It's very easy for conversations with people knowledgeable about the domain to go technical, to focus on the wrong area, to focus on the minute details instead of the bigger picture or vice versa, to use abstract terms where you think you get it but actually you don't, to use an unwritten glossary without a sense of meaning or terminology deduplication, … How do you deal with that? You make things explicit. Boundaries, expectations, glossary, abstractions understood by all, to name a few. To use some of Eric Evans' words: You cultivate a language, together (Ingredients of Effective Modeling, p12, Chapter 1: Crunching knowledge).
Yet words can only take us so far. Salvation, for me, comes from the cross-fertilization of the written/spoken word with visualization techniques. This should sound familiar. It’s the thing you do when you’re standing in front of a white board and you draw a little and you talk a little and you draw a little more, you scratch something, you engage in a discussion but try to stay focused, you simplify, you explore an alternate design, etc.

If you don’t want it to turn into a one-man show, just hand everybody a marker. There, solved it.

Over time a remarkable thing might start to happen: you’re drawing the same thing over and over again. Probably not exactly the same, sometimes only a particular part stays the same, sometimes it looks like you could decompose it into a bunch of reusable building blocks. I doubt many will notice unless they’re very reflective about what they do. Whenever this happens, it might be time to bring on the creativity and craft yourself a tangible, visual domain specific language (DSL).

A visual DSL

Although building a tangible, visual DSL is not a particularly costly undertaking, I do think there should be some return on investment. It’s waste when you can’t put it to good use, no? So, what do you need to create one? If we take it back to basics, often some paper or cardboard, a pair of scissors and some glue will do. You can get way more sophisticated than that: printable stencils, shapes emitted from your 3D printer, legos, … your imagination is the limit, as long as it doesn’t become about the tool rather than what it enables, i.e. better communication. Like with many things, experimentation is key. Let me whip up a little sample of what that all looked like.

An example

During my conversations with domain experts, I noticed I drew the notion of a timeline over and over again. Now, the timeline itself, below reduced to the scope of a day, was not the center piece of the conversation. It was more like a context or setting in which the conversation took place. The meaningful scenarios would all take place inside this virtual box.
[Image: Timeline]
Obviously, the conversation would revolve around appointments – when, where, why, and how they were scheduled and who had scheduled them. Depending on what aspect or feature we were focusing on, we'd make the appointments themselves anonymous and just deal with their start time and duration, as shown below. The fact that these were just tangible pieces of paper made it "inviting" for all parties involved to start shuffling them around to mimic certain real life scenarios. If a certain appointment duration wasn't available yet, paper and scissors came to the rescue.
[Photo: IMG_3870]
Now, we weren't just dealing with appointments, we'd have blocked periods and unavailabilities as well(*). To make that difference more explicit, I'd use colour overlays as shown below, although coloured paper would've had the same effect. Adding a little legend – and sticking to it – made it easy for us all to have a conversation – now and in the future – about real life scenarios that involved these different concepts.

(*): don’t worry if you don’t get the difference, it’s domain specific anyway.

[Photo: IMG_3872]
These little cards – representing different concepts – were inviting to write on. It allowed us to get to a higher level of abstraction with the scenarios we were visualizing because we could capture more “language” in less space (compared to written sentences).
[Photo: IMG_3876]

In retrospect

The most useful aspect, by far, was that you could create photo stills of before and after situations – I doubt there's a faster way to document scenarios. Gradually, scenarios evolved into "given, when, then", which meshed well with how we were going to test them using code. Still, you had to be there to get what the scenarios were about. It wasn't a substitute for communication. It was an enabler of efficient, terse, to the point conversation. When in doubt about a certain scenario, I could pull out a deck of cards and a piece of paper and I'd have my answer within minutes, not hours or days. I guess, to me, at the time, that's where the value was: in developing a common language, aided by a useful model that kept the conversation focused. Over time, some of these visual DSLs would fade, having proved their usefulness when a feature was complete. That's okay, it's a tool, not an end.

Trench Talk: Assert.That(We.Understand());

After having written 2000+ eventsourcing-meets-aggregates specific given-when-then test specifications(*), you can imagine I started to notice both “problems” and “patterns”. Here’s a small overview …

(*) If you don’t know what those are, here’s a 50 minutes refresher for you: http://skillsmatter.com/podcast/design-architecture/talk-from-greg-young … no TL;DW;

Variations along the given and when axis

Ever written a set of tests where the when stays the same but the givens vary for each test case? Why is that? What are those tests communicating at that point? They’re testing the same behavior (and its outcome) over and over again, but each time the SUT is in a (potentially) different state. With these tests, one particular behavior is put in the spotlight.
When you are exploring scenarios – hopefully using a tangible, visual DSL and together with someone knowledgeable about the problem domain – you can uncover these variations. How? By gently steering the conversation in a direction where you keep asking what the outcome is of a certain behavior, each time changing the preconditions.

Similarly, ever written a set of tests where the givens stay the same but the when varies for each test case? Why is that? They’re testing different behavior, but each time the SUT is in the same state. From what I’ve seen, these tests are focusing on a particular state in the lifecycle of the SUT and constrain the behavior that can happen at that point in time.
Again, in your exploratory conversations you can uncover these cases by focusing on a certain state and asking what the outcome of each behavior and/or behavior variation would be.

Obviously, conversations don’t always go the way you want them to, but I just wanted to point out the importance of language, and how listening attentively and asking the right questions enables you to determine appropriate input so you can assert that you’ve covered most – if not all – variations.

Verbosity

Writing these tests, verbosity strikes in odd ways. It often becomes difficult for readers of the codified test specification to differentiate between the essential and secondary givens or thens. Both the “Extract method” refactoring and the use of test data builders help a lot in reducing that verbosity and in bringing back the readability of the test. There’s a parallel to this when generating printable/readable output based off of these test specifications for business people. Often this business oriented reader will not care for a lot of the minute details that are in the givens, when or thens. No, what he wants is a narrative that describes the specific scenario, emphasizing the important ingredients. So, how do we get that? Custom formatting. Not by overriding .ToString() on your messages, although you could still do that for other, more verbose purposes, but by associating a specific “narrative writer” with your message or scenario. I acknowledge this is pretty niche, but in my opinion it’s important if you want to stick to executable specifications and not revert to narrative on one side, code on the other side. High coupling is a desired property in this area.
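As an illustration of what such a "narrative writer" could look like – the interface and its shape are entirely my own assumption, reusing the invite message from the authorization example earlier in this document – consider something along these lines.

public interface IWriteNarrativeFor<TMessage>
{
  string Write(TMessage message);
}

public class InviteArtDesignCollaboratorNarrativeWriter : IWriteNarrativeFor<InviteArtDesignCollaborator>
{
  //turns the raw message into the kind of sentence a business oriented reader cares about
  public string Write(InviteArtDesignCollaborator message)
  {
    return string.Format(
      "The original creator ({0}) invites a fellow participant ({1}) to collaborate on art design {2}.",
      message.InviterId, message.InviteeId, message.ArtDesignId);
  }
}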

Test data duplication

This may sound like it's the same as verbosity, but it isn't. It's related, yes, but not the same thing. I started noticing the same data(**) being set up across a set of tests. Not a random set of tests, but a set of tests that focused on a certain feature, covered by one or more use cases. Basically, events and commands that were used in conjunction, with data that flowed from the givens, over the when, into the thens. Obviously, this is to be expected since it's the very nature of why one writes these tests. Now, you'd think that test data builders solve this problem. I'm inclined to say yes, if you allow them to be authored for a set of tests. That's a convoluted way of saying there are no general purpose test data builders that will work in each and every scenario. Now, you could move the variation that exists into those tests, but then you'd notice the duplication and, frankly, the verbosity again. So, there's some "thing" sitting in between the test data builders and that set of tests. I call it the Test Model, abbreviated to just Model. It captures the data, the events and commands, using methods and properties as I see fit, and is used in each of those tests. Some tests may put forward mild variations, either inline or as explicit methods, of existing events and/or commands, but that's okay. My gut tells me I'm not at the end of the line here, but it's as far as I've gotten.

(**) I might be using data and messages interchangeably. I blame the wine.
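A minimal sketch of what such a Test Model could look like, borrowing the invite message from the authorization example earlier in this document and assuming it can be constructed with an object initializer.

public static class ArtDesignTestModel
{
  //the data that keeps flowing from the givens, over the when, into the thens for this feature
  public static readonly Guid CollectionId = Guid.NewGuid();
  public static readonly Guid ArtDesignId = Guid.NewGuid();
  public static readonly Guid InviterId = Guid.NewGuid();
  public static readonly Guid InviteeId = Guid.NewGuid();

  public static InviteArtDesignCollaborator InviteCollaborator()
  {
    return new InviteArtDesignCollaborator
    {
      ArtDesignId = ArtDesignId,
      InviterId = InviterId,
      InviteeId = InviteeId
    };
  }
}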

Isolation

This must be my favorite advantage of writing test specifications this way: the safety net that allows me to restructure things on the inside without my tests breaking, because there is no coupling to an actual type. Refactoring on the inside does not affect any of the tests I've written using messages. I cherish this freedom enormously. Does that mean I don't write any unit tests for the objects on the inside? No, but I consider those to be more brittle. That's not a problem per se, that's just something you need to be aware of. Now, how far does that safety net stretch, you might wonder? As long as you don't change the structure nor the semantics of the messages, I'd say pretty far. Again, these messages are capturing your assumptions and understanding, your conversation with that person that knows a thing or two about the business, so you'd better listen well, get them right and save yourself from a few refuctors.

Dependencies

Although I haven't found much use for dependencies/services I needed to use on the inside, when I did, I found that this way of testing can work in full conjunction with mocking, stubbing, and/or faking. Why? Well, these dependencies are mostly important to the execution part of a specification, not its declaration. You can freely pass them along with the specification to the executor of the specification. The executor is responsible for making them available at the right place and the right time (***).

(***) Remind me again what an inversion of control container does? ;-)

Pipeline

Because the execution of test specifications is decoupled from their declaration, it's well suited to make sure you're not testing structurally invalid commands (or events if you want to go that far). Before executing your when, how hard would it be to validate that the command you're executing is actually structurally valid? If your command's constructor takes care of that, you won't even be able to complete the declaration (****). If you've made some other piece of code responsible for validating that type of command, then that's what is well suited to hook into the test specification execution pipeline. The pipeline is also suited to print out a readable form of the test specification as it is being executed.

(****) I have my reservations vis-à-vis said approach, but to each his own.

Tests are data

At the end of the tunnel, this is the nirvana you reach. Why write these test specifications by hand when you can leverage that battalion of acceptance testers and end-users? Give them the tooling to record their scenarios. Sure, there’s still value in capturing those initial conversations and those initial scenarios, but think of all the real world variations that end-users are generating on a day to day basis. To me, this is not some wild dream, this is the path forward, albeit littered with a few maintainability obstacles in my mind, but nothing I’m not willing to cope with.
On a smaller scale, I found that if you leverage the test case generation features of your unit testing framework, you can already take baby steps in this direction. Think of the variations above. How hard would it be to consider either the set of givens or whens as nothing more than a bunch of test case data you feed into the same test? Think about how many lines of duplicate code that would save.
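Here is a minimal sketch of what that could look like with NUnit's TestCaseSource – the framework choice and the shape of the givens are assumptions on my part.

public class SameWhenDifferentGivens
{
  //each entry is one set of givens; the when and the thens stay the same across all of them
  static readonly object[] GivenSets =
  {
    new object[] { new string[0] },
    new object[] { new[] { "CollaboratorAlreadyInvited" } }
  };

  [TestCaseSource("GivenSets")]
  public void inviting_a_collaborator(string[] givens)
  {
    //replay the givens, execute the one when, assert the expected thens
  }
}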

Conclusion

So there you have it, an experience report from the trenches. Overall, I’m very “content” having written tests this way. I’ve made a lot of mistakes along the way, but that was to be expected. It should come as no surprise that many of these hard learned lessons defined the shape of AggregateSource and AggregateSource.Testing.

Change is good

After more than 13 years my journey at UltraGenda has come to an end. I now know more than enough about scheduling in healthcare, building products not just projects, the importance of being part of an ecosystem as a product and not some little island, how to analyse problems, how to explore various designs, document those using volatile means and create working software off of those, making choices and trade offs along the way. But more importantly, I've learned to communicate the fruit of my brain, how to distinguish the many faces of change, how not everything is a software problem but sometimes a mentality, people or operations problem, how software often reflects the team that created it, why skills matter, both soft and hard, how I admire ambitious human beings. That and many other things … buy me a beer and I'll tell you more.
Grateful is what I am … so Yoda-esque yet so true. Companies need to be enablers to bring out the best in their employees. Employees need to spot and seize opportunities yet be loyal. Symbiosis. I'm pretty sure that "being in the right environment" is the reason why I am who I am today, professionally speaking. Technology had little to do with it, really. An interesting domain that needed to grow on me, a great ambiance among colleagues, the liberty to evolve are what kept me hooked for so long. So here's saying thank you for all that.
As company takeovers took place and the dust clouds surrounding such events settled, I slowly but surely started to lose some of my connection & identity. I'm sure many of you know what that feels like, when the corporate landscape changes. Still, there's an awesome, busy bunch hard at work within those walls. They keep on producing kickass products and deliver topnotch support and service to an ever-growing customer base.

However, it’s time for me to spread my wings. From January 2014 on I’ll be working as an independent consultant/software moulder for as long as I can make a living off of it. BitTacklr is the name I’ll be trading under. Obviously there’s a website that goes by that name, as well as a twitter account. If you want to contact me for work, just drop me an email. My schedule is pretty full at the moment though (Q3/Q4 2014 earliest availability) ;-)

Event Enrichment

Event or more generally message enrichment, the act of adding information to a message, comes in many shapes and sizes. Below I scribbled down some of my thoughts on the subject.

Metadata

Metadata, sometimes referred to as out of band data, is typically represented separately from the rest of the payload. Reusing the underlying protocol/service is the sane thing to do. HTTP, SOAP, Amazon AWS/Windows Azure API calls, Event Store‘s event metadata, etc … are just a few examples of how metadata could be carried around next to the payload. Other examples are explicit envelopes or message wrappers, such as the Transmission Wrapper and the Intermediate Control Act Wrapper in the HL7 V3 specification.
From a code perspective (and from my experience) the addition of metadata tends to happen in one place. Close to the metal boundary if you will, e.g. http handlers, message listeners, application services, etc …

Separate concerns

Sometimes you need to augment a message with an extra piece of data, but it feels more natural to make another piece of code responsible for adding that data. A typical example I can think of is things that have a name but for which you only have an identifier to start with. In an event-sourced domain model(*), think of an aggregate that holds a soft reference (an identifier if you will) to another aggregate, but the event – about to be produced – would benefit from including the name of the referenced aggregate at that time. Many events may be in the same boat. Having a dedicated piece of code doing the enrichment could make things more explicit in your code.
Other times you may have a model that is computationally intensive, and adding the extra data would result in suboptimal data access. How so? Well, each operation would require say a query, possibly querying the same data over and over again. While caching could mitigate some of this, having another piece of code do the enrichment could allow you to deal with this situation more effectively (e.g. batching). Not only that, but it could also make it very explicit what falls into the category of enrichment and what doesn't. This nuance may become even more important when you're working as a team.
Sometimes the internal shape of your event may not be the external shape of your event, meaning they may need less, more, or even different data for various consumers. Whether such a transformation really is an enrichment is debatable. But assuming it is, it’s just another one of those situations where enrichment makes sense.
An advantage of doing enrichment separately is that certain things never enter your model. As an example, an aggregate in an event-sourced domain model, does it really need the version it is at or its identity(**)? Could I not tack that information onto the related events as I’m about to persist them? Sure I can. No need to drag(**) those things into my model.
The enrichment could be implemented using an explicit event enricher that runs either synchronously in your event producing pipeline or asynchronously, depending on what makes the most sense. Using event builders has proven to be a killer combo.

public class BeerNameEnricher : 
  IEnrich<BeerRecipeDescribed>, 
  IEnrich<BeerRecipeAltered> {

  Dictionary<string, string> _lookupNameOfBeerUsingId;

  public BeerNameEnricher(Dictionary<string, string> lookupNameOfBeerUsingId) {
    _lookupNameOfBeerUsingId = lookupNameOfBeerUsingId;
  }

  public BeerRecipeDescribed Enrich(BeerRecipeDescribed @event) {
    return @event.UsingBeerNamed(_lookupNameOfBeerUsingId[@event.BeerId]);
  }

  public BeerRecipeAltered Enrich(BeerRecipeAltered @event) {
    return @event.UsingBeerNamed(_lookupNameOfBeerUsingId[@event.BeerId]);
  }
}
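The IEnrich<T> contract the enricher above implements isn't shown; presumably it looks something like the sketch below, applied close to where the produced events get persisted.

public interface IEnrich<TEvent>
{
  TEvent Enrich(TEvent @event);
}

//e.g. just before appending to the stream:
//var enriched = enricher.Enrich(describedEvent);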

(*): Familiarity with Domain Driven Design is assumed.
(**): Not every situation warrants this.

EventBuilders – Revisited

Introduction

I've been using event builders for some time now. With time and practice comes experience (at least that's the plan), both good and bad. Investing in event or – more generally – message builders means first and foremost investing in language. If you're not willing to put in that effort, don't bother using them, at least not for the purpose of spreading them around your entire codebase. That piece of distilled wisdom applies equally to messages themselves, as far as I'm concerned. Builders are appealing because, if you do make the investment in embedding proper language, they can make your code a bit more readable. Granted, in a language like C# (at least the recent versions), that might have become less of a concern since things like object initializers and named arguments can go a long way to improve readability.

Builders are very similar to Test Data Builders. In fact, if you’re writing test specifications that use some sort of Given-When-Then syntax, builders can be useful to take the verbosity out of those specifications. You can tuck away test-specific, pre-initialized builders behind properties or behind test-suite specific classes. A lesser known feature of builders (especially the mutable kind – more about that below) is that you can pass them around, getting them to act as data collectors, because the state they need might not be available at their construction site(*). A well crafted model might rub this even more in your face (think about all the data in your value objects, entities, etc …). If you take a more functional approach – the fashionable thing to do these days – to passing them around, you can use immutable builders and hand back a new copy with the freshly collected values.

Messages, as in their representation in code, go hand in hand with serialization. There are many ways of doing serialization, with hand rolled, code generated and reflection based being the predominant ones. Sometimes the serialization library you depend upon comes with its own quirks. At that point, builders could be useful to insulate the rest of your code from having to know about those quirks or how to handle them. Whether you really need another “abstraction” sitting in between is debatable.

Composition is often overlooked when defining messages, resulting in flat, dictionary-like data dumpsters. Yet both JSON and XML – probably the most predominant textual serialization formats – by their very nature allow information to be defined in a hierarchical way, ergo composing messages from smaller bits of information. Not that I particularly believe that doing so is tied to the chosen serialization format. This is another area where builders can help, since they could be conceived around these highly cohesive bits of information, at least if your model has a similar shape (not that it has to). Message builders could then leverage message part builders, or code could use message part builders to feed the proper data into message builders. This ties in nicely with the builder being a data collector.

(*) Construction site, i.e. the place where they are created.
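A minimal sketch of the composition idea above – every type below is made up for illustration – with a message part builder (for a period) feeding a message builder.

public class Period
{
  public readonly DateTime Start;
  public readonly DateTime End;
  public Period(DateTime start, DateTime end) { Start = start; End = end; }
}

public class PeriodBuilder
{
  DateTime _start;
  DateTime _end;
  public PeriodBuilder StartingAt(DateTime value) { _start = value; return this; }
  public PeriodBuilder EndingAt(DateTime value) { _end = value; return this; }
  public Period Build() { return new Period(_start, _end); }
}

public class AppointmentScheduled
{
  public readonly Guid AppointmentId;
  public readonly Period Period;
  public AppointmentScheduled(Guid appointmentId, Period period) { AppointmentId = appointmentId; Period = period; }
}

public class AppointmentScheduledBuilder
{
  Guid _appointmentId = Guid.Empty;
  Period _period;
  public AppointmentScheduledBuilder IdentifiedBy(Guid value) { _appointmentId = value; return this; }
  //the cohesive "period" bit has its own builder; its product gets fed into the message builder
  public AppointmentScheduledBuilder During(Period value) { _period = value; return this; }
  public AppointmentScheduled Build() { return new AppointmentScheduled(_appointmentId, _period); }
}

//Usage:
//  var scheduled = new AppointmentScheduledBuilder().
//    IdentifiedBy(Guid.NewGuid()).
//    During(new PeriodBuilder().StartingAt(start).EndingAt(end).Build()).
//    Build();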

A world of flavors

Below I’ll briefly glance over various flavors I’ve used myself or seen floating around. This post is a bit heavy on the (C#) code side of things and repetitive (on purpose, I might add). For demonstration purposes, I’ve shamelessly stolen and mutated a bit of code from The CQRS Journey.

Flavor 1 – Mutable event with getters and setters – no builder

namespace Flavor1_MutableEvent_GettersNSetters_NoBuilder {
  public class SeatTypeCreated {
    public Guid ConferenceId { get; set; }
    public Guid SeatTypeId { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int Quantity { get; set; }

    // Optional ctor (if you're not happy with the datatype defaults)
    public SeatTypeCreated() {
      ConferenceId = Guid.Empty;
      SeatTypeId = Guid.Empty;
      Name = string.Empty;
      Description = string.Empty;
      Quantity = 0;
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", Name, SeatTypeId, ConferenceId, Description, Quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreated {
        ConferenceId = Guid.NewGuid(),
        SeatTypeId = Guid.NewGuid(),
        Name = "Terrace Level",
        Description = "Luxurious, bubblegum stain free seats.",
        Quantity = 25
      };

      Console.WriteLine(_);

      _.Quantity = 35;

      Console.WriteLine(_);
    }
  }
}

This is what I call the I'm in a hurry version where you don't see nor feel the need to have builders and you're not particularly worried about events/messages getting mutated. It's still pretty descriptive due to the object initializers. There's a time and place for everything.

"Mutable messages are an anti-pattern. They are the path to a system that is held together with duct tape and bubble gum." – Greg Young anno 2008.

I don’t know if Greg still feels as strongly about it today. You’ll have to ask him. Great quoting material is all I can say.

Flavor 2 – Immutable event with getters and readonly fields – no builder

namespace Flavor2_ImmutableEvent_GettersNReadOnlyFields_NoBuilder {
  public class SeatTypeCreated {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    public Guid ConferenceId {
      get { return _conferenceId; }
    }

    public Guid SeatTypeId {
      get { return _seatTypeId; }
    }

    public string Name {
      get { return _name; }
    }

    public string Description {
      get { return _description; }
    }

    public int Quantity {
      get { return _quantity; }
    }

    public SeatTypeCreated(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", _name, _seatTypeId, _conferenceId, _description, _quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreated(
        conferenceId: Guid.NewGuid(),
        seatTypeId: Guid.NewGuid(),
        name: "Terrace Level",
        description: "Luxurious, bubblegum stain free seats.",
        quantity: 25
      );

      Console.WriteLine(_);

      var __ = new SeatTypeCreated(_.ConferenceId, _.SeatTypeId, _.Name, _.Description, 35);

      Console.WriteLine(__);
    }
  }
}

This is the immutable companion to the previous one. Named arguments cater for the readability in this one. A bit heavy on the typing if you want to mutate the event. It also implies you collect ALL information before you’re able to construct it.

Flavor 3 – Mutable event with getters and setters – implicit builder

namespace Flavor3_MutableEvent_GettersNSetters_ImplicitBuilder {
  public class SeatTypeCreated {
    public Guid ConferenceId { get; set; }
    public Guid SeatTypeId { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int Quantity { get; set; }

    public SeatTypeCreated AtConference(Guid identifier) {
      ConferenceId = identifier;
      return this;
    }
    
    public SeatTypeCreated IdentifiedBy(Guid identifier) {
      SeatTypeId = identifier;
      return this;
    }
    
    public SeatTypeCreated Named(string value) {
      Name = value;
      return this;
    }
    
    public SeatTypeCreated DescribedAs(string value) {
      Description = value;
      return this;
    }
    
    public SeatTypeCreated WithInitialQuantity(int value) {
      Quantity = value;
      return this;
    }

    // Optional ctor (if you're not happy with the datatype defaults)
    public SeatTypeCreated() {
      ConferenceId = Guid.Empty;
      SeatTypeId = Guid.Empty;
      Name = string.Empty;
      Description = string.Empty;
      Quantity = 0;
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", Name, SeatTypeId, ConferenceId, Description, Quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreated {
          Quantity = 35 //does not do what you expect ...
        }.
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25);

      Console.WriteLine(_);

      _.Quantity = 35;

      Console.WriteLine(_);
    }
  }
}

This is what I call the “AWS” variety since it’s what is being used in Amazon’s SDK for .NET. You get the readability of the builder with methods right on the message object itself and after each method you have access to the mutable instance of the message. What’s not to like, except for the mutability?

Flavor 4 – Mutable event with getters and setters – explicit builder

namespace Flavor4_MutableEvent_GettersNSetters_ExplicitBuilder {
  public class SeatTypeCreated {
    public Guid ConferenceId { get; set; }
    public Guid SeatTypeId { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int Quantity { get; set; }

    // Optional ctor (if you're not happy with the datatype defaults)
    public SeatTypeCreated() {
      ConferenceId = Guid.Empty;
      SeatTypeId = Guid.Empty;
      Name = string.Empty;
      Description = string.Empty;
      Quantity = 0;
    }

    // Optional convenience method
    public SeatTypeCreatedBuilder ToBuilder() {
      return new SeatTypeCreatedBuilder().
        AtConference(ConferenceId).
        IdentifiedBy(SeatTypeId).
        Named(Name).
        DescribedAs(Description).
        WithInitialQuantity(Quantity);
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", Name, SeatTypeId, ConferenceId, Description, Quantity);
    }
  }

  public class SeatTypeCreatedBuilder {
    Guid _conferenceId;
    Guid _seatTypeId;
    string _name;
    string _description;
    int _quantity;

    // Optional ctor (if you're not happy with the datatype defaults)
    public SeatTypeCreatedBuilder() {
      _conferenceId = Guid.Empty;
      _seatTypeId = Guid.Empty;
      _name = string.Empty;
      _description = string.Empty;
      _quantity = 0;
    }

    public SeatTypeCreatedBuilder AtConference(Guid identifier) {
      _conferenceId = identifier;
      return this;
    }

    public SeatTypeCreatedBuilder IdentifiedBy(Guid identifier) {
      _seatTypeId = identifier;
      return this;
    }

    public SeatTypeCreatedBuilder Named(string value) {
      _name = value;
      return this;
    }

    public SeatTypeCreatedBuilder DescribedAs(string value) {
      _description = value;
      return this;
    }

    public SeatTypeCreatedBuilder WithInitialQuantity(int value) {
      _quantity = value;
      return this;
    }

    public SeatTypeCreated Build() {
      return new SeatTypeCreated {
        ConferenceId = _conferenceId,
        SeatTypeId = _seatTypeId,
        Name = _name,
        Description = _description,
        Quantity = _quantity
      };
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreatedBuilder().
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25).
        Build();

      Console.WriteLine(_);

      _.Quantity = 35;

      Console.WriteLine(_);

      var __ = _.ToBuilder().
        WithInitialQuantity(45).
        Build();

      Console.WriteLine(__);
    }
  }
}

This is a simple variation on the above, with the builder pulled out of the message object. It might come in handy if you don’t have control over the messages but still fancy builders (in that case, ToBuilder could become an extension method, as sketched below).
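
A minimal sketch of what such an extension method could look like, reusing the SeatTypeCreated and SeatTypeCreatedBuilder shapes from above:

public static class SeatTypeCreatedExtensions {
  // Turns an existing message into a builder without requiring
  // a ToBuilder member on the message class itself.
  public static SeatTypeCreatedBuilder ToBuilder(this SeatTypeCreated message) {
    return new SeatTypeCreatedBuilder().
      AtConference(message.ConferenceId).
      IdentifiedBy(message.SeatTypeId).
      Named(message.Name).
      DescribedAs(message.Description).
      WithInitialQuantity(message.Quantity);
  }
}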

Flavor 5 – Immutable event with getters and readonly fields – implicit builder

namespace Flavor5_ImmutableEvent_GettersNReadOnlyFields_ImplicitBuilder {
  public class SeatTypeCreated {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    public Guid ConferenceId {
      get { return _conferenceId; }
    }

    public Guid SeatTypeId {
      get { return _seatTypeId; }
    }

    public string Name {
      get { return _name; }
    }

    public string Description {
      get { return _description; }
    }

    public int Quantity {
      get { return _quantity; }
    }

    public SeatTypeCreated() {
      _conferenceId = Guid.Empty;
      _seatTypeId = Guid.Empty;
      _name = string.Empty;
      _description = string.Empty;
      _quantity = 0;
    }

    SeatTypeCreated(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    public SeatTypeCreated AtConference(Guid identifier) {
      return new SeatTypeCreated(identifier, _seatTypeId, _name, _description, _quantity);
    }

    public SeatTypeCreated IdentifiedBy(Guid identifier) {
      return new SeatTypeCreated(_conferenceId, identifier, _name, _description, _quantity);
    }

    public SeatTypeCreated Named(string value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, value, _description, _quantity);
    }

    public SeatTypeCreated DescribedAs(string value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, value, _quantity);
    }

    public SeatTypeCreated WithInitialQuantity(int value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, _description, value);
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", _name, _seatTypeId, _conferenceId, _description, _quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreated().
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25);

      Console.WriteLine(_);

      var __ = _.WithInitialQuantity(35);

      Console.WriteLine(__);
    }
  }
}

This is an immutable version of the “AWS” variety, giving you a new event upon each call. Great if you intend to have different branches off of the same message, such as for testing purposes. It pushes you down the functional alley if you want to collect data, since each mutation gives you a new instance. It’s probably my favorite when dealing with flat data structures. I’m no expert, but I’m pretty sure the immutable versions are going to use more memory (or at least put pressure on the GC’s Gen 0). Whether that’s something you should be overly focused on highly depends on your particular context.

“In the land of IO, the blind man, who doesn’t measure, optimizes for the wrong thing first.” – Yves anno 2013

Flavor 6 – Immutable event with getters and readonly fields – Mutable explicit builder

namespace Flavor6_ImmutableEvent_GettersNReadOnlyFields_MutableExplicitBuilder {
  public class SeatTypeCreated {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    public Guid ConferenceId {
      get { return _conferenceId; }
    }

    public Guid SeatTypeId {
      get { return _seatTypeId; }
    }

    public string Name {
      get { return _name; }
    }

    public string Description {
      get { return _description; }
    }

    public int Quantity {
      get { return _quantity; }
    }

    public SeatTypeCreated(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    // Optional convenience method
    public SeatTypeCreatedBuilder ToBuilder() {
      return new SeatTypeCreatedBuilder().
        AtConference(ConferenceId).
        IdentifiedBy(SeatTypeId).
        Named(Name).
        DescribedAs(Description).
        WithInitialQuantity(Quantity);
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", _name, _seatTypeId, _conferenceId, _description, _quantity);
    }
  }

  public class SeatTypeCreatedBuilder {
    Guid _conferenceId;
    Guid _seatTypeId;
    string _name;
    string _description;
    int _quantity;

    // Optional ctor (if you're not happy with the datatype defaults)
    public SeatTypeCreatedBuilder() {
      _conferenceId = Guid.Empty;
      _seatTypeId = Guid.Empty;
      _name = string.Empty;
      _description = string.Empty;
      _quantity = 0;
    }

    public SeatTypeCreatedBuilder AtConference(Guid identifier) {
      _conferenceId = identifier;
      return this;
    }

    public SeatTypeCreatedBuilder IdentifiedBy(Guid identifier) {
      _seatTypeId = identifier;
      return this;
    }

    public SeatTypeCreatedBuilder Named(string value) {
      _name = value;
      return this;
    }

    public SeatTypeCreatedBuilder DescribedAs(string value) {
      _description = value;
      return this;
    }

    public SeatTypeCreatedBuilder WithInitialQuantity(int value) {
      _quantity = value;
      return this;
    }

    public SeatTypeCreated Build() {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, _description, _quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreatedBuilder().
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25).
        Build();

      Console.WriteLine(_);

      var builder = _.ToBuilder();

      var __ = builder.
        Named("Balcony level").
        DescribedAs("High end seats. No smoking policy.").
        WithInitialQuantity(45).
        Build();

      Console.WriteLine(__);

      var ___ = builder.
        WithInitialQuantity(45).
        Build(); //Probably not the result you expect

      Console.WriteLine(___);
    }
  }
}

This kind of builder is great if you want to collect data using the same instance before producing the immutable message.

Flavor 7 – Immutable event with getters and readonly fields – Immutable explicit builder

namespace Flavor7_ImmutableEvent_GettersNReadOnlyFields_ImmutableExplicitBuilder {
  public class SeatTypeCreated {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    public Guid ConferenceId {
      get { return _conferenceId; }
    }

    public Guid SeatTypeId {
      get { return _seatTypeId; }
    }

    public string Name {
      get { return _name; }
    }

    public string Description {
      get { return _description; }
    }

    public int Quantity {
      get { return _quantity; }
    }

    public SeatTypeCreated(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    // Optional convenience method
    public SeatTypeCreatedBuilder ToBuilder() {
      return new SeatTypeCreatedBuilder().
        AtConference(ConferenceId).
        IdentifiedBy(SeatTypeId).
        Named(Name).
        DescribedAs(Description).
        WithInitialQuantity(Quantity);
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", _name, _seatTypeId, _conferenceId, _description, _quantity);
    }
  }

  public class SeatTypeCreatedBuilder {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    // Optional ctor content (if you're not happy with the datatype defaults)
    public SeatTypeCreatedBuilder() {
      _conferenceId = Guid.Empty;
      _seatTypeId = Guid.Empty;
      _name = string.Empty;
      _description = string.Empty;
      _quantity = 0;
    }

    SeatTypeCreatedBuilder(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    public SeatTypeCreatedBuilder AtConference(Guid identifier) {
      return new SeatTypeCreatedBuilder(identifier, _seatTypeId, _name, _description, _quantity);
    }

    public SeatTypeCreatedBuilder IdentifiedBy(Guid identifier) {
      return new SeatTypeCreatedBuilder(_conferenceId, identifier, _name, _description, _quantity);
    }

    public SeatTypeCreatedBuilder Named(string value) {
      return new SeatTypeCreatedBuilder(_conferenceId, _seatTypeId, value, _description, _quantity);
    }

    public SeatTypeCreatedBuilder DescribedAs(string value) {
      return new SeatTypeCreatedBuilder(_conferenceId, _seatTypeId, _name, value, _quantity);
    }

    public SeatTypeCreatedBuilder WithInitialQuantity(int value) {
      return new SeatTypeCreatedBuilder(_conferenceId, _seatTypeId, _name, _description, value);
    }

    public SeatTypeCreated Build() {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, _description, _quantity);
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreatedBuilder().
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25).
        Build();
      Console.WriteLine(_);

      var builder = _.ToBuilder();

      var __ = builder.
        Named("Balcony level").
        DescribedAs("High end seats. No smoking policy.").
        WithInitialQuantity(45).
        Build();

      Console.WriteLine(__);

      var ___ = builder.
        WithInitialQuantity(45).
        Build(); //The result you expect

      Console.WriteLine(___);
    }
  }
}

This one is only useful in the odd case where you’d like to branch off of the builder. It might be worthwhile to read Greg’s musings on this one and the previous one.

Flavor 8 – Immutable event with getters and readonly fields – Implicit builder combined with mutable explicit builder

namespace Flavor8_ImmutableEvent_GettersNReadOnlyFields_ImplicitBuilderCombinedWithMutableExplicitBuilder {
  public class SeatTypeCreated {
    readonly Guid _conferenceId;
    readonly Guid _seatTypeId;
    readonly string _name;
    readonly string _description;
    readonly int _quantity;

    public Guid ConferenceId {
      get { return _conferenceId; }
    }

    public Guid SeatTypeId {
      get { return _seatTypeId; }
    }

    public string Name {
      get { return _name; }
    }

    public string Description {
      get { return _description; }
    }

    public int Quantity {
      get { return _quantity; }
    }

    public SeatTypeCreated() {
      _conferenceId = Guid.Empty;
      _seatTypeId = Guid.Empty;
      _name = string.Empty;
      _description = string.Empty;
      _quantity = 0;
    }

    SeatTypeCreated(Guid conferenceId, Guid seatTypeId, string name, string description, int quantity) {
      _conferenceId = conferenceId;
      _seatTypeId = seatTypeId;
      _name = name;
      _description = description;
      _quantity = quantity;
    }

    public SeatTypeCreated AtConference(Guid identifier) {
      return new SeatTypeCreated(identifier, _seatTypeId, _name, _description, _quantity);
    }

    public SeatTypeCreated IdentifiedBy(Guid identifier) {
      return new SeatTypeCreated(_conferenceId, identifier, _name, _description, _quantity);
    }

    public SeatTypeCreated Named(string value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, value, _description, _quantity);
    }

    public SeatTypeCreated DescribedAs(string value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, value, _quantity);
    }

    public SeatTypeCreated WithInitialQuantity(int value) {
      return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, _description, value);
    }

    public Builder ToBuilder() {
      return new Builder().
        AtConference(ConferenceId).
        IdentifiedBy(SeatTypeId).
        Named(Name).
        DescribedAs(Description).
        WithInitialQuantity(Quantity);
    }

    public override string ToString() {
      return string.Format("New seat type created '{0}' ({1}) for conference '{2}': {3}. Initial seating quantity is {4}.", _name, _seatTypeId, _conferenceId, _description, _quantity);
    }

    public class Builder {
      Guid _conferenceId;
      Guid _seatTypeId;
      string _name;
      string _description;
      int _quantity;

      // Optional ctor (if you're not happy with the datatype defaults)
      internal Builder() {
        _conferenceId = Guid.Empty;
        _seatTypeId = Guid.Empty;
        _name = string.Empty;
        _description = string.Empty;
        _quantity = 0;
      }

      public Builder AtConference(Guid identifier) {
        _conferenceId = identifier;
        return this;
      }

      public Builder IdentifiedBy(Guid identifier) {
        _seatTypeId = identifier;
        return this;
      }

      public Builder Named(string value) {
        _name = value;
        return this;
      }

      public Builder DescribedAs(string value) {
        _description = value;
        return this;
      }

      public Builder WithInitialQuantity(int value) {
        _quantity = value;
        return this;
      }

      public SeatTypeCreated Build() {
        return new SeatTypeCreated(_conferenceId, _seatTypeId, _name, _description, _quantity);
      }
    }
  }

  public static class SampleUsage {
    public static void Show() {
      var _ = new SeatTypeCreated().
        AtConference(Guid.NewGuid()).
        IdentifiedBy(Guid.NewGuid()).
        Named("Terrace Level").
        DescribedAs("Luxurious, bubblegum stain free seats.").
        WithInitialQuantity(25);
      Console.WriteLine(_);

      var builder = _.ToBuilder();

      var __ = builder.
        Named("Balcony level").
        DescribedAs("High end seats. No smoking policy.").
        WithInitialQuantity(45).
        Build();

      Console.WriteLine(__);

      var ___ = _.
        WithInitialQuantity(45); //The result you expect

      Console.WriteLine(___);
    }
  }
}

A slight variation that may prove useful if you know by convention that builders are mutable and events/messages are not. This allows you to pass a builder around whenever opportunity knocks and get back to the immutable shape when you’re done using the builder. I’ll spare you flavor 9, which combines explicit mutable and immutable builders with implicit builders.

Congratulations, achievement unlocked “scrolled to the end“.

Conclusion

By no means is this list of flavors exhaustive. Other programming languages might have constructs which make authoring this way easier. Overall, I’m quite content and comfortable with using builders, having shifted somewhat more to the immutable side of the flavor spectrum. I’m pretty sure they’re not for everybody, and that’s just fine. The choice of which flavor to use highly depends on the scenario at hand, how comfortable a team is with them in general, and what benefits can be gained from them. I also feel you shouldn’t try to force the same type of builder on all events/messages, since the simplicity of some messages might not warrant the perceived complexity. A consistent approach is not something I’m particularly fond of, since it’s mainly driven by the human desire to do everything the same way, not by reasoning about what would be the best choice in a particular situation. Then again, I probably think too much – pretty sure some will think I’ve gone bananas. Maybe I have … what a way to end a post. Later!

A role to play

Every so often someone new arrives at the DDD/CQRS list (*) and topics such as set-based validation rear their head, resulting in near-endless threads of discussion before a common understanding is reached. A topic that isn’t discussed as often is that of roles in the domain model and how that would work in combination with event sourcing. If you want to read up on roles, Mark Seemann has some great posts on that topic on his blog, albeit in a slightly different context. There’s also this video by Udi Dahan about making roles explicit, which is more akin to what I’ll be touching upon here. Fanatics of whitepapers might get their brains washed by papers like “Role Interfaces“, “The Role Object Pattern“, “Modeling Roles” or “Mock Roles, not Objects“.

Let’s make it practical

Suppose I’m building a Realtor app that has a finite number of real estate property types (think apartment, villa, house, warehouse, etc…). I could model each type of property as a class/type (**), putting specific behavior on each of them. I’d end up with Apartment, Villa, House and Warehouse as aggregate root entity types (and hence as aggregates). But what if I had some common behavior that applies to all of them, and the calling code wants to be ignorant of the specific type, i.e. it is interested in the role that a property plays, not the specific type of property it represents? Let’s call that role ‘Property’ for lack of inspiration. I could implement it using a base class, an interface or even a totally separate class, as we’ll see in a moment. The calling code could be about adding a set of properties to a listing:

public class AddPropertiesToListingService {
  readonly IListingRepository _listingRepository;
  readonly IPropertyRepository _propertyRepository;

  public AddPropertiesToListingService(IListingRepository listingRepository, IPropertyRepository propertyRepository) {
    _listingRepository = listingRepository;
    _propertyRepository = propertyRepository;
  }

  public void AddPropertiesToListing(ListingId listingId, PropertyId[] propertyIds) {
    var listing = _listingRepository.Get(listingId);
    var properties = propertyIds.Select(propertyId => _propertyRepository.Get(propertyId));
    listing.AddProperties(properties);
  }
}

If this was modeled using either an interface or base class to denote the role, the repository code (***) would look a bit like this:

public interface IPropertyRepository {
  Property Get(Guid id);
}

public class PropertyRepository : IPropertyRepository {
  readonly IPropertyFactory _factory;
  readonly IEventStreamReader _reader;

  public PropertyRepository(IPropertyFactory factory, IEventStreamReader reader) {
    _factory = factory;
    _reader = reader;
  }

  public Property Get(Guid id) {
    var result = _reader.Read(id);
    if(!result.HasValue) {
      throw new PropertyNotFoundException(id);
    }
    var root = _factory.Create(result.Value);
    root.Initialize(result.Value.Events);
    return root;
  }
}

//This could also be a Func<EventStream, Property>
public interface IPropertyFactory {
  Property Create(EventStream eventStream);
}

public class EventStreamAnalyzingPropertyFactory : IPropertyFactory {
  readonly Dictionary<Type, Func<Property>> _propertyFactories;

  public EventStreamAnalyzingPropertyFactory() {
    _propertyFactories = new Dictionary<Type, Func<Property>>();
    //Assume that each of the aggregate root entities
    //has a static Factory method that creates a new instance.
    _propertyFactories.Add(typeof(ApartmentRegistered), () => Apartment.Factory());
    _propertyFactories.Add(typeof(VillaRegistered), () => Villa.Factory());
    _propertyFactories.Add(typeof(HouseRegistered), () => House.Factory());
    _propertyFactories.Add(typeof(WarehouseRegistered), () => Warehouse.Factory());
    //Remark: Yes, this is an OCP violation.
  }

  public Property Create(EventStream eventStream) {
    Func<Property> propertyFactory;
    if(!_propertyFactories.TryGetValue(eventStream.Events[0].GetType(), out propertyFactory))
      throw new PropertyUnknownException(eventStream.Id);
    return propertyFactory();
  }
}

public interface IEventStreamReader {
  Optional<EventStream> Read(Guid id);
}

public interface Optional<T> {
  bool HasValue { get; }
  T Value { get; }
}

public interface EventStream {
  Guid Id { get; }
  Int32 ExpectedVersion { get; }
  //In a real world implementation this would be streaming,
  //i.e. IEnumerable<object>
  object[] Events { get; }
}

public class PropertyNotFoundException : Exception {
  public PropertyNotFoundException(Guid id) { }
}

public class PropertyUnknownException : Exception {
  public PropertyUnknownException(Guid id) { }
}

//As an aside, Property could also be 
//turned into an interface to describe
//the role behavior.
public abstract class Property : AggregateRootEntity { 
  /* Common behavior can be put here */
}

//Similar for the other property types.
public class Villa : Property {
  public static readonly Func<Villa> Factory = () => new Villa();

  Villa() { /* ... */}

  /* Specific behavior can be put here */
}

Notice how the factory is made responsible for analyzing the event stream and deciding which Property type to instantiate based on the type of the first event. It should be obvious that “the type of the first event” is just one of the ways you could come to a decision about which Property type to instantiate; an alternative is sketched below.
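
As an illustration, here’s a sketch of a factory that decides based on a category carried in the stream’s metadata instead. Note that the Metadata member is an assumption on my part – it is not part of the EventStream interface shown above:

public class MetadataAnalyzingPropertyFactory : IPropertyFactory {
  readonly Dictionary<string, Func<Property>> _propertyFactories;

  public MetadataAnalyzingPropertyFactory() {
    _propertyFactories = new Dictionary<string, Func<Property>> {
      { "apartment", () => Apartment.Factory() },
      { "villa", () => Villa.Factory() },
      { "house", () => House.Factory() },
      { "warehouse", () => Warehouse.Factory() }
    };
  }

  public Property Create(EventStream eventStream) {
    //Assumes the stream carries a category value in its metadata,
    //written when the stream was first created (hypothetical).
    Func<Property> propertyFactory;
    if(!_propertyFactories.TryGetValue(eventStream.Metadata["category"], out propertyFactory))
      throw new PropertyUnknownException(eventStream.Id);
    return propertyFactory();
  }
}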

A slightly different scenario is one where you load the stream into a dedicated class, instead of relying on a base class or an interface to fulfill the role.

public class Property {
  public static readonly Func<Property> Factory = () => new Property();

  Property() { 
    /* Streams from each of the Property types can 
       be loaded into this Role class  */
    Register<ApartmentRegistered>(When);
    Register<VillaRegistered>(When);
    Register<HouseRegistered>(When);
    Register<WarehouseRegistered>(When);
    /* Notice how I haven't even brought up what 
       you could do if these were polymorphic
       messages */
  } 

  /* Common role behavior goes here */
}

public class PropertyRepository : IPropertyRepository {
  readonly IEventStreamReader _reader;

  public PropertyRepository(IEventStreamReader reader) {
    _reader = reader;
  }

  public Property Get(Guid id) {
    var result = _reader.Read(id);
    if(!result.HasValue) {
      throw new PropertyNotFoundException(id);
    }
    var root = Property.Factory();
    root.Initialize(result.Value.Events);
    return root;
  }
}

This is exactly why you shouldn’t use the type name of an aggregate as a form of stream identification (at least if you want to support this kind of scenario). The repository will just load up the stream, being totally ignorant of what class was used to produce the events in the stream in the first place and happily feed it to the Property class.

Another scenario where this technique could prove to be useful is when your entity goes through a life cycle where each state has very different behavior or behavior needs to be limited as of a certain stage in its life cycle. Of course, this shouldn’t be used as an excuse to NOT model things explicitly.
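
To make that a bit more tangible, here’s a rough sketch of a role that only exposes the behavior that is still valid once a property has been archived. The PropertyArchived event and the ArchivedProperty role are made up for illustration, and Register/Factory follow the same conventions as the Property role class above:

public class ArchivedProperty : AggregateRootEntity {
  public static readonly Func<ArchivedProperty> Factory = () => new ArchivedProperty();

  ArchivedProperty() {
    //The same streams as before can be loaded into this role,
    //it just doesn't care about most of the history.
    Register<PropertyArchived>(When);
  }

  void When(PropertyArchived @event) {
    /* capture whatever state this role needs */
  }

  /* Only behavior that is still valid in this stage goes here,
     e.g. Restore, purge-related operations, ... */
}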

Conclusion

The most important takeaway is that a stream of events does not need to be loaded into the same class all the time and that roles remain useful within a model backed by event sourcing. Like anything, this should be used with moderation and only if applicable.

(*) I’m reliving a scene with Arnold in Total Recall (1990) as I’m writing this (http://www.youtube.com/watch?feature=player_detailpage&v=WFMLGEHdIjE#t=86s).
(**) Having worked in the Realtor business, I can tell you right off the bat that having a class per real estate property type is going to hurt in the long run, but who am I to judge about the usefulness of this particular model.
(***) Don’t complain if the code doesn’t compile out of the box. I used my C# brain compiler.

Object Inheritance

When mentioning “object inheritance” most people immediately think of “class inheritance“. Alas, that’s not what it is. Quoting from Streamlined Object Modeling (*):

Object inheritance allows two objects representing a single entity to be treated as one object by the rest of the system.

Put a different way, where class inheritance allows for code reuse, object inheritance allows for data reuse.

Application

Where could this be useful? Depending on the domain you are in, I’d say quite a few places. Whenever you replicate data from one object to another object, you should stop and consider if “object inheritance” could be applicable. While object inheritance does impose some constraints, it also saves you from writing reconciliation code to synchronize two or more objects (especially in systems that heavily rely on messaging).
A common example I like to use is the one of a video title and a video tape (**). From a real world point of view, a video tape has both the attributes of the tape itself and of the title. Yet, if I modeled this as two separate objects, I’d run into trouble (a.k.a. synchronization woes). If I copy the title information to the tape upon creation of the tape, and I made an error in, say, the release date of the title, I now have to propagate that correction to the “affected” tapes. Don’t get me wrong, sometimes this kind of behavior is desirable, i.e. it makes sense from a business perspective. But what if it doesn’t? That’s where “object inheritance” comes in. To a large extent, it can cover the scenarios that a synchronization based solution can, IF the constraints are met.

Constraints

  • Localized data: “Object inheritance” assumes that the data of both objects lives close together. That might be a deal breaker for larger systems, or at least an indication that you’d have to consider co-locating the data of “both” objects.
  • Immutable data: the data of one of the objects, the “parent”, is immutable during object inheritance. From an OOP perspective that means you can only invoke side-effect free functions on a “parent”.
  • “Parent” object responsibilities: the parent object contains information and behaviors that are valid across multiple contexts, multiple interactions, and multiple variations of an object.
  • “Child” object responsibilities: the child object represents the parent in a specialized context, in a particular interaction, or as a distinct variation.

The aforementioned book also states other constraints such as the child object exhibiting the parent object’s “profile” (think interface), but I find those less desirable in an environment that uses CQRS. For more in-depth information, I strongly suggest you read it. It has a wealth of information on this very topic.

Code

In its simplest form “object inheritance” looks and feels like object composition.

The composition can be hidden from the code consuming the child object behind a repository’s facade, as shown below.

The gist of “object inheritance” is that the child object (the video tape) asks the parent (the video) for data or invokes side-effect free functions on the parent to accomplish its task.
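
A rough sketch of what that could look like, with member names of my own choosing, purely for illustration:

//The "parent": its data is immutable from the tape's point of view,
//only side-effect free functions/getters are exposed to the child.
public class VideoTitle {
  readonly string _title;
  readonly int _releaseYear;

  public VideoTitle(string title, int releaseYear) {
    _title = title;
    _releaseYear = releaseYear;
  }

  public string Title { get { return _title; } }
  public int ReleaseYear { get { return _releaseYear; } }
}

//The "child": represents the title in the context of one physical tape.
public class VideoTape {
  readonly VideoTitle _title;
  readonly string _barcode;

  public VideoTape(VideoTitle title, string barcode) {
    _title = title;
    _barcode = barcode;
  }

  //The tape answers questions about "its" title by asking the parent,
  //no title data is copied onto the tape.
  public string Label {
    get { return string.Format("{0} ({1}) - {2}", _title.Title, _title.ReleaseYear, _barcode); }
  }
}

//Hypothetical data access abstractions, only here to make the sketch hang together.
public interface IVideoTitleStore { VideoTitle Get(Guid id); }
public class VideoTapeData { public Guid TitleId; public string Barcode; }
public interface IVideoTapeDataStore { VideoTapeData Get(Guid id); }

//The repository facade hides the composition from consuming code.
public class VideoTapeRepository {
  readonly IVideoTitleStore _titles;
  readonly IVideoTapeDataStore _tapes;

  public VideoTapeRepository(IVideoTitleStore titles, IVideoTapeDataStore tapes) {
    _titles = titles;
    _tapes = tapes;
  }

  public VideoTape Get(Guid tapeId) {
    var data = _tapes.Get(tapeId);
    return new VideoTape(_titles.Get(data.TitleId), data.Barcode);
  }
}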

Lunch -> Not Free

Whether you go down the route of “object inheritance” or (message based) “synchronization”, you will have to think about object life-cycles (both parent and child). Sometimes it’s desirable for the child to have a still view of the parent, a snapshot if you will. Other times you may want the child to always see the parent in its most current form. Children can even change parent during their life-cycle. Other children may want a more “controlled” view of the parent, rolling forward or backward based on their needs or context. You can get pretty sophisticated with this technique, especially in an event sourced domain model since it’s very well suited to roll up to a certain point in time or a certain revision of a parent. In the same breath, I should also say that you can very well shoot yourself in the foot with this technique, especially if the composition is to be taking place in the user interface, in which case there’s no point in using this. It’s also very easy to bury yourself in the pit of abstraction when talking about “child” and “parent”, so do replace those words with nouns from your domain, and get your archetypes right.

All in all, I’ve found this a good tool in the box that plays nicely with aggregates, event sourcing, and even state-based models that track the concept of time. It’s not something I use every day, but whenever I did, I ended up with less code. YMMV.

(*) Streamlined object modeling is a successor to Peter Coad’s “Modeling In Color“. A Kindle version of the former is available.
(**) I’m from the “cassette & video” ages.

The Money Box

Once in a while I hear/read people struggle with projection performance. The root causes of their performance issues are diverse:

  • using the same persistence behavior during projection rebuild as during live/production build
  • use of an ill-fitting technology (here’s looking at you, EF) or use of said technology in the wrong way
  • too wide a persistence interface which makes it difficult to optimize for performance
  • no batching support or batching as an afterthought
  • not thinking about the implications of performing reads during projection

Ever heard of the book Refactoring to Patterns? It has a nice refactoring in there, called Move Accumulation to Collecting Parameter, which refers to the Collecting Parameter pattern on C2. How would this help with thy projections? Well, what if you could decouple the act of performing an action from collecting what is required to be able to perform that action? Put another way, what if you decoupled the act of executing SQL DML statements from collecting those statements during projection (a.k.a. event handling)? So, instead of …

… we add another level of indirection …

The most noticeable differences are the decoupling from the persistence technology (*), no reads, and no promises with regard to when the requested operations will be executed/flushed to storage. Usually, the IProjectionSqlOperations interface will have a very small surface (i.e. a low member count), covering INSERT, UPDATE and DELETE. During live/production projection building you could have an implementation of this interface (a.k.a. strategy) that flushes as soon as an operation is requested.
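
By way of illustration, a minimal sketch of what such a surface and a projection handler using it could look like – the member shapes are my own guess, and I’m borrowing the SeatTypeCreated event from the builder post above:

public interface IProjectionSqlOperations {
  void Insert(string table, IDictionary<string, object> values);
  void Update(string table, IDictionary<string, object> values, IDictionary<string, object> criteria);
  void Delete(string table, IDictionary<string, object> criteria);
}

//The projection handler only requests operations, it never executes them itself.
public class SeatTypeProjectionHandler {
  readonly IProjectionSqlOperations _operations;

  public SeatTypeProjectionHandler(IProjectionSqlOperations operations) {
    _operations = operations;
  }

  public void Handle(SeatTypeCreated @event) {
    _operations.Insert("SeatType", new Dictionary<string, object> {
      { "Id", @event.SeatTypeId },
      { "ConferenceId", @event.ConferenceId },
      { "Name", @event.Name },
      { "Description", @event.Description },
      { "Quantity", @event.Quantity }
    });
  }
}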

However, the more interesting implementations are the ones that are used during rebuild.

This implementation translates the requested operations into SQL statement objects (abstracted by ISqlStatement) and pushes them onto something that observes these SQL statements. The observer couldn’t care less what the actual SQL statements are (that happened in the projection handler above). The simplest observer implementation could look something like this …
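
Something along these lines, assuming an ISqlStatement abstraction and an observer interface of my own making:

public interface ISqlStatement {
  //Appends its text and parameters to an ADO.NET command.
  void WriteTo(SqlCommand command);
}

public interface ISqlStatementObserver {
  void Observe(ISqlStatement statement);
}

//The simplest thing that could possibly work: just collect.
public class CollectingSqlStatementObserver : ISqlStatementObserver {
  readonly List<ISqlStatement> _statements = new List<ISqlStatement>();

  public void Observe(ISqlStatement statement) {
    _statements.Add(statement);
  }

  public ISqlStatement[] Collected {
    get { return _statements.ToArray(); }
  }
}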

Of course, collecting in and by itself is not all that useful. You have to do something with what you’ve collected (“the money in the box”). Let’s look at another observer that takes a slightly different approach.

Without diving too much into the details, this observer flushes statements to the database as soon as a hard-coded threshold is reached. It does so in a batch-like fashion to minimize the number of roundtrips, while still adhering to the limitations that come with this particular ADO.NET data provider. Other implementations use SqlBulkCopy to maximize performance (but come with their own limitations). Depending on what memory resources a server has, you could get pretty creative as to which strategy you choose to rebuild a large projection.
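
Roughly, and building on the sketch above, such an observer could look like this (the threshold value and the batching details are my own assumptions; a real implementation would have to respect the provider’s limits, e.g. its maximum parameter count per command):

//Flushes in batches once a (hard-coded) threshold is reached.
public class BatchingSqlStatementObserver : ISqlStatementObserver {
  const int Threshold = 256; //arbitrary value, for illustration only

  readonly string _connectionString;
  readonly List<ISqlStatement> _buffer = new List<ISqlStatement>();

  public BatchingSqlStatementObserver(string connectionString) {
    _connectionString = connectionString;
  }

  public void Observe(ISqlStatement statement) {
    _buffer.Add(statement);
    if(_buffer.Count >= Threshold) {
      Flush();
    }
  }

  public void Flush() {
    if(_buffer.Count == 0) return;
    using(var connection = new SqlConnection(_connectionString)) {
      connection.Open();
      using(var command = connection.CreateCommand()) {
        //Each statement appends its text and parameters,
        //so the whole batch goes over the wire in one roundtrip.
        foreach(var statement in _buffer) {
          statement.WriteTo(command);
        }
        command.ExecuteNonQuery();
      }
    }
    _buffer.Clear();
  }
}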

Conclusion

I’ve shown you SQL centric projections, but please, do step out of the box. Nothing is stopping you from producing and collecting “HttpStatements” for your favorite key-value store or “FileOperations” for your file-based projections. Nothing is stopping you from making different choices for the producing and consuming side. Nothing is stopping you from doing the “statement” execution in an asynchronous and/or parallel fashion. It’s just a matter of exploring your options and use what works in your environment. Next time I’ll show you how reading fits into all this …

(*): Yeah, yeah, I know, too much abstraction kills kittens …
