Concepts

The essential concepts of Dolittle

The Concepts section helps you learn about the abstractions and components of Dolittle.

To learn how to write a Dolittle application, read our tutorial.

1 - Overview

Get a high-level outline of Dolittle and its components

Dolittle is a decentralized, distributed, event-driven microservice platform built to harness the power of events. It provides a reliable ecosystem in which microservices can thrive, so that you can build complex applications with small, focused microservices that are loosely coupled, event-driven and highly maintainable.

Components

  • Events are “facts that have happened” in your system and they form the truth of the system.
  • Event Handlers & Filters and Projections process events.
  • The Runtime is the core of all Dolittle applications and manages connections from the SDKs and other Runtimes to its Event Store. The Runtime is packaged as a Docker image.
  • The Event Store is the underlying database where the events are stored.
  • The Head is the user code that uses the SDKs, which connect to the Runtime in the same way as a client (SDK) connects to a server (runtime).
  • A Microservice is one or more Heads talking to a Runtime.
  • Microservices can produce and consume events between each other over the Event Horizon.

Event-Driven

Dolittle uses a style of Event-Driven Architecture called Event Sourcing, which means to “capture all changes to an application’s state as a sequence of events”. These events then form the “truth” of the system. Events cannot be changed or deleted, as they represent things that have happened.

With event sourcing your application’s state is no longer stored as a snapshot of the current state, but rather as the whole history of all state-changing events. These events can then be replayed to recreate the state whenever needed, e.g. replayed into a test environment to see how it would behave. The system can also produce the state it had at any point in time.

Event sourcing allows for high scalability thanks to the system being very loosely coupled; e.g. a stream of events can keep a set of in-memory databases updated instead of them having to query a master database.

The history of events also forms an audit log to help with debugging and auditing.

Distributed & Decentralized

Dolittle applications are built from microservices that communicate with each other using events. These microservices can scale and fail independently as there is no centralized message bus like in Kafka. The Runtimes and event stores are independent of other parts of the system.

Microservice

A microservice consists of one or many heads talking to one Runtime. Each microservice is autonomous and has its own resources and event store.

The core idea is that a microservice is an independently scalable unit of deployment that can be reused in other parts of the software however you like. You could compose it back into one application running inside a single process, or you could spread it across a cluster. It really is a deployment choice once the software gives you this freedom.

This diagram shows the anatomy of a microservice with one head.

Example anatomy of a Dolittle microservice

Multi-tenancy

Since computing is the most expensive resource, the Dolittle Runtime and SDKs have been built from the ground up with multi-tenancy in mind. Multi-tenancy means that a single instance of the software and its supporting infrastructure serves multiple customers, making optimal use of resources. Dolittle supports multi-tenancy by separating the event stores for each tenant, so that each tenant only has access to its own data.

This diagram shows a microservice with 2 tenants, each of them with their own resources.

Example of multi-tenant microservice

What Dolittle isn’t

Dolittle is not a traditional backend library nor an event driven message bus like Kafka. Dolittle uses Event Sourcing, which means that the state of the system is built from an append-only Event Store that has all the events ever produced by the application.

Dolittle does not provide a solution for read models/cache. Different situations call for different databases depending on the sort of load and data to be stored. The event store only defines how the events are written in the system, it doesn’t define how things are read or interpreted.

Dolittle isn’t a CQRS framework, though it started out as one.

Technology

The Event Store is implemented with MongoDB.

2 - Events

The source of truth in the system

An Event is a serializable representation of “a fact that has happened within your system”.

“A fact”

An event is a change (fact) within our system. The event itself contains all the relevant information concerning the change. At its simplest, an event can be represented by a name (type) if it’s enough to describe the change.

More commonly, it is a simple Data Transfer Object (DTO) that contains state and properties describing the change. It does not contain any calculations or behavior.
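For example, a minimal sketch of such an event as a C# DTO, assuming an attribute in the style of the .NET SDK's EventType (the GUID and the class itself are made up for illustration):

[EventType("1844473f-d714-4327-8b7f-5b3c2bdfc26a")]
public class DishPrepared
{
    public DishPrepared(string dish, string chef)
    {
        Dish = dish;
        Chef = chef;
    }

    // plain state describing the change; no behavior or calculations
    public string Dish { get; }
    public string Chef { get; }
}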

“that has happened”

As the event has happened, it cannot be changed, rejected, or deleted. This forms the basis of Event Sourcing. If you wish to change the action or the state change that the event encapsulates, then it is necessary to initiate an action that results in another event that nullifies the impact of the first event.

This is common in accounting, for example: Sally adds $100 to her bank account, which results in an event like “$100 added to Sally’s account”. But if the bank accidentally adds $1000 instead of the $100, a correcting event should be committed, like “$900 subtracted from Sally’s account”. With event sourcing, this information is preserved in the event store for e.g. later auditing purposes.

Naming

To indicate that the event “has happened in the past”, it should be named as a verb in the past tense. Often it can contain the name of the entity that the change or action is affecting.

Good names:

  • DishPrepared
  • ItemAddedToCart

Bad names (imperative commands, not past-tense facts):

  • StartCooking
  • AddItemToCart

“within your system”

An event represents something interesting that you wish to capture in your system. Instead of seeing state changes and actions as side effects, they are explicitly modeled within the system and captured within the name, state and shape of our Event.

State transitions are an important part of our problem space and should be modeled within our domain — Greg Young

Naming

An event should be expressed in language that makes sense in the domain, also known as Ubiquitous Language. You should avoid overly technical/CRUD-like events where such terms are not used in the domain.

For example, in the domain of opening up the kitchen for the day and adding a new item to the menu:

Good names:

  • KitchenOpened
  • DishAddedToMenu

Bad names (overly technical/CRUD-like, not domain language):

  • TakeoutServerReady
  • MenuListingElementUpdated

Main structure of an Event

This is a simplified structure of the main parts of an event. For the Runtime, the event is only a JSON-string which is saved into the Event Store.

Event {
    Content object
    EventLogSequenceNumber int
    EventSourceId string
    Public bool
    EventType {
        EventTypeId Guid
        Generation int
    }
}

For the whole structure of an event as defined in protobuf, please check Contracts.

Content

This is the content of the event to be committed. It needs to be serializable to JSON.

EventLogSequenceNumber

This is the event’s position in the Event Log. It uniquely identifies the event.

EventSourceId

EventSourceId represents the source of the event like a “primary key” in a traditional database. The value of the event source id is simply a string, and we don’t enforce any particular rules or restrictions on the event source id. By default, partitioned event handlers use it for partitioning.
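As a hedged sketch of how this looks in practice, here is committing an event with an event source id through a .NET SDK-style client; the builder methods follow the Dolittle tutorial, but may differ between SDK versions:

var preparedTaco = new DishPrepared("Bean Blaster Taco", "Mr. Taco");

// commit the event for a tenant, using the kitchen's name as the event source id
await client.EventStore
    .ForTenant(TenantId.Development)
    .CommitEvent(content: preparedTaco, eventSourceId: "Dolittle Tacos");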

Public vs. Private

There is a basic distinction between private events and public events. In much the same way that you would not grant access to other applications to your internal database, you do not allow other applications to receive any of your private events.

Private events are only accessible within a single Tenant so that an event committed for one tenant cannot be handled outside of that tenant.

Public events are also accessible within a single tenant, but they can additionally be added to a public Stream through a public filter for other microservices to consume. Your public event streams essentially form a public API for the other microservices to subscribe to.

EventType

An EventType is the combination of an EventTypeId to uniquely identify the type of event it is and the event type’s Generation. This decouples the event from a programming language and enables the renaming of events as the domain language evolves.

For the Runtime, the event is just a JSON-string. It doesn’t know about the event’s content, properties, or type (in its respective programming language). The Runtime saves the event to the event log and from that point the event is ready to be processed by the EventHandlers & Filters. For this event to be serialized to JSON and then deserialized back to a type that the client’s filters and event handlers understand, an event type is required.

This diagram shows us a simplified view of committing a single event with the type of DishPrepared. The Runtime receives the event, and sends it back to us to be handled. Without the event type, the SDK wouldn’t know how to deserialize the JSON message coming from the Runtime.

Flow of committing an event type

Event types are also important when wanting to deserialize events coming from other microservices. As the other microservice could be written in a completely different programming language, event types provide a level of abstraction for deserializing the events.

Generations

Generations are still under development. At the moment they are best left alone.

As the code changes, the structures and contents of your events are also bound to change at some point. In most scenarios, you will see that you need to add more information to events. These iterations on the same event type are called generations. Whenever you add or change a property in an event, the generation should be incremented to reflect that it’s a new version of the event. This way the filters and handlers can handle different generations of an event.

3 - Streams

Get an overview of Event Streams

So, what is a stream? A stream is simply a list with two specific attributes:

  • Streams are append-only, meaning that items can only be added at the very end of the stream, and the stream is not of a fixed length.
  • Items in the stream are immutable; neither the items nor their order can change.

An event stream is simply a stream of events. Each stream is uniquely identified within an Event Store by a GUID. An event can belong to many streams, and in most cases it will belong to at least two streams (one being the event log).

As streams are append-only, an event can be uniquely identified by its position in a stream, including in the event log.

Event streams are perhaps the most important part of the Dolittle platform. To get a different and more detailed perspective on streams, please read our section on event sourcing and streams.

Rules

There are rules on streams to maintain idempotency and the predictability of the Runtime. These rules are enforced by the Runtime:

  • The ordering of the events cannot change
  • Events can only be appended to the end of the stream
  • Events cannot be removed from the stream
  • A partitioned stream cannot be changed to be unpartitioned and vice versa

Partitions

If we dive deeper into event streams we’ll see that there are two types of streams in the Runtime: partitioned and unpartitioned streams.

A partitioned stream is a stream that is split into chunks. These chunks are uniquely identified by a PartitionId (string). Each item in a partitioned stream can only belong to a single partition.

An unpartitioned stream only has one chunk with a PartitionId of 00000000-0000-0000-0000-000000000000.

There are multiple reasons for partitioning streams. One of the benefits is that it gives a way for the developers to partition their events and the way they are processed in an Event Handler. Another reason for having partitions becomes apparent when needing to subscribe to other streams in other microservices. We’ll talk more about that in the Event Horizon section.

Public vs Private Streams

There are two different types of event streams: public and private. Private streams are exposed within their Tenant, and public streams are additionally exposed to other microservices. Through the Event Horizon other microservices can subscribe to your public streams. Using a public filter you can filter public events into public streams.

Stream Processor

A stream processor consists of an event stream and an event processor. It takes in a stream of events, calls the event processor to process the events in order, keeps track of which events have already been processed, which have failed and when to retry. Each stream processor can be seen as the lowest level unit-of-work in regards to streams and they all run at the same time, side by side, in parallel.

Since each stream is also uniquely identified by a stream id, we can identify each stream processor by its SourceStream, EventProcessor pairing.

// structure of a StreamProcessor
StreamProcessor {
    SourceStream Guid
    EventProcessor Guid
    // the next event to be processed
    Position int
    // for keeping track of failures and retry attempts
    LastSuccessfullyProcessed DateTime
    RetryTime DateTime
    FailureReason string
    ProcessingAttempts int
    IsFailing bool
}

The stream processors play a central role in the Runtime. They enforce the most important rules of Event Sourcing: an event in a stream is not processed twice (unless the stream is being replayed), and no event in a stream is skipped while processing.

Stream processors are constructs that are internal to the Runtime and there is no way for the SDK to directly interact with stream processors.

Dealing with failures

What should happen when a processor fails? We cannot skip faulty events, which means that the event processor has to halt until we can successfully process the event. This problem can be mitigated with a partitioned stream because the processing only stops for that single partition. This way we can keep processing the event stream even though one, or several, of the partitions fail. The stream processor will at some point retry processing the failing partitions and continue normally if it succeeds.

Event Processors

There are 2 different types of event processors:

  • Filters that can create new streams
  • Processors that process the event in the user’s code

These are defined by the user with Event Handlers & Filters.

When the processing of an event is completed, a processing result is returned to the stream processor. This result says whether the processing succeeded or not. If it did not succeed, it also says how many times processing of that event has been attempted, whether it should be retried, and how long to wait before retrying.

Multi-tenancy

When registering processors they are registered for every tenant in the Runtime, resulting in every tenant having their own copy of the stream processor.

Formula for calculating the total number of stream processors created:

(((2 x event handlers) + filters) x tenants)  + event horizon subscriptions = stream processors

Let’s provide an example:

A plain filter or event processor needs only one stream processor each, but an event handler needs two, because it consists of both a filter and an event processor. If the Runtime has 10 tenants and the head has registered 20 event handlers, we end up with a total of 20 x 2 x 10 = 400 stream processors.
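To make the arithmetic easy to check, here is a trivial snippet that evaluates the formula; the filter and subscription counts are made up so that every term is exercised:

// ((2 x event handlers) + filters) x tenants + event horizon subscriptions
int eventHandlers = 20, filters = 3, tenants = 10, subscriptions = 2;
int streamProcessors = ((2 * eventHandlers) + filters) * tenants + subscriptions;
// ((2 * 20) + 3) * 10 + 2 = 432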

4 - Event Handlers & Filters

Overview of event handlers and filters

In event-sourced systems it is usually not enough to just say that an Event occurred. You’d expect that something should happen as a result of that event occurring as well.

In the Runtime we can register 2 different processors that can process events: Event Handlers and Filters. They take in a Stream of events as input and do something to each individual event.

Each of these processors is a combination of one or more Stream Processors and an Event Processor. What it does to the event depends on what kind of processor it is. We’ll talk more about the different processors later in this section.

Registration

In order to deal with committed events, the heads need to register their processors. The Runtime offers endpoints that initiate the registration of the different processors. Only registered processors will be run. When the head disconnects from the Runtime, all of its registered processors are automatically unregistered, and when it re-connects it re-registers them. Unregistered processors sit idle in the Runtime until they are re-registered.

Scope

Each processor processes events within a single scope. If not specified, they process events from the default scope. Events coming over the Event Horizon are saved to a scope defined by the event horizon Subscription.

Filters

The filter is a processor that creates a new stream of events from the event log. It is identified by a FilterId, and it can create either a partitioned or an unpartitioned stream. The processing in the filter itself is not partitioned, however, since it can only operate on the event log stream, which is an unpartitioned stream.

Filter

The filter is a powerful tool because it can create an entirely customized stream of events. It is up to the developer how to filter the events; during filtering, both the content and the metadata of the event are available for the filter to consider. If the filter creates a partitioned stream, it also needs to decide which partition each event belongs to.

However, with great power comes great responsibility. Filters cannot be changed in a way that breaks the rules of streams. If a change would break them, the Runtime notices it and returns a failed registration response to the head that tried to register the filter.
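To illustrate the kind of decision a filter makes, here is a hypothetical filter callback; the signature is illustrative, not the SDK's exact API. It inspects a committed event and reports whether to include it and, for a partitioned stream, which partition it belongs to (DishPrepared and DishAddedToMenu are event types assumed to be defined elsewhere):

// hypothetical shape of a partitioned filter decision
public record FilterResult(bool IsIncluded, string PartitionId);

public static FilterResult FilterDishEvents(object content, string eventSourceId)
{
    // include only the event types this stream cares about,
    // partitioning by the event source the event came from
    var isIncluded = content is DishPrepared or DishAddedToMenu;
    return new FilterResult(isIncluded, eventSourceId);
}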

Public Filters

Since there are two types of streams, there are two kinds of filters: public and private. They function in the same way, except that private filters create private streams and public filters create public streams. Only public events can be filtered into a public stream.

Event Handlers

The event handler is a combination of a filter and an event processor. It is identified by an EventHandlerId, which is the id of both the filter and the event processor.

Event Handler

The event handler’s filter filters events based on the EventType that the event handler handles.

Event handlers can be either partitioned or unpartitioned. Partitioned event handlers use, by default, the EventSourceId of each event as the partition id. The filter follows the same rules for streams as other filters.
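A hedged sketch of an event handler in the style of the .NET SDK; the attribute and Handle method shape follow the Dolittle tutorial, but the GUID is made up:

[EventHandler("f2d366cf-c00a-4479-acc4-851e04b6fbba")]
public class DishHandler
{
    // called once for each committed DishPrepared event, in order
    public void Handle(DishPrepared @event, EventContext eventContext)
    {
        Console.WriteLine($"{@event.Chef} has prepared {@event.Dish}.");
    }
}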

Changes to event handlers

As event handlers create a stream based on the types of events they handle, they have to uphold the rules of streams. Every time an event handler is registered, the Runtime checks that these rules are upheld and that the event handler’s definition wouldn’t invalidate the already existing stream. The most common ways of breaking the rules are:

  • The event handler stops handling an event type that it has already handled. This would mean that events would have to be removed from the stream, breaking the append-only rule.

Event Handler creates an invalid stream by removing an already handled event type

  • The event handler starts handling a new event type that has already occurred in the event log. This would mean changing the ordering of events in the streams and break the append-only rule.

Event Handler creates an invalid stream by adding a new event type

It is possible to add a new type of event to the handler if it doesn’t invalidate the stream. For example, you can add a new event type to the handler if no event of that type has ever been committed to the event log before any of the other handled types of events.

Replaying events

An event handler is meant to handle each event only once. However, if you for some reason need to “replay” or “re-handle” all or some of the events for an event handler, you can use the Dolittle CLI to initiate this while the microservice is running.

The replay does not allow you to change which event types the event handler handles. To do that, you need to change the event handler’s EventHandlerId. This registers a completely new event handler with the Runtime, and a completely new stream is created. This way no old streams are invalidated.

If you want to have an event handler for read models which replays all of its events whenever it changes, try using Projections instead, as they are designed to allow frequent changes.

Multi-tenancy

When registering processors they are registered for every tenant in the Runtime, resulting in every tenant having their own copy of the Stream Processor.

5 - Projections

Overview of projections

A Projection is a special type of Event Handler that only deals with updating or deleting Read Models based on the Events that it handles. The read model instances are managed by the Runtime in a read model store, from which they are fetched whenever needed. This is useful when you want to create views from events but don’t want to manually manage the read model database.

Read models define the data views that you are interested in presenting, while a projection specifies how to compute this view from the event store. There is a one-to-one relationship between a projection and its corresponding read model. A projection can produce multiple instances of that read model, and it will assign each of them a unique key. This key is based on the projection’s key selectors.

Example of a projection:

Diagram of projections

Read model

A read model represents a view into the data in your system and is used when you want to show data or build a view. It’s essentially a Data Transfer Object (DTO) specialized for reading. Read models are computed from the events and are, as such, read-only objects without any behaviour as seen from the user interface. Some also refer to read models as materialized views.

As read models are computed objects, you can make as many as you want based on whatever events you would like. We encourage you to make every read model single purpose and specialized for a particular use. By splitting up or combining data so that a read model matches exactly what an end-user sees on a single page, you’ll be able to iterate on these views without having to worry how it will affect other pages.

On the other hand, if you end up having to fetch more than one read model to get the necessary data for a single page, you should consider combining those read models.

The read models are purely computed values, so you are free to throw them away or recreate lost ones at any point in time without losing any data.

The Runtime stores the read models in a read model store, which is defined in resources.json. Each read model gets its own unique key, which is defined by the projection’s key selector.

Projection

A projection’s purpose is to populate the data structure (read model) with information from the event store. Projections behave mostly like an event handler, but they don’t produce a Stream from the events that they handle. This means that changing a projection (like adding or removing handle methods from it) will always make it replay and recalculate the read models from the start of the Event Log. This makes it easier to iterate on and develop these read models.

This is a simplified structure of a projection:

Projection {
    ProjectionId Guid
    Scope Guid
    ReadModel type
    EventTypes EventType[]
}

For the whole structure of a projection as defined in protobuf, please check Contracts.

Key selector

Each read model instance has a key, which uniquely identifies it within a projection. A projection handles multiple instances of its read model by fetching the read model with the correct key. It then applies the changes of the on-methods to that read model instance.

The projection fetches the correct read model instance by specifying a key selector for each on-method. There are 3 different key selectors (a sketch using one of them follows the list):

  • Event source based key selector, which defines the read model instance’s key as the event’s EventSourceId.
  • Event property based key selector, which defines the key as a property of the handled event.
  • Partition based key selector, which defines the key as the PartitionId of the event’s stream.
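A hedged sketch of a projection using the event source based key selector, modeled on the .NET SDK's attribute style; the GUID is made up and attribute names may vary by SDK version:

[Projection("98f9db66-b6ca-4e5f-9fc3-6626c80afd52")]
public class DishCounter
{
    public int NumberOfTimesPrepared = 0;

    // one DishCounter instance is kept per EventSourceId (the key)
    [KeyFromEventSource]
    public void On(DishPrepared @event, ProjectionContext context)
        => NumberOfTimesPrepared++;
}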

6 - Tenants

What is a Tenant & Multi-tenancy

Dolittle supports having multiple tenants using the same software out of the box.

What is a Tenant?

A Tenant is a single client that’s using the hosted software and infrastructure. In a SaaS (Software-as-a-Service) domain, a tenant is usually a single customer using the service. The tenant has its own privileges and resources that only it has access to.

What is Multi-tenancy?

In a multi-tenant application, the same instance of the software is used to serve multiple tenants. An example of this would be an e-commerce SaaS: the same basic codebase is used by multiple different customers, each of whom has their own customers and their own data.

Multi-tenancy allows for easier scaling, sharing of infrastructure resources, and easier maintenance and updates to the software.

Simple explanation of multi tenancy

Multi-tenancy in Dolittle

In Dolittle, every tenant in a Microservice is identified by a GUID. Each tenant has their own Event Store, managed by the Runtime. These event stores are defined in the Runtime configuration files. The tenants all share the same Runtime, which is why you need to specify the tenant to connect to when using the SDKs.

7 - Event Horizon

Learn about Event Horizon, Subscriptions, Consumers and Producers

At the heart of the Dolittle runtime sits the concept of Event Horizon. Event horizon is the mechanism for a microservice to give Consent for another microservice to Subscribe to its Public Stream and receive Public Events.

Anatomy of an Event Horizon subscription

Producer

The producer is a Tenant in a Microservice that has one or more public streams that a Consumer can subscribe to. Only public events are eligible to be filtered into a public stream.

Once an event moves past the event horizon, the producer will no longer see it. The producer doesn’t know or care what happens with an event after it has gone past the event horizon.

The producer has to give consent for a consumer to subscribe to a Partition in the producer’s public stream. Consents are defined in event-horizon-consents.json.

Consumer

A consumer is a tenant that subscribes to a partition in one of the producer’s public streams. The events coming from the producer are stored in a Scoped Event Log in the consumer’s event store. This way, even if the producer is removed or deprecated, the produced events are still saved in the consumer. To process events from a scoped event log you need scoped event handlers & filters.

The consumer sets up the subscription and will keep asking the producer for events. The producer’s Runtime checks whether it has a consent for that specific subscription and only allows events to flow if that consent exists. If the producer goes offline or doesn’t consent, the consumer will keep retrying.

Subscription

A subscription is set up by the consumer to receive events from a producer. Additionally, the consumer has to add the producer to its microservices.json.

This is a simplified structure of a Subscription in the consumer.

Subscription {
    // the producer's microservice, tenant, public stream and partition
    MicroserviceId Guid
    TenantId Guid
    PublicStreamId Guid
    PartitionId string
    // the consumer's scoped event log
    ScopeId Guid
}
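For illustration, here is a hedged sketch of setting up such a subscription with a .NET SDK-style fluent builder as part of the client setup; the GUIDs are placeholders and the method names may differ between SDK versions:

// inside the Dolittle client setup builder (names and GUIDs illustrative)
.WithEventHorizons(eventHorizons => eventHorizons
    .ForTenant(TenantId.Development, subscriptions => subscriptions
        // the producer's microservice, tenant, public stream and partition
        .FromProducerMicroservice("f39b1f61-d360-4d01-b775-1e3df609e406")
        .FromProducerTenant(TenantId.Development)
        .FromProducerStream("2c087657-b318-40b1-ae92-a400de44e507")
        .FromProducerPartition("Dolittle Tacos")
        // the consumer's scoped event log that stores the received events
        .ToScope("808ddde4-c937-4f5c-9dc2-140580f6919e")))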

Event migration

We’re working on a solution for event migration strategies using Generations. As of now there is no mechanism for dealing with generations, so they are best left alone. Extra caution should be taken when changing public events so as not to break other microservices consuming those events.

8 - Event Store

Introduction to the Event Store

An Event Store is a database optimized for storing Events in an Event Sourced system. The Runtime manages the connections and the structure of the stored data. All Streams, Event Handlers & Filters, Aggregates and Event Horizon Subscriptions are kept track of inside the event store.

Events saved to the event store cannot be changed or deleted. It acts as the record of all events that have happened in the system from the beginning of time.

Each Tenant has their own event store database, which is configured in resources.json.
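An illustrative sketch of the shape of that file: one entry per tenant GUID, each pointing at its own MongoDB event store database. Field names should be checked against your Runtime version; the tenant id shown is the conventional development tenant:

{
    "445f8ea8-1a6f-40d7-b2fc-796dba92dc44": {
        "eventStore": {
            "servers": [ "localhost" ],
            "database": "event_store"
        }
    }
}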

Scope

Events that came over the Event Horizon need to be put into a scoped collection so they won’t be mixed with the other events from the system.

Scoped collections work the same way as other collections, except you can’t have Public Streams or Aggregates.

Structure of the Event Store

This is the structure of the event store implemented in MongoDB. It includes the following collections in the default Scope:

  • event-log
  • aggregates
  • stream-processor-states
  • stream-definitions
  • stream-<streamID>
  • public-stream-<streamID>

Scoped collections follow the same structure, with the scope’s id included in the collection name.

The following JSON structure examples show each property’s BSON type as the value.

event-log

The Event Log includes all the Events committed to the event store in chronological order. All streams are derived from the event log.

Aggregate events have "wasAppliedByAggregate": true set and events coming over the Event Horizon have "FromEventHorizon": true set.

This is the structure of a committed event:

{
    // this is the event's EventLogSequenceNumber,
    // which identifies the event uniquely within the event log
    "_id": "decimal",
    "Content": "object",
    // Aggregate metadata
    "Aggregate": {
        "wasAppliedByAggregate": "bool",
        // AggregateRootId
        "TypeId": "UUID",
        // AggregateRoot Version
        "TypeGeneration": "long",
        "Version": "decimal"
    },
    // EventHorizon metadata
    "EventHorizon": {
        "FromEventHorizon": "bool",
        "ExternalEventLogSequenceNumber": "decimal",
        "Received": "date",
        "Concent": "UUID"
    },
    // the committing microservice's metadata
    "ExecutionContext": {
        "Correlation": "UUID",
        "Microservice": "UUID",
        "Tenant": "UUID",
        "Version": "object",
        "Environment": "string",
    },
    // the event's metadata
    "Metadata": {
        "Occurred": "date",
        "EventSource": "string",
        // EventTypeId and Generation
        "TypeId": "UUID",
        "TypeGeneration": "long",
        "Public": "bool"
    }
}

aggregates

This collection keeps track of all instances of Aggregates registered with the Runtime.

{
    "EventSource": "string",
    // the AggregateRootId
    "AggregateType": "UUID",
    "Version": "decimal"
}

stream

A Stream contains all the events filtered into it. Its structure is the same as the event-log, with an extra Partition property used for partitions.

The stream’s StreamId is added to the collection’s name, e.g. a stream with the id of 323bcdb2-5bbd-4f13-a7c3-b19bc2cc2452 would be in a collection called stream-323bcdb2-5bbd-4f13-a7c3-b19bc2cc2452.

{
    // same as an Event in the "event-log" + Partition
    "Partition": "string",
}

public-stream

The same as a stream, except it is only for Public Streams, with the public- prefix in the collection name. Public streams can only exist in the default scope.

stream-definitions

This collection contains all Filters registered with the Runtime.

Filters defined by an Event Handler have a type of EventTypeId, while other filters have a type of Remote.

{
    // id of the Stream the Filter creates
    "_id": "UUID",
    "Partitioned": "bool",
    "Public": "bool",
    "Filter": {
        "Type": "string",
        "Types": [
            // EventTypeIds to filter into the stream
        ]
    }
}

stream-processor-states

This collection keeps track of all Stream Processors’ Event Processors and their state. Each event processor can be either a Filter or an Event Processor that handles the events from an event handler.

Filter:

{
    "SourceStream": "UUID",
    "EventProcessor": "UUID",
    "Position": "decimal",
    "LastSuccesfullyProcessed": "date",
    // failure tracking information
    "RetryTime": "date",
    "FailureReason": "string",
    "ProcessingAttempts": "int",
    "IsFailing": "bool
}

Event Processor:

Partitioned streams will have a FailingPartitions property for tracking the failure information per partition. It will be empty if there are no failing partitions. The partition’s id is the same as the failing event’s EventSourceId. As each partition can fail independently, the "Position" value for the stream processor at large can differ from the failing partition’s "Position".

{
    "Partitioned": true,
    "SourceStream": "UUID",
    "EventProcessor": "UUID",
    "Position": "decimal",
    "LastSuccessfullyProcessed": "date",
    "FailingPartitions": {
        // for each failing partition
        "<partition-id>": {
            // the position of the failing event in the stream
            "Position": "decimal",
            "RetryTime": "date",
            "Reason": "string",
            "ProcessingAttempts": "int",
            "LastFailed": "date"
        }
    }
}

subscription-states

This collection keeps track of Event Horizon Subscriptions in a very similar way to stream-processor-states.

{
    // the producer's microservice, tenant and stream info
    "Microservice": "UUID",
    "Tenant": "UUID",
    "Stream": "UUID",
    "Partition": "string",
    "Position": "decimal",
    "LastSuccesfullyProcessed": "date",
    "RetryTime": "date",
    "FailureReason": "string",
    "ProcessingAttempts": "int",
    "IsFailing": "bool
}

Commit vs Publish

We use the word Commit rather than Publish when talking about saving events to the event store. We want to emphasize that it’s the event store that is the source of truth in the system. The act of calling filters/event handlers comes after the event has been committed to the event store. We also don’t publish to any specific stream, event handler or microservice. After the event has been committed, it’s ready to be picked up by any processor that listens to that type of event.

9 - Event Sourcing

Overview of Event Sourcing in the Dolittle Platform

Event Sourcing is an approach that derives the current state of an application from the sequential Events that have happened within the application. These events are stored to an append-only Event Store that acts as a record for all state changes in the system.

Events are facts and Event Sourcing is based on the incremental accretion of knowledge about our application / domain. Events in the log cannot be changed or deleted. They represent things that have happened. Thus, in the absence of a time machine, they cannot be made to un-happen.

Here’s an overview of Event Sourcing:

Basic anatomy of event sourcing

Problem

A traditional model of dealing with data in applications is CRUD (create, read, update, delete). A typical example is to read data from the database, modify it, and update the current state of the data. Simple enough, but it has some limitations:

  • Data operations are done directly against a central database, which can slow down performance and limit scalability
  • The same piece of data is often accessed from multiple sources at the same time. To avoid conflicts, transactions and locks are needed
  • Without additional auditing logs, the history of operations is lost. More importantly, the reason for changes is lost.

Advantages with Event Sourcing

  • Horizontal scalability
    • With an event store, it’s easy to separate change handling and state querying, allowing for easier horizontal scaling. The events and their projections can be scaled independently of each other.
    • Event producers and consumers are decoupled and can be scaled independently.
  • Flexibility
    • The Event Handlers react to events committed to the event store. The handlers know about the event and its data, but they don’t know or care what caused the event. This provides great flexibility and can be easily extended/integrated with other systems.
  • Replayable state
    • The state of the application can be recreated by just re-applying the events. This enables rollbacks to any previous point in time.
    • Temporal queries make it possible to determine the state of the application/entity at any point in time.
  • Events are natural
    • Events describe things that have happened in the domain’s own language, which makes them natural to reason about and communicate.
  • Audit log
    • The whole history of changes is recorded in an append-only store for later auditing.
    • Instead of being a simple record of reads/writes, the reason for change is saved within the events.

Problems with Event Sourcing

  • Eventual consistency
    • As the events are separated from the projections made from them, there will be some delay between committing an event and handling it in handlers and consumers.
  • Event store is append-only
    • As the event store is append-only, the only way to update an entity is to create a compensating event.
    • Changing the structure of events is hard as the old events still exist in the store and need to also be handled.

Projections

The Event Store defines how the events are written in the system; it does not define or prescribe how things are read or interpreted. Committed events will be made available to any potential subscribers, which can process the events in any way they require. One common scenario is to update a read model/cache of one or multiple views, also known as a projection or materialized view. As the Event Store is not ideal for querying data, a prepopulated view that reacts to changes is used instead. Dolittle has no built-in support for a specific style of projection, as the requirements for that are out of scope of the platform.

Compensating events

To negate the effect of an Event that has happened, another Event has to occur that reverses the effect. This can be seen in any mature Accounting domain, where the Ledger is an immutable event store or journal. Entries in the ledger cannot be changed. The current balance can be derived at any point by accumulating all the changes (entries) that have been made and summing them up (credits and debits). In the case of mistakes, an explicit correcting action would be made to fix the ledger.

Commit vs Publish

Dolittle doesn’t publish events, rather they are committed. Events are committed to the event log, from which any potential subscribers pick up the event and process it. There is no way to “publish” to a particular subscriber, as all the events are available on the event log, but you can create a Filter that creates a Stream.

Reason for change

By capturing all changes in the form of events and modeling the why of the change (in the form of the event itself), an Event Sourced system keeps as much information as possible.

A common example is an e-commerce site that wants to test a hypothesis:

A user who has an item in their shopping cart but does not proceed to buy it will be more likely to buy this item in the future

In a traditional CRUD system, where only the state of the shopping cart (or worse, completed orders) is captured, this hypothesis is hard to test. We do not have any knowledge that an item was added to the cart, then removed.

On the other hand, in an Event Sourced system where we have events like ItemAddedToCart and ItemRemovedFromCart, we can look back in time and check exactly how many people had an item in their cart at some point, did not buy it then, but subsequently did. This requires no change to the production system and no time spent waiting to gather sufficient data.

When creating an Event Sourced system we should not assume that we know the business value of all the data that the system generates, or that we always make well-informed decisions for what data to keep and what to discard.

10 - Aggregates

Overview of Aggregates

An Aggregate is a Domain-Driven Design (DDD) term coined by Eric Evans. An aggregate is a collection of objects that represents a concept in your domain; it’s not a container for items. It’s bound together by an Aggregate Root, which upholds the rules (invariants) that keep the aggregate consistent. It encapsulates the domain objects, enforces business rules, and ensures that the aggregate can’t be put into an invalid state.

Example

For example, in the domain of a restaurant, a Kitchen could be an aggregate, where it has domain objects like Chefs, Inventory and Menu and an operation PrepareDish.

The kitchen would make sure that:

  • A Dish has to be on the Menu for it to be ordered
  • The Inventory needs to have enough ingredients to make the Dish
  • The Dish gets assigned to an available Chef

Here’s a simple C#ish example of what this aggregate root could look like:

public class Kitchen
{
    Chefs _chefs;
    Inventory _inventory;
    Menu _menu;

    public void PrepareDish(Dish dish)
    {
        if (!_menu.Contains(dish))
        {
            throw new DishNotOnMenu(dish);
        }
        foreach (var ingredient in dish.Ingredients)
        {
            var foundIngredient = _inventory
                .GetIngredient(ingredient.Name);
            if (foundIngredient is null)
            {
                throw new IngredientNotInInventory(ingredient);
            }

            if (foundIngredient.Amount < ingredient.Amount)
            {
                throw new InventoryOutOfIngredient(foundIngredient);
            }
        }
        var availableChef = _chefs.GetAvailableChef();
        if (availableChef is null)
        {
            throw new NoAvailableChefs();
        }
        availableChef.IsAvailable = false;
    }
}

Aggregates in Dolittle

With Event Sourcing, the aggregates are the key components for enforcing the business rules and the state of domain objects. Dolittle has a concept called AggregateRoot in the Event Store that acts as an aggregate root for the AggregateEvents applied to it. The root holds a reference to all the aggregate events applied to it and it can fetch all of them.
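A hedged sketch of the kitchen as a Dolittle aggregate root in the style of the .NET SDK's aggregates API; the GUID, the base class usage and the simple ingredient rule are illustrative:

[AggregateRoot("01ad9a9f-711f-47a8-8549-43320f782a1e")]
public class Kitchen : AggregateRoot
{
    int _ingredients = 2;

    public Kitchen(EventSourceId eventSource) : base(eventSource) { }

    public void PrepareDish(string dish, string chef)
    {
        // enforce the invariant before committing to the change
        if (_ingredients <= 0)
        {
            throw new InvalidOperationException("The kitchen is out of ingredients");
        }
        Apply(new DishPrepared(dish, chef));
    }

    // updates in-memory state whenever the event is applied or re-applied
    void On(DishPrepared @event) => _ingredients--;
}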

Structure of an AggregateRoot

This is a simplified structure of the main parts of an aggregate root.

AggregateRoot {
    AggregateRootId Guid
    EventSourceId string
    Version int
    AggregateEvents AggregateEvent[] {
        EventSourceId string
        AggregateRootId Guid
        // normal Event properties also included
        ...
    }
}

AggregateRootId

Identifies this specific type of aggregate root. In the kitchen example, this would be a unique id given to the Kitchen class to identify it from other aggregate roots.

EventSourceId

EventSourceId represents the source of the event like a “primary key” in a traditional database. In the kitchen example this would be the unique identifier for each instance of the Kitchen aggregate root.

Version

Version is the position of the next AggregateEvent to be processed. It’s incremented after each AggregateEvent has been applied by the AggregateRoot. This ensures that the root will always apply the events in the correct order.

AggregateEvents

The list holds the reference ids of the actual AggregateEvent instances stored in the Event Log. With this list, the root can ask the Runtime to fetch all of the events with matching EventSourceId and AggregateRootId.

Designing aggregates

When building your aggregates, roots and rules, it is helpful to ask yourself these questions:

  • “What is the impact of breaking this rule?”
  • “What happens in the domain if this rule is broken?”
  • “Am I modelling a domain concern or a technical concern?”
  • “Can this rule be broken for a moment or does it need to be enforced immediately?”
  • “Do these rules and domain objects break together or can they be split into another aggregate?”
