Most of the applications I've written in my career have involved some sort of database for persisted application state. Until recently, that's usually meant using a relational database management system (RDBMS), like SQL Server, paired with some sort of object-relational mapping (ORM) tool, like Entity Framework Core or Dapper. More recently, though, it seems that many teams are switching to lower-ceremony “NoSQL” document databases.

Here's an important definition: A document-oriented database, or document store, is a computer program and data storage system designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data.

Enter the open-source Marten library (https://martendb.io) that allows .NET developers to use the rock-solid Postgresql database engine as a document database and event store. The other authors of Marten and I chose Postgresql specifically because of its unique JSONB storage type, where raw JSON data is stored in an efficient binary representation (see https://www.postgresql.org/docs/current/datatype-json.html for more information). From the .NET side, Marten leverages the robust JSON serialization libraries in .NET, like Newtonsoft.Json or the more recent System.Text.Json library, to effectively read and write objects to and from database storage through JSON serialization.

Leaving Marten's event store functionality aside for another time, let's dive into the document database features once you have Postgresql running locally.

Running Postgresql Locally in Docker

First off, you need a Postgresql database. My preference these days is to just run local development databases in Docker containers so that it's easy to spin up and tear down development databases at will as I switch between codebases. To that end, here's the Docker compose file we use for Marten itself that gives you an empty Postgresql database called marten_testing:

version: '3'
services:
    postgresql:
        image: "ionx/postgres-plv8:12.2"
        ports:
            - "5432:5432"
        environment:
            POSTGRES_PASSWORD: postgres
            POSTGRES_USER: postgres
            POSTGRES_DB: marten_testing
            NAMEDATALEN: 100

As long as you have Docker Desktop installed on your development box, you'll be able to quickly spin up a new Postgresql database by using this command in the command line application of your choice:

docker compose up -d

Note that you'll need to call that with the terminal location at the same directory that holds your docker-compose.yml file. Likewise, when you're done working with that database, you can shut it down and completely remove the running Docker container with:

docker compose down

QuickStart with Marten

Now, assuming that you have a new project where you want to use Marten, add a NuGet reference to the main Marten library:

dotnet add package Marten

In any application targeting Marten, you need a single instance of the DocumentStore class that “knows” how to translate objects back and forth to the underlying database and acts as the main entry point to all Marten-backed persistence. I'll get to more advanced usage shortly, but for right now, you can spin up a new DocumentStore with all of Marten's default behaviors using this syntax:

// Step 1, build a DocumentStore
var connectionString = "your connection string";
using var store = DocumentStore.For(connectionString);

Now that you have a Marten document store ready to go as you try to build a fictional issue tracking system, let's back up and write a document type to represent an Issue and its constituent tasks like this one:

public class Issue
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }
    public bool IsOpen { get; set; }
    public DateTimeOffset Opened { get; set; }
    public IList<IssueTask> Tasks { get; set; } = new List<IssueTask>();
}

In that class, the child IssueTask type is this:

public class IssueTask
{
    public string Title { get; set; }
    public string Description { get; set; }
    public DateTimeOffset? Started { get; set; }
    public DateTimeOffset Finished { get; set; }
}

Now that you have a Marten DocumentStore and your Issue document type, let's write some code to persist an issue:

var issue = new Issue
{
    Title = "Bad Problem",
    IsOpen = true,
    Description = "Need help fast!",
    Opened = DateTimeOffset.UtcNow,
    Tasks = { new() { Title = "Investigate", Description = "Do some troubleshooting" } }
};
// start a new IDocumentSession
using var session = store.LightweightSession();
session.Store(issue);
await session.SaveChangesAsync().ConfigureAwait(false);

Let's talk about that code above:

  • I built a new Issue object with a title and description, plus marked it as being open. I also added an initial task within the Issue.
  • I created a new IDocumentSession object (session in the code above) that you'll use both to query a Marten database and to persist changes. The IDocumentSession implements the unit of work pattern to govern logical transaction boundaries and also represents a single connection to the underlying database, so it's important to dispose of the session to release that connection when you're done with it.
  • I explicitly told Marten that the new Issue document should be persisted as an “upsert” operation (see the short sketch after this list).
  • I committed the one pending document change with the call to SaveChangesAsync().
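
Because Store() is an “upsert,” the same call also persists changes to a document that already exists in the database. Here's a minimal follow-up sketch that reuses the session and issue variables from the code above:

// Modify the already persisted document...
issue.IsOpen = false;

// ...and the same Store() + SaveChangesAsync() combination updates
// the existing record instead of inserting a duplicate
session.Store(issue);
await session.SaveChangesAsync().ConfigureAwait(false);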

What may be more interesting is what I did not have to do in any of the code above.

I didn't have to write any explicit mapping of the Issue document type to any kind of Postgresql table structure. Marten stores the document data as serialized JSON, so there isn't a lot of code-intensive mapping configuration like you'd frequently hit with Object Relational Mappers like Entity Framework Core or the older NHibernate.

There was no need to first perform any kind of database schema migration to set up the underlying Postgresql database schema. Using Marten's default “development friendly” configuration that you used to construct the DocumentStore up above, Marten quietly builds the necessary database tables and functions to store Issue documents behind the scenes the first time you try to write or read them. As a developer, you can focus on just writing functionality and let Marten deal with the grunt work of building and modifying database schema objects. Again, compare that experience with Marten to the effort you have to make with Object Relational Mapper tools to craft database migration scripts.
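
If you want tighter control over that behavior outside of local development, Marten exposes a knob for how aggressively it builds or changes schema objects. Here's a hedged sketch, assuming the AutoCreateSchemaObjects setting and AutoCreate enumeration available in recent Marten versions:

var connectionString = "your connection string";
using var store = DocumentStore.For(opts =>
{
    opts.Connection(connectionString);

    // Assumption: AutoCreateSchemaObjects/AutoCreate exist in your
    // Marten version. CreateOrUpdate lets Marten add missing tables
    // and columns, but never drops existing database objects.
    opts.AutoCreateSchemaObjects = AutoCreate.CreateOrUpdate;
});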

Nowhere in the code did I have to assign an identity (primary key) to the new Issue document. Marten's default assumption is that a public property (or field) named Id is the identity for a document type. Because Issue.Id is of type GUID, Marten automatically assigns a sequential GUID for new documents passed into the IDocumentSession.Store() method that don't already have an established identity. In this case, Marten happily sets the value of Id onto the new Issue document in the course of the Store() method.
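
Here's a small sketch of that identity assignment happening as a side effect of Store(), reusing the session from above:

var newIssue = new Issue { Title = "Another Problem" };

// newIssue.Id is Guid.Empty at this point
session.Store(newIssue);

// Marten has now assigned a sequential Guid identity
Console.WriteLine(newIssue.Id);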

To illustrate the identity behavior further, let's immediately turn around and load a new copy of the new Issue document with this code:

// Now let's reload that issue
var issue2 = await session.LoadAsync<Issue>(issue.Id).ConfigureAwait(false);

So far, you've seen nothing that would be difficult to reproduce on your own. After all, you're just saving and loading data by its primary key, right? To show why you're better off using Marten than writing your own little document store, let's move on quickly to see some of Marten's support for querying documents in the database.

In the past when I've described Marten to other developers, they frequently say “but you can't query within the JSON data itself though, right?” Fortunately, Marten has robust support for LINQ querying that happily queries within the stored JSON data. As an example, let's say that you want to query for the last 10 open issues with a LINQ query:

var openIssues = await session
    .Query<Issue>()
    .Where(x => x.IsOpen)
    .OrderByDescending(x => x.Opened)
    .Take(10)
    .ToListAsync().ConfigureAwait(false);

As I'll show later in this article, it's not only possible to query from within the structured JSON data, but you can also add computed indexes in Marten that work within the stored JSON data.

Admittedly, Marten's LINQ support is short of what you may be used to with Entity Framework Core or the older NHibernate tooling, but all the most common operators and usages of Where() clauses are well supported. Marten also has some specific extensions for LINQ that many users find useful.
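
As one example, here's a sketch that assumes Marten's IsOneOf() extension method, which matches a member against a whole list of values and lets Marten generate the corresponding SQL filter:

// A hypothetical set of issue identities to look up
var issueIds = new[] { Guid.NewGuid(), Guid.NewGuid() };

// IsOneOf() is a Marten-specific LINQ extension that is translated
// into a SQL filter against the list of values
var matching = await session.Query<Issue>()
    .Where(x => x.Id.IsOneOf(issueIds))
    .ToListAsync().ConfigureAwait(false);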

Relations Between Documents

The sweet spot for document database approaches, like Marten's, is when the entities are largely self-contained with few relationships between different types of entities. Because Marten is built on top of a relational database engine, it still has the ability to enforce relational integrity between entity types.

To illustrate this, let's introduce a new User document type within our issue tracking system:

public class User
{
    public Guid Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Role { get; set; }
}

Now, I'd like all of the Issue documents to refer both to an assigned user and to the original user who created the issue. I'll add a pair of new properties to the Issue document:

public class Issue
{
    public Guid Id { get; set; }

    public Guid? AssigneeId { get; set; }
    public Guid? OriginatorId { get; set; }

    // Other properties
}

To create foreign keys from the Issue document type to the new User document type, I need to revisit the DocumentStore bootstrapping from before and use this code to configure Marten:

var connectionString = "your connection string";
using var store = DocumentStore.For(opts =>
    {
        opts.Connection(connectionString);

        // Set up the foreign key relationships
        opts.Schema.For<Issue>()
            .ForeignKey<User>(x => x.AssigneeId)
            .ForeignKey<User>(x => x.OriginatorId);
    });

The introduction of the new User document type and the foreign key relationships from Issue to User will require changes to the underlying database, but not to worry: Marten detects that and happily makes the necessary schema changes for you the first time you read or write Issue documents.

Foreign key relationships with Marten work exactly as you'd expect if you have any experience with relational databases, as shown in this code:

var issue = new Issue
{
    // reference a non-existent User
    AssigneeId = Guid.NewGuid()
};

session.Store(issue);

// This call will fail!
await session.SaveChangesAsync().ConfigureAwait(false);

Maybe more interesting is Marten's ability to fetch related documents while querying one document type. For example, let's say that you're building a Web service where you'll be making the same query for the 10 most recent open issues, but this time, you also need the related User documents for the people assigned to those issues.

You could use two separate queries, like this:

var openIssues = await session.Query<Issue>()
    .Where(x => x.IsOpen)
    .OrderByDescending(x => x.Opened)
    .Take(10)
    .ToListAsync().ConfigureAwait(false);

// Find the related User documents
var userIds = openIssues
    .Where(x => x.AssigneeId.HasValue)
    .Select(x => x.AssigneeId.Value)
    .Distinct()
    .ToArray();

var users = await session
    .LoadManyAsync<User>(userIds)
    .ConfigureAwait(false);

The general rule of thumb for better performance using Marten is to reduce the number of round trips between the application and database server, so let's use Marten's Include() functionality to fetch the related User documents within the same round trip to the database, like this:

// Marten will fill this dictionary for us
var users = new Dictionary<Guid, User>();

var openIssues = await session.Query<Issue>()
    .Where(x => x.IsOpen)
    .OrderByDescending(x => x.Opened)
    .Take(10)
    // Marten-specific LINQ extension
    .Include(x => x.AssigneeId, users)
    .ToListAsync().ConfigureAwait(false);

In the query above, Marten stores the related User documents in the users dictionary, keyed by User.Id.
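
To show what you end up with, here's a short usage sketch that walks the loaded issues and looks up each assignee from that pre-loaded dictionary without any additional database calls:

foreach (var openIssue in openIssues)
{
    if (openIssue.AssigneeId.HasValue &&
        users.TryGetValue(openIssue.AssigneeId.Value, out var assignee))
    {
        Console.WriteLine(
            $"{openIssue.Title} is assigned to {assignee.FirstName} {assignee.LastName}");
    }
}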

As an aside, the Include() operator is specific to Marten (other .NET tools have similar capabilities, and Marten's support was itself inspired by RavenDb's equivalent feature). When using Marten, it's important to consider whether or not any generalized abstraction that you place around Marten to avoid vendor lock-in may eliminate the ability to use the very advanced features of Marten that will make your system perform well.

Unit of Work Transactions with Marten

As stated earlier, the Marten IDocumentSession is an implementation of the unit of work pattern. In Martin Fowler's original definition, a unit of work “maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems.”

Let's jump right into a contrived example that shows an IDocumentSession variable named session deleting a User and some Issue documents, storing changes to other User documents, and storing a brand-new Issue, all committed in a single database transaction:

session.Delete<User>(oldUserId);
session.DeleteWhere<Issue>(
    x => x.OriginatorId == fakeUserId);

// store some User documents
session.Store(newAdmin, reporter);

// store a new Issue
session.Store(new Issue
{
    Title = "Help!"
});

await session.SaveChangesAsync()
    .ConfigureAwait(false);

Hopefully, that looks very straightforward, but there are a couple of valuable things to note that set Marten apart from some other alternative document databases:

  • Marten is happily able to process updates to multiple types of documents in one transaction.
  • By virtue of being on top of Postgresql, Marten has ACID-compliant transactional integrity where data is always consistent, as opposed to the BASE model of many other true NoSQL databases where there is said to be “eventual consistency” between the data writes and database queries.

The last point is an important differentiator from other document database approaches and arguably the main reason that Marten exists today, as it was specifically written to replace a true, standalone document database with weak data consistency that was performing poorly in a large production system.

Integration with ASP.NET Core

In real usage, you're most likely going to be using Marten within a .NET application that uses the generic host builder. To that end, recent versions of Marten fully embrace an idiomatic .NET approach to configuring Marten, like this sample from a .NET 6 Web application:

var builder = WebApplication.CreateBuilder(args);
builder.Host.ConfigureServices(services =>
{
    var connectionString = builder
        .Configuration
        .GetConnectionString("marten");
    services.AddMarten(opts =>
    {
        opts.Connection(connectionString);
        opts.Schema.For<Issue>()
            .ForeignKey<User>(x => x.AssigneeId)
            .ForeignKey<User>(x => x.OriginatorId);
    });
});
// Other configuration

The call to AddMarten() above adds service registrations to the application's Dependency Injection container for:

  • IDocumentStore as a singleton
  • IDocumentSession as “scoped,” so you can expect a unique session for each HTTP request
  • IQuerySession (a read-only subset of IDocumentSession) as “scoped” (see the short sketch after this list)
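
As a quick sketch of those scoped registrations in action, here's a hypothetical “/issues/{id}” endpoint in a .NET 6 minimal API that takes the IQuerySession straight from the container:

// "app" is the WebApplication built from the builder shown above.
// ASP.NET Core resolves the scoped IQuerySession for each request.
app.MapGet("/issues/{id:guid}", async (Guid id, IQuerySession session) =>
{
    var issue = await session.LoadAsync<Issue>(id);
    return issue is null ? Results.NotFound() : Results.Ok(issue);
});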

To show that integration, let's say that you want to create a simple Web service endpoint to create a new Issue with this input body:

public class NewIssue
{
    public Guid UserId { get; set; }
    public string Title { get; set; }
    public string Description { get; set; }
}

And this controller code:

public class CreateIssueController : ControllerBase
{
    [HttpPost("/issues/new")]
    public Task PostNewIssue(
        [FromBody] NewIssue body,
        [FromServices] IDocumentSession session)
    {
        var issue = new Issue
        {
            Title = body.Title,
            Description = body.Description,
            OriginatorId = body.UserId,
            IsOpen = true,
            Opened = DateTimeOffset.UtcNow
        };

        session.Store(issue);

        return session.SaveChangesAsync();
    }
}

Because the IDocumentSession is registered as “scoped,” I know that ASP.NET Core itself will be responsible for calling Dispose() on the active session for the HTTP request.

Tracing and Logging

Marten is absolutely meant for “grown-up” software development, so we've taken the role of tracing and logging throughout the Marten codebase very seriously. If you bootstrap Marten within a .NET Core application with the AddMarten() method, Marten logs all database calls and database errors through the generic .NET ILogger interface.

That's great and all, but you might ask how to automatically tag documents persisted through Marten with the timestamp, the current user, and the correlation ID or trace identifier of the current activity (in the case of the issue-tracking Web application, that's the trace identifier of the HTTP request). Last modified timestamps are already a default behavior in Marten, so that part is taken care of. To add correlation ID and current user name tracking to the document storage, I'm going to break into the Marten configuration and turn on those metadata fields for all documents like so:

var builder = WebApplication.CreateBuilder(args);
builder.Host.ConfigureServices(services =>
{
    services.AddMarten(opts =>
    {
        // Other configuration
        
        // Turn on extra metadata fields for correlation id
        // and last modified by (user name) tracking
        opts.Policies.ForAllDocuments(m =>
        {
            m.Metadata.CorrelationId.Enabled = true;
            m.Metadata.LastModifiedBy.Enabled = true;
        });

    });
});

Making that configuration change tells Marten that the table for each document type now needs extra columns for tracking the correlation ID and last-modified-by values for each document update. Yet again, Marten “knows” about the extra metadata fields on each document storage table and automatically adds these columns to any existing tables on the first usage of each specific document type.

The actual values will be assigned from the corresponding IDocumentSession.CorrelationId and IDocumentSession.LastModifiedBy values. To tie all of this together and apply the right values for the currently logged in user and correlation identifier of the session, I'm going to create an implementation of the Marten ISessionFactory interface, as shown in Listing 1.

Listing 1: A custom session factory to incorporate tracing

public class TracedSessionFactory : ISessionFactory
{
    private readonly IDocumentStore _store;
    private readonly IHttpContextAccessor _accessor;

    public TracedSessionFactory(IDocumentStore store,
        IHttpContextAccessor accessor)
    {
        _store = store;
        _accessor = accessor;
    }

    public IQuerySession QuerySession()
        => _store.QuerySession();

    public IDocumentSession OpenSession()
    {
        var session = _store.LightweightSession();
        session.CorrelationId = _accessor
            .HttpContext?
            .TraceIdentifier;

        session.LastModifiedBy = _accessor
            .HttpContext?.User?.Identity?.Name;

        return session;
    }
}

Lastly, to make the new ISessionFactory active, I'll register it with the BuildSessionsWith<T>() method chained from AddMarten():

var builder = WebApplication.CreateBuilder(args);
builder.Host.ConfigureServices(services =>
{
    // IHttpContextAccessor needs to be registered so that it can be
    // injected into TracedSessionFactory
    services.AddHttpContextAccessor();

    services.AddMarten(opts =>
    {
        // Marten configuration
    })

    // Register our custom session factory
    .BuildSessionsWith<TracedSessionFactory>();
});

Optimizing ASP.NET Core Performance with Marten

Inside the issue tracking system, you probably have a simple view somewhere that just shows all the open issues for a given user. Behind that feature, let's say that you've got a simple Web service to get a summary of the open issues. In this case, you only care about the issue title and the actual issue ID to help build links on the client side. That gives you this small DTO for the Web service output:

public class IssueView
{
    public string Title { get; set; }
    public Guid IssueId { get; set; }
}

Next, let's author the simplest possible conceptual controller method to implement the new Web service endpoint for open issues by user:

[HttpGet("/issues/open/user/{userId}")]
public async Task<IReadOnlyList<IssueView>> GetOpenIssues(
    Guid userId, [FromServices] IQuerySession session)
{
    var issues = await session.Query<Issue>()
        .Where(x => x.AssigneeId == userId
            && x.IsOpen)
        .OrderBy(x => x.Opened)
        .ToListAsync().ConfigureAwait(false);

    // Transform data
    return issues.Select(x => new IssueView
    {
        Title = x.Title,
        IssueId = x.Id
    }).ToList();
}

Honestly, that's probably good enough for most cases, but let's go through some of the facilities in Marten to potentially make that Web service run faster.

Assuming that the issue tracker is going to be a very successful piece of software helping its users support a problematic set of products, you should assume that the Issue document storage table will grow very, very large. For optimizing the Web service method above, the obvious first place to start is applying some kind of index against the Issue document to make querying on the AssigneeId property faster. In a previous example, you added a foreign key relationship between Issue.AssigneeId and the User document. When you did that, Marten automatically created a Postgresql index against the Issue.AssigneeId property in addition to the foreign key constraint. If you really want to, you can fine-tune that index as shown below:

services.AddMarten(opts =>
{
    // Other Marten configuration
    opts.Schema.For<Issue>()
        // Override the index generated for
        // AssigneeId to use the hash
        // method instead of the
        // default btree
        .ForeignKey<User>(
            x => x.AssigneeId,
            indexConfiguration: idx =>
            {
                idx.Method = IndexMethod.hash;
            })

        .ForeignKey<User>(x => x.OriginatorId);

});

However, if you hadn't already added the foreign key relationship through Marten, you could instead use a computed index, like so:

services.AddMarten(opts =>
{
    // Other Marten configuration
    opts.Schema.For<Issue>()
        // This is a computed index on the
        // Issue.AssigneeId property
        .Index(x => x.AssigneeId);
});

The computed index works against the stored JSONB data in Postgresql and doesn't require any kind of duplicated field in the table structure. At the cost of somewhat slower writes, indexing the AssigneeId property makes the LINQ query against the Issue document storage in the controller code above faster.

Next up, let's eliminate the need to deserialize the Issue document data and do the in-memory mapping to the IssueView structure. You can simply do a LINQ Select() transform like this:

[HttpGet("/issues/open/user/{userId}")]
public Task<IReadOnlyList<IssueView>>
    GetOpenIssues(Guid userId, [FromServices] IQuerySession session)
{
    return session.Query<Issue>()
        .Where(x => x.AssigneeId == userId && x.IsOpen)
        .OrderBy(x => x.Opened)
        .Select(x => new IssueView
        {
            IssueId = x.Id,
            Title = x.Title
        })
        .ToListAsync();
}

It's important to note here that the transformation from Issue to IssueView happens completely within Postgresql itself. That's helping you return a lot less data over the wire between Postgresql and the Web server while also cutting out the intermediate step of deserializing the raw data into the heavier Issue objects.

Now, we .NET developers tend to take LINQ for granted, but after having spent five or six years authoring and helping to support the LINQ provider code within Marten, I can tell you that there's a lot of stuff happening within your average LINQ query:

  • The .NET runtime must build the Expression structure representing the LINQ query.
  • In Marten's case (and this is also true with Entity Framework Core), the Relinq library is used to preprocess the LINQ expression into an intermediate model.
  • A series of custom visitors are sent down the intermediate model and more of the Expression structure to figure out how to build a matching SQL command for the underlying storage engine.
  • There's quite a bit of string manipulation to get to the SQL command.
  • Finally, the SQL command is executed and the results are processed into the form specified in the original LINQ query.

Does reading that list kind of make you a little tired? It does me. The point here is that LINQ querying comes with some significant performance overhead. That being said, I'll argue until I'm blue in the face that LINQ is one of the very best features of .NET and a positive differentiator for .NET versus other platforms.

Fortunately, Marten has a feature we call “compiled queries” that lets you have all the good parts of LINQ without incurring the performance overhead. Let's take the LINQ query above and move that to a compiled query class called OpenIssuesByUser, as shown in Listing 2.

Listing 2: Compiled query usage for open issues by user id

public class OpenIssuesByUser : ICompiledListQuery<Issue, IssueView>
{
    public OpenIssuesByUser(Guid userId)
    {
        UserId = userId;
    }

    public Expression<Func<IMartenQueryable<Issue>, 
        IEnumerable<IssueView>>> QueryIs()
    {
        return q => q
            .Where(x => x.AssigneeId == UserId && x.IsOpen)
            .OrderBy(x => x.Opened)
            .Select(x => new IssueView
            {
                IssueId = x.Id,
                Title = x.Title
            });
    }

    public Guid UserId { get; set; }
}

Moving to the compiled query turns the controller method into this code:

[HttpGet("/issues/open/user/{userId}")]
public Task<IEnumerable<IssueView>> GetOpenIssues(
    Guid userId, [FromServices] IQuerySession session)
    => session.QueryAsync(new OpenIssuesByUser(userId));

With the compiled query, Marten generates and compiles code at runtime that already “knows” exactly how to issue a SQL command to Postgresql for the query and how to turn the results into exactly the form that the original LINQ query defined. In a system with quite a bit of traffic, that last change potentially improves performance and scalability overall by cutting out quite a few object allocations and the CPU-bound overhead of the repetitive LINQ parsing. It's apt to think of the compiled query feature in Marten as “stored procedures for LINQ queries.”

So far, you've made the database run the query more efficiently by applying a database index to the Issue.AssigneeId property you're querying against, you've eliminated some unnecessary serialization and in-memory transformation by using a LINQ Select() transform, and jumped to a compiled query approach that eliminates some of the overhead of LINQ parsing.

There's still one big piece of performance overhead to eliminate. In that method above, Marten is fetching the exact JSON that you need to send to the Web service client, but first deserializing that JSON to an enumerable of IssueView objects. Then ASP.NET Core turns right around and serializes those objects right back to the exact same JSON and sends that data down the HTTP response body. Fear not! Marten has a facility to bypass the unnecessary deserialize/serialize dance.

First though, you need to install the small Marten.AspNetCore NuGet:

dotnet add package Marten.AspNetCore

That NuGet adds a couple of extension methods, including the IQuerySession.WriteArray() method you can use to copy, byte by byte, the JSON data queried from Postgresql right down to the HTTP response body without ever incurring any unnecessary deserialization/serialization overhead or even wasting CPU cycles on creating a JSON string object in memory.

public class OpenIssueController : ControllerBase
{
    [HttpGet("/issues/open/user/{userId}")]
    public Task GetOpenIssues(
        Guid userId,
        [FromServices] IQuerySession session)
        => session.WriteArray(
            new OpenIssuesByUser(userId), HttpContext);
}

Besides trying to be as performant as possible, the WriteArray() method above also takes care of HTTP niceties for you, like setting the content-type and content-length headers with the proper values.

There are far more features in Marten than what I've shown here that can help you wring out more performance in your system, but hopefully this is a good start in showing how robust Marten has become.

Why Marten?

The Marten library has been in production systems since the fall of 2016, but both it and Postgresql have advanced greatly since then. Marten provides all of the developer productivity advantages of a document database but does so without sacrificing transactional integrity. Being just a library on top of the Postgresql database, Marten can be used on all the major cloud hosting options, as well as running locally for developers inside of Docker containers or direct installations. Postgresql itself is a very cost-effective database option with wide community support.

Marten itself has a strong community on GitHub (https://github.com/jasperfx/marten) and the Gitter chat room where you can interact with other users or the Marten core team (https://gitter.im/JasperFx/marten).

Marten started as a document database library with a small, simple event store feature bolted on the side. Fast forward a few years, and Marten is becoming a full-featured event sourcing solution with a document database thrown in, too. In the follow-up article to this one, I'd like to do a deep dive on Marten's “event sourcing in a box” feature set.