Writing High-Performance Code Using Span<T> and Memory<T> in C#

This article introduces Span<T> and Memory<T>, two types that arrived alongside C# 7.2 and .NET Core 2.1. I'll take a deep dive into both and demonstrate how to work with them in C#.

Prerequisites

To work with the code examples discussed in this article, you need the following installed on your system:

Visual Studio 2022
.NET 6.0
ASP.NET 6.0 Runtime

If you don't already have Visual Studio 2022 installed on your computer, you can download it from here: https://visualstudio.microsoft.com/downloads/.

Types of Memory Supported in .NET

Microsoft .NET enables you to work with three types of memory:

Stack memory: Resides in the Stack and is allocated using the stackalloc keyword
Managed memory: Resides in the heap and is managed by the GC
Unmanaged memory: Resides in the unmanaged heap and is allocated by calling the Marshal.AllocHGlobal or Marshal.AllocCoTaskMem methods

Newly Added Types in .NET Core 2.1

The newly introduced types in .NET Core 2.1 are:

System.Span<T>: This represents a contiguous section of arbitrary memory in a type-safe and memory-safe manner.
System.ReadOnlySpan<T>: This is a type-safe and memory-safe read-only representation of an arbitrary contiguous area of memory.
System.Memory<T>: This represents a contiguous memory area.
System.ReadOnlyMemory<T>: Similar to ReadOnlySpan<T>, this type represents a contiguous section of memory. However, unlike ReadOnlySpan<T>, it's not a ByRef type.

Accessing Contiguous Memory: Span and Memory

You might often need to work with massive volumes of data in your applications. String handling is a common hot spot, because careless use of operations such as Substring creates many short-lived allocations. You can use unsafe code blocks and pointers to manipulate memory directly, but this approach carries considerable risk. Pointer manipulation is prone to bugs such as overflows, null-pointer accesses, buffer overruns, and dangling pointers. In the best case, such a bug crashes your application; in the worst case, it silently corrupts memory or opens a security hole. Enter Span<T> and Memory<T>.

Span<T> and Memory<T> provide a type-safe way to access contiguous regions of arbitrary memory. Both are part of the System namespace and represent a contiguous block of memory, sans any copy semantics. Together with their read-only counterparts, ReadOnlySpan<T> and ReadOnlyMemory<T>, they help you work with memory directly in a safe and performant manner.

These types are intended for high-performance scenarios where you need to process large amounts of data or want to avoid unnecessary memory allocations, such as when working with buffers. They ship in the box with .NET Core 2.1 and later; older targets can get them from the System.Memory NuGet package. Unlike array types that allocate memory on the GC heap, these types provide an abstraction over contiguous regions of arbitrary managed or native memory without allocating on the GC heap.

The Span<T> and Memory<T> structs provide low-level interfaces to an array, string, or any contiguous managed or unmanaged memory block. Their primary function is to foster micro-optimization and write low-allocation code that reduces managed memory allocations, thus decreasing the strain on the garbage collector. They also allow for slicing or dealing with a section of an array, string, or memory block without duplicating the original chunk of memory. Span<T> and Memory<T> are very beneficial in high-performance areas, such as the ASP.NET 6 request-processing pipelines.

An Introduction to Span

Span<T> (earlier known as Slice) is a value type with almost zero overhead, introduced alongside C# 7.2 and .NET Core 2.1. It provides a type-safe way to work with a contiguous block of memory such as:

Arrays and subarrays
Strings and substrings
Unmanaged memory buffers

A Span<T> represents a contiguous chunk of memory that resides in the managed heap, the stack, or even in unmanaged memory. Arrays are always allocated on the managed heap, but a buffer created with stackalloc lives on the stack and doesn't require garbage collection to manage its lifetime. Span<T> can point to memory allocated on either the stack or the heap. However, because Span<T> is defined as a ref struct, the span itself can reside only on the stack.

The following are the characteristics of Span<T> at a glance:

Value type
Low or zero overhead
High performance
Provides memory and type safety

You can use Span<T> with any of the following:

Arrays
Strings
Native buffers

The following can be converted to Span<T>:

Arrays
Pointers
IntPtr
stackalloc

You can convert all of the following to ReadOnlySpan<T>:

Arrays
Pointers
IntPtr
stackalloc
string
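
The conversions listed above can be sketched roughly as follows. This is a minimal illustration, assuming a .NET 6 console project with top-level statements; the variable names are made up for the example.

```csharp
using System;

// Arrays convert implicitly to Span<T> and ReadOnlySpan<T>.
int[] numbers = { 1, 2, 3, 4 };
Span<int> fromArray = numbers;
ReadOnlySpan<int> readOnlyFromArray = numbers;

// A stackalloc expression can be assigned directly to a span (no unsafe code needed).
Span<int> fromStack = stackalloc int[4];
fromArray.CopyTo(fromStack);

// Strings are immutable, so they convert only to ReadOnlySpan<char>.
ReadOnlySpan<char> fromString = "Hello, Span";
ReadOnlySpan<char> viaAsSpan = "Hello, Span".AsSpan(7);

// A Span<T> converts implicitly to a ReadOnlySpan<T>, but not the other way around.
ReadOnlySpan<int> readOnlyView = fromStack;

Console.WriteLine(readOnlyFromArray.Length); // 4
Console.WriteLine(fromString.Length);        // 11
Console.WriteLine(viaAsSpan.ToString());     // Span
Console.WriteLine(readOnlyView[3]);          // 4
```

None of these conversions copies the underlying data, except for the explicit CopyTo call.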

Span<T> is a stack-only type; precisely, it's a ByRef-like type (a ref struct). Thus, spans can't be boxed, can't appear as fields of types that aren't themselves stack-only, and can't be used as generic type arguments. However, you can use spans as return values or method arguments. The code snippet below shows an excerpt from the source code of the Span<T> struct:

public readonly ref struct Span<T> 
{
    internal readonly ByReference<T> _pointer;
    private readonly int _length;
    //Other members
}

You can take a look at the complete source code of the struct Span<T> here: https://github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/System/Span.cs.

The Span<T> source code shows that it basically comprises two read-only fields: a reference to the start of the memory (the _pointer field) and a length (the _length field) denoting the number of elements that the Span contains.

Span<T> may be used in many of the same ways that an array can. However, unlike arrays, it can refer to stack memory, managed memory, and native memory. This provides an easy way for you to take advantage of performance improvements that were previously only available when dealing with unmanaged code.

Here's how Span<T> is declared in the System namespace.

public readonly ref struct Span<T>

To create an empty Span, you can use the Span<T>.Empty property:

Span<char> span = Span<char>.Empty;

The following code snippet shows how you can create a byte array in the managed memory and then create a span instance out of it.

var array = new byte[100];
var span = new Span<byte>(array);

Programming Span in C#

Here's how you can allocate a chunk of memory in the stack and use a Span to point to it:

Span<byte> span = stackalloc byte[100];

The following code snippet shows how you can create a Span using a byte array, store integers inside the byte array, and calculate the sum of all the integers stored.

var array = new byte[100];
var span = new Span<byte>(array);

byte data = 0;
for (int index = 0; index < span.Length; index++)
    span[index] = data++;

int sum = 0;
foreach (int value in array)
    sum += value;

The following code snippet creates a Span from the native memory:

var nativeMemory = Marshal.AllocHGlobal(100);
Span<byte> span;
unsafe
{
    span = new Span<byte>(nativeMemory.ToPointer(), 100);
}

You can now use the following code snippet to store integers inside the memory pointed to by the Span and display the sum of all the integers stored:

byte data = 0;
for (int index = 0; index < span.Length; index++)
    span[index] = data++;

int sum = 0;
foreach (int value in span) 
    sum += value;

Console.WriteLine ($"The sum of the numbers in the array is {sum}");
Marshal.FreeHGlobal(nativeMemory);

You can also allocate a Span in the stack memory using the stackalloc keyword, as shown below:

byte data = 0;
Span<byte> span = stackalloc byte[100];

for (int index = 0; index < span.Length; index++)
    span[index] = data++;

int sum = 0;
foreach (int value in span) 
    sum += value;

Console.WriteLine ($"The sum of the numbers in the array is {sum}");

The snippet that wraps native memory uses a pointer, so remember to enable compilation of unsafe code in your project. To do this, right-click on your project, click Properties, and check the Unsafe code checkbox, as shown in Figure 1.

Figure 1: Turn on unsafe compilation for your project to enable unsafe code.

Span and Arrays

Slicing enables data to be treated as logical chunks that can then be processed with minimal resource overhead. Span<T> can wrap an entire array and, because it supports slicing, you can make it point to any contiguous region within the array. The following code snippet shows how you can use a Span<T> to point to a slice of three elements within the array.

int[] array = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 } ;
Span<int> slice = new Span<int>(array, 2, 3);

There are two overloads of the Slice method available as part of the Span<T> struct, allowing slices to be created based on an index. This allows the Span<T> data to be treated as a series of logical chunks that can be processed individually or as desired by sections of a data processing pipeline.

Because the slice is a view over the array rather than a copy, iterating it reads the original elements directly:

foreach (int i in slice)
    Console.WriteLine($"{i} ");

When you execute the preceding code snippet, the integers present in the sliced array will be displayed at the console, as shown in Figure 2.

Figure 2: Integers present in the sliced array displayed at the console window
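
The two Slice overloads mentioned earlier in this section can be sketched as follows. This is a small illustration with made-up variable names; it also shows that writing through a slice changes the underlying array.

```csharp
using System;

int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
Span<int> whole = array;

// Slice(start): everything from index 6 to the end.
Span<int> tail = whole.Slice(6);

// Slice(start, length): three elements starting at index 2.
Span<int> middle = whole.Slice(2, 3);

// Slices are views, so this write lands in the original array.
middle[0] = 30;

Console.WriteLine(array[2]);                            // 30
Console.WriteLine(tail.Length);                         // 3
Console.WriteLine(string.Join(", ", middle.ToArray())); // 30, 4, 5
```

ToArray is used here only to print the slice; it copies the elements, so avoid it on hot paths.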

Span and ReadOnlySpan

A ReadOnlySpan<T> instance is often used to refer to array items or a chunk of an array. As opposed to arrays, ReadOnlySpan<T> instances can refer to native memory, managed memory, or stack memory. Both Span<T> and ReadOnlySpan<T> provide a type-safe representation of a contiguous region of memory. Whereas Span<T> provides read-write access to a region of memory, ReadOnlySpan<T> provides read-only access to a memory segment.

The following code snippet illustrates how you can use ReadOnlySpan to slice a portion of a string in C#:

ReadOnlySpan<char> readOnlySpan = "This is a sample data for testing purposes.";
int index = readOnlySpan.IndexOf(' ');
var data = ((index < 0) ?
    readOnlySpan : readOnlySpan.Slice(0, index)).ToArray();
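
Slicing a ReadOnlySpan<char> pays off most when the slices are consumed directly instead of being turned back into strings or arrays. The following sketch, which uses made-up input data, sums a comma-separated list of integers without creating any intermediate strings. It relies on the int.Parse overload that accepts a ReadOnlySpan<char>.

```csharp
using System;
using System.Globalization;

ReadOnlySpan<char> input = "10,20,30,40";
int total = 0;

while (!input.IsEmpty)
{
    int comma = input.IndexOf(',');

    // The token is a view into the original string; nothing is copied.
    ReadOnlySpan<char> token = comma < 0 ? input : input.Slice(0, comma);
    total += int.Parse(token, NumberStyles.Integer, CultureInfo.InvariantCulture);

    // Move past the token and its separator, or finish if this was the last token.
    input = comma < 0 ? ReadOnlySpan<char>.Empty : input.Slice(comma + 1);
}

Console.WriteLine(total); // 100
```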

An Introduction to Memory

Memory<T> is a value type (a struct) that represents a contiguous region of memory and has a length, but doesn't necessarily start at index 0 and can be one of many regions inside another Memory<T>. The memory represented by a Memory<T> might not even be managed memory, as it could have been allocated in unmanaged code. A Memory<T> always describes a single contiguous buffer; to treat several separate buffers as one logical sequence, use ReadOnlySequence<T>, discussed later in this article.

Conceptually, you can think of Memory<T> as being defined along the following lines. This is a simplified sketch, not the actual source code:

public struct Memory<T> 
{
    void* _ptr;
    T[]   _array;
    int   _offset;
    int   _length;

    public Span<T> Span => _ptr == null ? new Span<T>(_array, _offset, _length) : new Span<T>(_ptr, _length);
}

Like Span<T>, Memory<T> provides a safe and sliceable view into any contiguous buffer, whether an array or a string. Unlike Span<T>, it has no stack-only constraints because it's not a ref-like type. As a result, you can place it on the heap, use it in collections or with async-await, save it as a field, or box it, just as you would any other C# struct.

The Span property gives you efficient indexing when you need to modify or process the buffer referenced by a Memory<T>. Memory<T> itself is a more general-purpose, higher-level exchange type than Span<T>, and it has a read-only counterpart named ReadOnlyMemory<T>.
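
As a rough sketch of how slicing a Memory<T> and reading its Span property fit together, consider the following. The variable names are made up for the example.

```csharp
using System;

int[] array = { 1, 2, 3, 4, 5, 6 };

Memory<int> memory = array;               // implicit conversion from T[]
Memory<int> window = memory.Slice(2, 3);  // a view over 3, 4, 5

// Obtain a span only at the point where the elements are actually processed.
Span<int> span = window.Span;
for (int i = 0; i < span.Length; i++)
    span[i] *= 10;

// Memory<T> converts implicitly to its read-only counterpart.
ReadOnlyMemory<int> readOnly = window;

Console.WriteLine(string.Join(", ", array)); // 1, 2, 30, 40, 50, 6
Console.WriteLine(readOnly.Length);          // 3
```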

Although both Span<T> and Memory<T> represent a contiguous chunk of memory, unlike Span<T>, Memory<T> is not a ref struct. So, contrary to Span<T>, you can have Memory<T> anywhere on the managed heap. Hence, you don't have the same restrictions in Memory<T> as you do in Span<T>. And you can use Memory<T> as a class field, and across await and yield boundaries.
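
One way to picture the await restriction is the pattern below: an async method accepts a Memory<byte>, and a synchronous helper does the span-based work. This is a minimal sketch; ReadAndSumAsync and Sum are hypothetical helpers, and a MemoryStream stands in for a real I/O source.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Synchronous helper: spans are fine here because no await is involved.
static int Sum(ReadOnlySpan<byte> bytes)
{
    int sum = 0;
    foreach (byte b in bytes)
        sum += b;
    return sum;
}

// Memory<byte> can be a parameter of an async method; Span<byte> cannot.
static async Task<int> ReadAndSumAsync(Stream input, Memory<byte> buffer)
{
    int read = await input.ReadAsync(buffer);

    // After the await, take a span over the filled part and hand it to the helper.
    return Sum(buffer.Span.Slice(0, read));
}

using var source = new MemoryStream(new byte[] { 1, 2, 3, 4 });
int result = await ReadAndSumAsync(source, new byte[16]);

Console.WriteLine(result); // 10
```

The design choice is to keep Memory<T> at the asynchronous boundaries and to drop down to Span<T> only inside short synchronous sections.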

ReadOnlyMemory

Similar to ReadOnlySpan<T>, ReadOnlyMemory<T> represents read-only access to a contiguous region of memory but, unlike ReadOnlySpan<T>, it isn't a ByRef type.

Now refer to the following string that contains country names separated by space characters.

string countries = "India Belgium Australia USA UK Netherlands";

The ExtractStrings method extracts each of the country names as shown below:

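public static IEnumerable<ReadOnlyMemory<char>> ExtractStrings(ReadOnlyMemory<char> c)
{
    int index = 0, length = c.Length;

    // Iterate one position past the end so the final token is returned too.
    for (int i = 0; i <= length; i++)
    {
        // Check the end-of-input condition first to avoid indexing out of range.
        if (i == length || char.IsWhiteSpace(c.Span[i]))
        {
            yield return c.Slice(index, i - index);
            index = i + 1;
        }
    }
}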

You can call the above method and display the country names at the console window using the following code snippet:

var data = ExtractStrings(countries.AsMemory());
foreach(var str in data)
    Console.WriteLine(str);

Advantages of Span and Memory

The main advantage of using the Span and Memory types is improved performance. You can allocate memory on the stack by using the stackalloc keyword, which reserves a block of the requested size and, when assigned to a Span<T>, requires no unsafe code. Stack allocation suits small, short-lived buffers: the memory is released automatically when the method returns and never involves the garbage collector, but the stack is limited in size, so large buffers belong on the heap. If you're using heap-allocated arrays, you can pass them through a method like Slice() and create views without copying any data.
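
A common way to combine both options is to use the stack for small buffers and fall back to the heap for large ones. The sketch below illustrates the idea; IsPalindrome is a hypothetical helper, and the StackLimit threshold is an assumption chosen for the example rather than a recommended value.

```csharp
using System;

static bool IsPalindrome(ReadOnlySpan<char> text)
{
    const int StackLimit = 256; // assumed threshold, for illustration only

    // Small inputs get a stack buffer; larger ones fall back to a heap array.
    Span<char> buffer = text.Length <= StackLimit
        ? stackalloc char[text.Length]
        : new char[text.Length];

    // Copy a lower-cased version of the input into the scratch buffer.
    text.ToLowerInvariant(buffer);

    for (int i = 0, j = buffer.Length - 1; i < j; i++, j--)
    {
        if (buffer[i] != buffer[j])
            return false;
    }
    return true;
}

Console.WriteLine(IsPalindrome("Racecar")); // True
Console.WriteLine(IsPalindrome("Span"));    // False
```

Because both branches produce a Span<char>, the rest of the method doesn't need to know where the buffer lives.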

Here are some more advantages:

They reduce the number of allocations for the garbage collector. They also reduce the number of copies of the data and provide a more efficient way to work with multiple buffers at once.
They allow you to write high performance code. For example, if you have a large chunk of memory that you need to divide into smaller pieces, use Span as a view of the original memory. This allows your app to directly access the bytes from the original buffer without making copies.
They allow you to directly access memory without copying it. This can be particularly useful when working with native libraries or interop with other languages.
They let the JIT compiler eliminate redundant bounds checks in tight loops where performance is critical (such as cryptography or network packet inspection), while still keeping every access memory-safe.
They avoid the boxing and unboxing costs you'd incur with non-generic collections, such as ArrayList.
They enable writing code that is easier to understand by using a single data type (Span) rather than two different types (Array and ArraySegment).

Contiguous and Non-Contiguous Memory Buffers

A contiguous memory buffer is a block of memory that holds the data in sequentially adjacent locations. In other words, all of the bytes are next to each other in memory. An array represents a contiguous memory buffer. For example:

int[] values = new int[5];

The five integers in the above example will be placed in five sequential locations in memory starting with the first element (values[0]).

In contrast, a non-contiguous buffer consists of multiple blocks of data that aren't located next to one another in memory. Span<T> and Memory<T> always represent a single contiguous block; non-contiguous data is handled by other types that stitch several contiguous blocks together.

A non-contiguous buffer offers no guarantee that its elements are stored close together in memory. Non-contiguous buffers, such as ReadOnlySequence<T> (when used with multiple segments), reside in separate areas of memory that may be scattered across the heap and cannot be accessed through a single pointer.

Similarly, an IEnumerable<T> makes no promise of contiguity, because there's no way to know where the next item lives until you enumerate to it. To represent the gaps between segments, a non-contiguous buffer must carry additional data that tracks where each segment starts and ends.

Discontiguous Buffers: ReadOnlySequence

Let's assume that you're working with a buffer that is not contiguous. For example, the data might be coming from a network stream, database call, or file stream. Each of these scenarios can have multiple buffers of varying sizes. A single ReadOnlySequence instance can contain one or more segments of memory and each segment can have its own Memory instance. Therefore, a single ReadOnlySequence instance allows for better management of available memory and provides better performance than many concatenated Memory instances.

You create a ReadOnlySequence<T> instance by using one of its constructors. The overloads accept an array (T[]), a range within an array (the array plus a start index and a length), a ReadOnlyMemory<T>, or a linked list of ReadOnlySequenceSegment<T> instances described by its start segment, start index, end segment, and end index. The first three overloads produce a single-segment sequence; the last one produces a sequence that spans multiple buffers.

You now know that Span<T> and Memory<T> provide support for contiguous memory buffers such as arrays. The System.Buffers namespace contains a struct called ReadOnlySequence<T> that provides support for working with discontiguous memory buffers. The following code snippet illustrates how you can work with ReadOnlySequence<T> in C#:

int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var readOnlySequence = new ReadOnlySequence<int>(array);
var slicedReadOnlySequence = readOnlySequence.Slice(1, 5);

You can also use ReadOnlyMemory<T>, as shown below:

int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
ReadOnlyMemory<int> memory = array;
var readOnlySequence = new ReadOnlySequence<int>(memory);
var slicedReadOnlySequence = readOnlySequence.Slice(1,  5);
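
Both snippets above wrap a single buffer. To build a sequence over several separate buffers, you need a linked list of segments. The following is a minimal sketch under stated assumptions: the Segment<T> class and its Append method are hypothetical helpers written for this example, needed because ReadOnlySequenceSegment<T> is abstract.

```csharp
using System;
using System.Buffers;

// Link two separate arrays into one logical sequence.
var first = new Segment<int>(new[] { 1, 2, 3 });
var last = first.Append(new[] { 4, 5, 6 });

var sequence = new ReadOnlySequence<int>(first, 0, last, last.Memory.Length);

int sum = 0;

// Enumerating a sequence yields one ReadOnlyMemory<T> per segment.
foreach (ReadOnlyMemory<int> chunk in sequence)
{
    foreach (int value in chunk.Span)
        sum += value;
}

Console.WriteLine(sequence.IsSingleSegment); // False
Console.WriteLine(sequence.Length);          // 6
Console.WriteLine(sum);                      // 21

// Hypothetical helper: a small subclass that sets Memory, Next, and RunningIndex.
sealed class Segment<T> : ReadOnlySequenceSegment<T>
{
    public Segment(ReadOnlyMemory<T> memory) => Memory = memory;

    public Segment<T> Append(ReadOnlyMemory<T> memory)
    {
        var next = new Segment<T>(memory)
        {
            // RunningIndex is the total length of all preceding segments.
            RunningIndex = RunningIndex + Memory.Length
        };
        Next = next;
        return next;
    }
}
```

Neither array is copied; the sequence only records where each segment starts and ends.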

A Real-Life Example

Let's now talk about a real-life problem and how Span<T> and Memory<T> can help. Consider the following array of strings that contains log data retrieved from a log file:

string[] logs = new string[]
{
    "a1K3vlCTZE6GAtNYNAi5Vg::05/12/2022 09:10:00 AM::http://localhost:2923/api/customers/getallcustomers",
    "mpO58LssO0uf8Ced1WtAvA::05/12/2022 09:15:00 AM::http://localhost:2923/api/products/getallproducts",
    "2KW1SfJOMkShcdeO54t1TA::05/12/2022 10:25:00 AM::http://localhost:2923/api/orders/getallorders",
    "x5LmCTwMH0isd1wiA8gxIw::05/12/2022 11:05:00 AM::http://localhost:2923/api/orders/getallorders",
    "7IftPSBfCESNh4LD9yI6aw::05/12/2022 11:40:00 AM::http://localhost:2923/api/products/getallproducts"
};

Remember, you can have millions of log records, so performance is critical. This example is just a small extract from a massive volume of log data. Each row comprises the HTTP request ID, the date and time of the HTTP request, and the endpoint URL, separated by "::". Now suppose you need to extract the request ID and the endpoint URL from this data.

You need a high-performance solution. If you use the Substring method of the String class, a new string object is created for every extracted value, which increases pressure on the garbage collector and degrades the performance of your application. A better solution is to use Span<T> and its Slice method, which avoid these allocations. The next section benchmarks Slice against Substring.
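
As a rough sketch of the span-based approach for the log data, the request ID and the endpoint URL can be sliced out of each line as shown below. TryParse is a hypothetical helper written for this example, and only two of the log lines are repeated to keep it short.

```csharp
using System;

// Splits a log line of the form "requestId::timestamp::url" without allocating.
static bool TryParse(ReadOnlySpan<char> line,
    out ReadOnlySpan<char> requestId, out ReadOnlySpan<char> url)
{
    ReadOnlySpan<char> separator = "::";
    int firstSeparator = line.IndexOf(separator);
    int lastSeparator = line.LastIndexOf(separator);

    // Reject lines that don't contain two distinct separators.
    if (firstSeparator < 0 || lastSeparator == firstSeparator)
    {
        requestId = default;
        url = default;
        return false;
    }

    requestId = line.Slice(0, firstSeparator);
    url = line.Slice(lastSeparator + separator.Length);
    return true;
}

string[] logs =
{
    "a1K3vlCTZE6GAtNYNAi5Vg::05/12/2022 09:10:00 AM::http://localhost:2923/api/customers/getallcustomers",
    "mpO58LssO0uf8Ced1WtAvA::05/12/2022 09:15:00 AM::http://localhost:2923/api/products/getallproducts"
};

foreach (string log in logs)
{
    if (TryParse(log, out ReadOnlySpan<char> id, out ReadOnlySpan<char> endpoint))
    {
        // TextWriter accepts spans directly, so no intermediate strings are created.
        Console.Out.Write(id);
        Console.Out.Write(" -> ");
        Console.Out.WriteLine(endpoint);
    }
}
```

Both output spans are views into the original log string, so they stay valid only as long as that string is in use.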

Benchmarking Performance

It's time for some measurements. Let's now benchmark the performance of Span<T> struct versus the Substring method of the String class.

Create a New Console Application Project in Visual Studio 2022

Let's create a console application project that you'll use for benchmarking performance. You can create a project in Visual Studio 2022 in several ways. When you launch Visual Studio 2022, you'll see the Start window. You can choose Continue without code to launch the main screen of the Visual Studio 2022 IDE.

To create a new Console Application Project in Visual Studio 2022:

Start the Visual Studio 2022 IDE and click Create a new project in the Start window.
In the Create a new project window, select Console App, and click Next to move on.
Specify the project name as HighPerformanceCodeDemo and the path where it should be created in the Configure your new project window.
If you want the solution file and project to be created in the same directory, you can optionally check the Place solution and project in the same directory checkbox. Click Next to move on.
In the next screen, specify the target framework you would like to use for your console application.
Click Create to complete the process.

You'll use this application in the subsequent sections of this article.

Install NuGet Package(s)

So far so good. The next step is to install the necessary NuGet package. To install it into your project, right-click on the solution and then select Manage NuGet Packages for Solution…. Now search for the package named BenchmarkDotNet in the search box and install it. Alternatively, you can type the command shown below at the NuGet Package Manager Command Prompt:

PM> Install-Package BenchmarkDotNet

Benchmarking Span Performance

Let's now examine how to benchmark the performance of Substring and Slice methods. Create a new class named BenchmarkPerformance with the code in Listing 1. You should note how data has been set up in the GlobalSetup method and the usage of the GlobalSetup attribute.

Listing 1: Setting up the benchmark data

[MemoryDiagnoser]
[Orderer(BenchmarkDotNet.Order.SummaryOrderPolicy.FastestToSlowest)]
[RankColumn]

public class BenchmarkPerformance
{
    [Params(100, 200)]
    public int N;

    string countries = null;
    int index, numberOfCharactersToExtract;

    [GlobalSetup]
    public void GlobalSetup()
    {
        countries = "India, USA, UK, Australia, Netherlands, Belgium";
        index = countries.LastIndexOf(",",StringComparison.Ordinal);
        numberOfCharactersToExtract = countries.Length - index;
    }
}

Now, write the two methods named Substring and Span, as shown in Listing 2. The former extracts the last country name using the Substring method of the String class, whereas the latter extracts it using the Slice method.

Listing 2: The Substring and Span methods

[Benchmark]
public void Substring()
{
    for(int i = 0; i < N; i++)
    {
        var data = countries.Substring(index + 1, numberOfCharactersToExtract - 1);
    }
}

[Benchmark(Baseline = true)]
public void Span()
{
    for(int i=0; i < N; i++)
    {
       var data = countries.AsSpan().Slice(index + 1, numberOfCharactersToExtract - 1);
    }
}

The complete source code of the BenchmarkPerformance class is provided for your reference in Listing 3.

Listing 3: The complete source code

using BenchmarkDotNet.Attributes;

namespace HighPerformanceCodeDemo;

[MemoryDiagnoser]
[Orderer(BenchmarkDotNet.Order.SummaryOrderPolicy.FastestToSlowest)]
[RankColumn]
public class BenchmarkPerformance
{
    [Params(100, 200)]
    public int N;

    string countries = null;
    int index, numberOfCharactersToExtract;

    [GlobalSetup]
    public void GlobalSetup()
    {
        countries = "India, USA, UK, Australia, Netherlands, Belgium";
        index = countries.LastIndexOf(",",StringComparison.Ordinal);
        numberOfCharactersToExtract = countries.Length - index;
    }

    [Benchmark]
    public void Substring()
    {
        for(int i = 0; i < N; i++)
        {
            var data = countries.Substring(index + 1, numberOfCharactersToExtract - 1);
        }
    }

    [Benchmark(Baseline = true)]
    public void Span()
    {
        for(int i=0; i < N; i++)
        {
            var data = countries.AsSpan().Slice(index + 1, numberOfCharactersToExtract - 1);
        }
    }
}

Executing the Benchmarks

Write the following piece of code in the Program.cs file to run the benchmarks:

using BenchmarkDotNet.Running;
using HighPerformanceCodeDemo;

class Program
{
    static void Main(string[] args)
    {
        BenchmarkRunner.Run<BenchmarkPerformance>();
    }
}

To execute the benchmarks, set the compile mode of the project to Release and run the following command in the same folder where your project file resides:

dotnet run --project HighPerformanceCodeDemo.csproj -c Release

Figure 3 shows the result of the execution of the benchmarks.

Figure 3: Benchmarking Span (Slice) vs Substring performance

Interpreting the Benchmarking Results

As you can see in Figure 3, there's absolutely no allocation when you're using the Slice method to extract the string. For each of the benchmarked methods, a row of the result data is generated. Because there are two benchmark methods, there are two rows of benchmark result data. The benchmark results show the mean execution time, Gen0 collections, and the allocated memory. As is evident from the benchmark results, Span is more than 7.5 times faster than the Substring method.

Limitations of Span

Span<T> is stack-only, which means a span itself can never be stored on the heap, even though it can point to heap memory. This makes it unsuitable for code that must hold on to a buffer across asynchronous calls. It doesn't support boxing, which prevents promotion to the managed heap. You can't use Span<T> as a generic type argument, but you can use it as a field type in a ref struct. You can't assign Span<T> to variables of type dynamic, object, or any interface type. You can't use Span<T> as a field in a reference type, nor can you use it across await and yield boundaries. Additionally, because Span<T> doesn't implement IEnumerable<T>, you can't use LINQ with it.

Likewise, you can't create an array of Span<T>. Note that Memory<T> doesn't implement IEnumerable<T> either, so LINQ operations aren't directly available on it. However, you can take advantage of SpanLinq or NetFabric.Hyperlinq to get around this limitation.
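
If you'd rather not add a third-party package, the base class library offers two simple alternatives, sketched below with made-up data. You can iterate the span directly, or you can expose a ReadOnlyMemory<T> as an IEnumerable<T> by calling MemoryMarshal.ToEnumerable. The second option gives up the allocation-free guarantee in exchange for LINQ support.

```csharp
using System;
using System.Linq;
using System.Runtime.InteropServices;

int[] array = { 1, 2, 3, 4, 5, 6 };
ReadOnlyMemory<int> memory = array.AsMemory(1, 4); // a view over 2, 3, 4, 5

// Option 1: iterate the span directly; no LINQ and no allocation.
int total = 0;
foreach (int value in memory.Span)
    total += value;

// Option 2: expose the memory as IEnumerable<T> so that LINQ operators apply.
int evenCount = MemoryMarshal.ToEnumerable(memory).Count(v => v % 2 == 0);

Console.WriteLine(total);     // 14
Console.WriteLine(evenCount); // 2
```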

Conclusion

In this article, I examined the features and benefits of Span<T> and Memory<T> and how you can use them in your applications. I also discussed a real-life scenario where Span<T> can be used to improve string handling performance. Note that Span<T> is faster than Memory<T> but comes with more restrictions, so it isn't a complete replacement for it.