Pooling Buffers for Better Memory Management

Occasionally, you need a more robust solution to solve a problem.

In my last post, I wrote about the horrors of this small code snippet:

public byte[] Serialize(object o)
{
    using (var stream = new MemoryStream())
    {
        MySerializer.Serialize(stream, o);
        return stream.ToArray();
    }
}

One way to alleviate the memory pressure caused by frequently creating and destroying large objects is to tell the .NET garbage collector to compact the Large Object Heap (LOH) using:

GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
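
As a minimal sketch of how that setting is typically applied: the request is honored during the next blocking full collection, which the example below forces with GC.Collect() purely for illustration. The 100,000-byte allocation is an arbitrary stand-in for a large object.

```csharp
using System;
using System.Runtime;

// Allocate (and immediately discard) an array big enough to land on the
// Large Object Heap, so the collector has something to reclaim there.
_ = new byte[100_000];

// Request a one-time LOH compaction on the next blocking full collection.
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;

// Forced here only for demonstration; production code rarely needs to do this.
GC.Collect();

// The runtime resets the mode after the compaction has run.
Console.WriteLine(GCSettings.LargeObjectHeapCompactionMode);
```

Because the mode resets itself, the setting has to be applied again each time you want another compaction.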

However, this solution, while it may reduce the memory footprint of your application, does nothing to really solve the initial problem of allocating all that memory in the first place. As I mentioned in my last post, one way to accomplish that goal is to use a buffer pool.

The Buffer Manager

Instead of writing my own buffer manager, I am going to use the implementation Microsoft provides in Windows Communication Foundation (WCF), called the BufferManager. Now that WCF is open source, you can read the implementation of BufferManager for yourself.

The BufferManager essentially maintains a pool of reusable byte arrays, so the same memory can be handed out again instead of being allocated anew. You can create an instance of a BufferManager by calling the CreateBufferManager() static method as follows:

BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);

The first parameter, maxPoolSize, represents the maximum amount of memory you want the BufferManager to hold in the pool in total. The second parameter, maxBufferSize, represents the maximum amount of memory that can be obtained when requesting an individual buffer from the pool.

To ask for a buffer from the pool, you call the TakeBuffer(n) method passing the size of the buffer you need, up to the size specified as the maxBufferSize. To release the buffer back to the pool, you call the ReturnBuffer(buffer) method and pass back the buffer you previously had taken.

One noteworthy characteristic of the BufferManager is that buffer sizes are rounded up to the next power of 2. That is, if you call TakeBuffer(100000); you will receive a buffer with the size of 2^17 (or 131072) since 2^16 (65536) is smaller than 100,000.
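
To make the arithmetic concrete, here is a small sketch of rounding a request up to the next power of 2. RoundUpToPowerOfTwo is a hypothetical helper written for this post; it illustrates the calculation only and is not BufferManager's actual code.

```csharp
using System;

// Hypothetical helper: returns the smallest power of two that is >= size.
static int RoundUpToPowerOfTwo(int size)
{
    // 2^30 is the largest power of two that fits in a signed 32-bit int.
    if (size <= 0 || size > (1 << 30))
    {
        throw new ArgumentOutOfRangeException(nameof(size));
    }

    int result = 1;
    while (result < size)
    {
        result <<= 1; // double until the request fits
    }
    return result;
}

Console.WriteLine(RoundUpToPowerOfTwo(100000)); // 131072
Console.WriteLine(RoundUpToPowerOfTwo(65536));  // 65536
```

A request that is already an exact power of 2 is left unchanged, which is why asking for 65536 bytes does not jump to the next size up.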

WARNING: Once you return a buffer to a pool, you still have a reference to that memory, so it is advisable to set your buffer to null immediately afterwards to avoid accidentally using it.
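
One way to make the take/return pairing harder to get wrong is to wrap it in try/finally, so the buffer goes back to the pool even if the work in between throws. The sketch below assumes your project references the assembly that provides System.ServiceModel.Channels.BufferManager; the pool sizes are arbitrary values chosen for illustration, not recommendations.

```csharp
using System;
using System.ServiceModel.Channels;

// Illustrative sizes only (assumptions for this sketch).
const long maxPoolSize = 16 * 1024 * 1024; // total bytes the pool may hold
const int maxBufferSize = 1024 * 1024;     // largest single pooled buffer

BufferManager bm = BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);

int takenLength = 0;
var buffer = bm.TakeBuffer(100000);
try
{
    // The array is at least as large as requested, and may be larger.
    takenLength = buffer.Length;
    Console.WriteLine(takenLength);
}
finally
{
    // Hand the buffer back and drop the reference so it cannot be reused by accident.
    bm.ReturnBuffer(buffer);
    buffer = null;
}
```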

BufferManager and MemoryStream Together At Last

It is certainly possible to use the BufferManager and a MemoryStream together and get some relief from repeatedly creating large objects.

For example, to prevent a MemoryStream from allocating its own buffer, you can pass one into its constructor as follows:

var bm = BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);
var buffer = bm.TakeBuffer(131072);

using (var ms = new MemoryStream(buffer))
{
    // Do work.
}

bm.ReturnBuffer(buffer);
buffer = null;

This approach prevents MemoryStream from creating its own internal buffer, sparing you both a new large object and the allocation itself.

Not Everything Comes Up Roses

One of the problems with the BufferManager/MemoryStream approach is the fixed buffer size. Of course, MemoryStream can use a smaller portion of the buffer than it was given, but it will not increase its capacity. In some scenarios, it may be desirable to have the buffer grow if needed, hence the almost ubiquitous use of MemoryStream’s parameterless constructor.
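
A short sketch of what "will not increase capacity" means in practice: writing past the end of a supplied buffer throws rather than growing. The 16-byte array here is just a stand-in for a pooled buffer.

```csharp
using System;
using System.IO;

byte[] buffer = new byte[16]; // stand-in for a buffer taken from a pool
bool overflowRejected = false;

using (var ms = new MemoryStream(buffer))
{
    ms.Write(new byte[16], 0, 16); // exactly fills the supplied buffer

    try
    {
        ms.WriteByte(0xFF); // one byte more than the buffer can hold
    }
    catch (NotSupportedException)
    {
        // A stream wrapped around a caller-supplied array is not expandable.
        overflowRejected = true;
    }
}

Console.WriteLine(overflowRejected);
```

In other words, you have to know an upper bound on the serialized size before you take the buffer.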

Another issue with MemoryStream is the ToArray() method. This method is typically used to return the data in the stream as a byte array. To keep the result separate from the internal buffer, it allocates a new array and copies the stream's contents into it. Unfortunately, this could very well generate another large object. Furthermore, when the stream was constructed over a supplied buffer, you will get a copy of the entire buffer back, even if you only wrote to half of it.
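
The following sketch shows that behavior with a stream built over a supplied array. The 1024-byte buffer and the three bytes written are arbitrary values for illustration.

```csharp
using System;
using System.IO;

byte[] buffer = new byte[1024]; // stand-in for a pooled buffer
byte[] copy;
long lengthSeen;

using (var ms = new MemoryStream(buffer))
{
    ms.Write(new byte[] { 1, 2, 3 }, 0, 3); // only three bytes are written

    lengthSeen = ms.Length; // reports the size of the supplied array
    copy = ms.ToArray();    // allocates a new array and copies into it
}

Console.WriteLine(lengthSeen);                    // 1024
Console.WriteLine(copy.Length);                   // 1024
Console.WriteLine(ReferenceEquals(copy, buffer)); // False
```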

One way to work around this issue would be to not use the ToArray() method. Instead, you could read the data out of the MemoryStream into your own array, obtained from the BufferManager. Unfortunately, this would force you to keep track of how much data you wrote to the stream.

var bm = BufferManager.CreateBufferManager(maxPoolSize, maxBufferSize);
var buffer = bm.TakeBuffer(131072);
byte[] output;

using (var ms = new MemoryStream(buffer))
{
    var bytesWritten = MySerializer.Serialize(ms, o);

    // Do more work.

    // Rewind first: after writing, the position sits at the end of the data.
    ms.Position = 0;

    output = bm.TakeBuffer(bytesWritten);
    ms.Read(output, 0, bytesWritten);
}

bm.ReturnBuffer(buffer);
buffer = null;

// Use 'output' for something.

bm.ReturnBuffer(output);
output = null;

Since BufferManager will give you a buffer sized to a power of 2, the buffer you receive will probably be larger than bytesWritten. Typically, a serializer will not return the number of bytes it serialized the object into, so this may be a problem for you to solve. It really is unfortunate that MemoryStream's Length will not tell you, since it is set to the capacity when a buffer is provided at construction.
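
One possible workaround, sketched below, is to read the stream's Position right after writing. This assumes the serializer writes strictly sequentially from the start of the stream and never seeks; if it does seek, Position will not reflect the total written. The UTF-8 string stands in for real serializer output.

```csharp
using System;
using System.IO;
using System.Text;

byte[] buffer = new byte[1024];                   // stand-in for a pooled buffer
byte[] payload = Encoding.UTF8.GetBytes("hello"); // stand-in for serializer output
int bytesWritten;
long length;

using (var ms = new MemoryStream(buffer))
{
    ms.Write(payload, 0, payload.Length);

    // Position advances with each sequential write; Length stays at the buffer size.
    bytesWritten = (int)ms.Position;
    length = ms.Length;
}

// Only the first bytesWritten bytes of the buffer hold meaningful data.
string roundTrip = Encoding.UTF8.GetString(buffer, 0, bytesWritten);

Console.WriteLine(bytesWritten); // 5
Console.WriteLine(length);       // 1024
Console.WriteLine(roundTrip);    // hello
```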

Finally, there is an issue with BufferManager itself: it does not always zero-out the buffer when you take or return one. It is certainly possible to return a buffer and then request a new one that contains data from the previous use of the buffer. Array.Clear() should be pretty fast, but it’s another step you need to do after taking or before returning a buffer if you feel it is necessary.
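
If you know how many bytes you used, you can limit the cost by clearing only that range rather than the whole pooled buffer. Scrub below is a hypothetical helper name chosen for this sketch; you would call it just before handing the buffer back with ReturnBuffer.

```csharp
using System;

// Hypothetical helper: zero only the portion of the buffer that was written to.
static void Scrub(byte[] buffer, int bytesUsed)
{
    Array.Clear(buffer, 0, bytesUsed);
}

byte[] buffer = new byte[1024]; // stand-in for a pooled buffer
buffer[0] = 42;                 // simulate leftover data from a previous use
buffer[1] = 43;

Scrub(buffer, 2);

bool allZero = Array.TrueForAll(buffer, b => b == 0);
Console.WriteLine(allZero);
```

Clearing before returning, rather than after taking, means stale data never sits in the pool, at the price of doing the work even when the next consumer would have overwritten it anyway.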

It’s a Solution, Not The Only Solution

If it seems more trouble than it’s worth to use the BufferManager with a MemoryStream, you may be right. You can fall back to just having the garbage collector clean up after you, or we can look at other alternatives. In a future post, I’ll discuss using a custom implementation to replace the MemoryStream.
