On Custom Memory Pools/Allocators

When I was searching online for implementation of a custom memory pool, I mainly found references [1] , [2] . In this article from IBM developerworks about Java performance[3] ,we have an argument against Custom memory allocators . Brian Goetz, the guy who wrote the article in 2005,is one of the authors of “Concurrency in Java” , which is a highly-rated book in the Java-world.

That notwithstanding,Custom memory pool management – or the ideas motivating them- are still very relevant . 🙂

Memory allocation strategy in Folly

Sometime in 2012,Facebook open-sourced folly (their C++ library). While I am not sure how successfully Folly has been adopted in the wild outside Facebook, the documents, and design considerations are sure as hell interesting, and worth reading.

For instance over here, you find a description of allocation strategy for strings of lengths in different length range (replacement for std:: string)

And over here, you find a description of vectors. Vectors, when they reach the current max memory, do a realloc(), and allocate double the memory. Mathematically, they show, a factor of growth of 2 is bad, and 1.5 is better. It is information like this that is worth assimilating.

Additionally, here is another great article in Facebook Engineering that gives pointers to the general area of memory allocation, and memory pool management. This article is a must-read, especially with the links that referred within the article. Closely related to the article mentioned, is the paper detailing the implementation of jemalloc (Facebook hired the guy who wrote jemalloc :Quora thread here)

Memcached -Memory Allocation

A reference point for the importance of slab-based memory allocation can be seen in Memcached. Memcached scaled well with libevent, and the memory allocation strategy it takes is that it does not force the release of memory of items that have been evicted from the cache. Memcached uses slab-based allocation, and here’s a resource to know more about the internals of memcached [8]

Here is another nice resource on memory management on Apple’s Developer site [6], which discusses strategies for allocation of large objects, in shared memory, smaller objects etc.

Custom Memory pools- Different Aspects

1)Why custom memory pool management?

(a) A trade-off between run-time performance, and start-up overhead, and need to deal with memory fragmentation.

(b) Avoiding External Memory fragmentation

(c) For several long-running processes, the bottleneck in the code fast-path ( that 80-20 rule : 80 % time is spent in 20 % code), might be doing frequent allocations/freeing. Also, systems where memory is a scarce resource (Embedded Systems)

(d) Freeing up complex nested structures

(e) Reducing cache misses Chiefly with respect to proximity of allocation and accesses.

2) Can there be a formal/mathematical perspective of custom memory pool management?

Here’s an attempt at formalization.

Let X1,X2,X3 … be in non-decreasing order. Let M represent the total memory managed by the custom memory pool.

Consider the set of linear equations :

C1 *X1 = A
C2*X2 =B
C3*X3 =C
Wx1= Wx1+(x1-y) –> If ‘y’ memory is taken from x1 pool)
Wx2=Wx2+ (x2-y) –> If ‘y’ memory is taken from the pool x2 , etc.
W= Wx1+Wx2+Wx3 …..
A+B+C ……L =M ( M being The Total memory managed by the allocator)

In each of those equations, C1*X1 represents ‘C1’ represents the number of elements in the linked list having ‘X1’ as the maximum memory that could be allocated. Whenever a request for size ‘Y’ comes in, we check the bucket in increasing size, if there is a request that could be accommodated in it’s bucket list. And When memory is released, it goes back to the corresponding bucket. Let the set of all these equations be called as ‘S’ . At any given time we have the system going from { S- Y1} or { S+Y2} { corresponding to application consuming ‘Y1’ memory from the pool, and releasing ‘Y2’ memory to the pool } . Correspondingly, the internal fragmentation resulted is reflected in the Wx1,Wx2 equations.

We can see that in the ideal steady-state, the fragmentation incurred should be minimum.

3) What other information might be gained by having custom memory pool management?

Whenever an allocation is done, we can add trailer information to the chunk of memory ( meta-data, application-dependent) having information such as an ‘identifier’ who allocated it, the time it was allocated.

A related relevant question is how to determine if the custom allocator is actually doing its job. Here’s a stackoverflow thread on the same, leading to sites about benchmarks and performance analysis.

4) What are some real-world examples about custom memory management?

Here’s a blog article [7] , where even the comments contained good information about custom-memory-pool management.

(a) Memory management in Postgres – Uses “memory contexts” to easily free memory , instead of freeing each individual chunk in the allotted pool. Refer /src/backend/utils/mmgr/README in Postgresql Source code available here

(b) In general, if allocation and deallocation have to handle objects of widely varying sizes, then custom-memory-pool can be beneficial.

(c)Here’s a write-up and with an example, explaining the use of custom memory pools in Game Engines

(d) In packet-processing and high-traffic network applications, the strategy of memory pools and memory management can be very important. For instance, about Web Proxies- Here ‘s a forum discussion of the strategy adopted by Ngnix . And here is a relevant article on wiki about Region based memory management .

(e) Here’s the github repository on the hoard memory allocator . The documentation here points to a paper which has been cited 300+ times – an indicator that reading the same is essential for deeper knowledge in the area. 🙂 🙂

The Unfinished Bottom Line

It’s 2014, and we still find a lot of current literature discussing memory usage patterns, object size allocation etc. Custom memory pools might not be more everybody,but for a serious programmer, it is something worth investing some time, and understanding the same.

References:

[1] http://stackoverflow.com/questions/7060684/memory-pools-implementation-in-c/7062057#7062057

[2] http://stackoverflow.com/questions/470683/memory-allocation-deallocation-bottleneck

[3] http://www.ibm.com/developerworks/java/library/j-jtp09275/index.html

[4] https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

[5] http://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919

[6]Apple Developer Documentation

[7]http://yosefk.com/blog/why-custom-allocatorspools-are-hard.html

[8] https://software.intel.com/en-us/articles/enhancing-the-scalability-of-memcached-0