Speeding up dynamically defined Ruby methods
Jacob Evelyn & Jemma Issroff
We recently open-sourced MemoWise! It’s a Ruby memoization gem, and if you’re curious you can read more about why we created it and some interesting corners of Ruby we discovered along the way. Now let’s talk about how we made it really, really fast.
Benchmark Driven Development
We began by writing benchmark code for our gem so that at any time we could see how performant it was. We decided to explicitly optimize for memoized value retrieval, so our benchmarks only test the speed at which an already-memoized method’s results are returned on subsequent method calls. Because we memoize many different methods in our codebase, we also ensured our benchmarks cover many parameter types and combinations—from def method1 to def method2(a, *b, c:, **d).
These benchmarks allow us to check the performance impact of any change to our gem’s code. This helps drive our development cycle—when we see scenarios where our gem isn’t as fast as we’d like, we can dig in and then test any potential improvements before committing them. For our initial implementation, the benchmarks showed that all types of memoized methods were slower than we wanted, so we first looked at common code in the core of our gem.
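As a rough illustration of what a retrieval-focused benchmark looks like, here is a minimal sketch using Ruby's stdlib Benchmark module — not MemoWise's actual benchmark suite. The Example class and its hand-rolled hash cache are hypothetical stand-ins, and the caches are warmed first so only the already-memoized path is timed:

```ruby
require "benchmark"

# Hypothetical class: memoization is simulated with a plain hash
# cache so this sketch has no gem dependencies.
class Example
  def initialize
    @cache = {}
  end

  def no_args
    @cache.fetch(:no_args) { @cache[:no_args] = 42 }
  end

  def positional(a, b)
    key = [a, b]
    @cache.fetch(key) { @cache[key] = a + b }
  end
end

example = Example.new
example.no_args          # warm the cache first...
example.positional(1, 2)

# ...then time only the already-memoized retrieval path.
Benchmark.bm(12) do |x|
  x.report("no_args")    { 100_000.times { example.no_args } }
  x.report("positional") { 100_000.times { example.positional(1, 2) } }
end
```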
Dynamic method definitions
At the heart of any Ruby memoization gem is a way to dynamically define a new memoized method that caches the results of the original method. Typically this uses define_method and looks something like:
define_method(existing_method_name) do |*args, **kwargs|
  method_cache = @cache.fetch("#{existing_method_name}") do
    @cache["#{existing_method_name}"] = {}
  end
  key = [args, kwargs]
  method_cache.fetch(key) do
    method_cache[key] = super(*args, **kwargs)
  end
end
The block passed to define_method is executed each time the method is called, and Ruby adds overhead that makes blocks slightly less efficient than the equivalent code outside of a block. We expect to call memoized methods many times in performance-sensitive code where these minor block inefficiencies can add up.
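To make the define_method approach concrete, here is a minimal, self-contained sketch of this style of memoization — not MemoWise's actual code. It uses Module#prepend so the generated wrapper can reach the original method via super; the Calculator class and its call counter exist only to show that the original method runs once:

```ruby
class Calculator
  attr_reader :calls

  def initialize
    @cache = {}
    @calls = 0
  end

  def expensive(n)
    @calls += 1
    n * n
  end
end

# Define the memoized wrapper in an anonymous module and prepend it,
# so super inside the wrapper calls the original #expensive.
memo_module = Module.new do
  define_method(:expensive) do |*args, **kwargs|
    key = [args, kwargs]
    @cache.fetch(key) { @cache[key] = super(*args, **kwargs) }
  end
end
Calculator.prepend(memo_module)

calc = Calculator.new
calc.expensive(3) # computes and caches
calc.expensive(3) # served from the cache; super is not called again
```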
Is there a more performant way to dynamically define methods in Ruby? It turns out we can eval strings of raw Ruby code in the context of our original class, and this yields significant speedups over define_method:
eval <<-END_OF_METHOD
  def #{existing_method_name}(*args, **kwargs)
    method_cache = @cache.fetch("#{existing_method_name}") do
      @cache["#{existing_method_name}"] = {}
    end
    key = [args, kwargs]
    method_cache.fetch(key) do
      method_cache[key] = super(*args, **kwargs)
    end
  end
END_OF_METHOD
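Here is a hedged end-to-end sketch of how such an eval-generated method can be wired up: the method string is evaluated into a module that is then prepended to the target class, so super reaches the original method. The memoize_with_eval helper, the Slow class, and the @cache ||= {} initialization are illustrative additions of ours, not MemoWise's API:

```ruby
# Hypothetical helper: generates the memoized wrapper as a string and
# evaluates it into a module prepended to the target class.
def memoize_with_eval(klass, existing_method_name)
  memo_module = Module.new
  memo_module.module_eval <<-END_OF_METHOD
    def #{existing_method_name}(*args, **kwargs)
      @cache ||= {} # assumption: no initializer sets up the cache
      method_cache = @cache.fetch("#{existing_method_name}") do
        @cache["#{existing_method_name}"] = {}
      end
      key = [args, kwargs]
      method_cache.fetch(key) do
        method_cache[key] = super(*args, **kwargs)
      end
    end
  END_OF_METHOD
  klass.prepend(memo_module)
end

class Slow
  attr_reader :calls

  def work(x)
    @calls = (@calls || 0) + 1
    x * 2
  end
end

memoize_with_eval(Slow, :work)
slow = Slow.new
slow.work(5) # computed once...
slow.work(5) # ...then served from the cache
```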
Reducing allocations
The above code works, but it allocates objects it doesn't always need. Consider the following method:
def triple(a)
  a * 3
end
If we memoize triple with the approach above, our memoized version is generated as:
def triple(*args, **kwargs)
  method_cache = @cache.fetch("triple") do
    @cache["triple"] = {}
  end
  key = [args, kwargs]
  method_cache.fetch(key) do
    method_cache[key] = super(*args, **kwargs)
  end
end
Every time we call our memoized triple we allocate an args array, a kwargs hash, and a key array. Each of these allocations is unnecessary! Ideally our generated code would be:
def triple(a)
  method_cache = @cache.fetch("triple") do
    @cache["triple"] = {}
  end
  method_cache.fetch(a) do
    method_cache[a] = super(a)
  end
end
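The payoff of the specialized shape can be checked by counting allocations with GC.stat. This is a sketch with both method shapes written by hand (the Tripler class and allocation counts are illustrative; exact counts vary by Ruby version, so only the relative difference matters):

```ruby
class Tripler
  def initialize
    @cache = {}
  end

  # Generic shape: allocates args, kwargs, and key on every call.
  def triple_splat(*args, **kwargs)
    method_cache = @cache.fetch("triple_splat") { @cache["triple_splat"] = {} }
    key = [args, kwargs]
    method_cache.fetch(key) { method_cache[key] = args.first * 3 }
  end

  # Specialized shape: no intermediate arrays or hashes.
  def triple_exact(a)
    method_cache = @cache.fetch("triple_exact") { @cache["triple_exact"] = {} }
    method_cache.fetch(a) { method_cache[a] = a * 3 }
  end
end

# Count objects allocated while running the given block.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

t = Tripler.new
t.triple_splat(2) # warm both caches so we measure only retrieval
t.triple_exact(2)

splat_allocs = allocations { 1_000.times { t.triple_splat(2) } }
exact_allocs = allocations { 1_000.times { t.triple_exact(2) } }
```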
By introspecting the arguments of the method we’re memoizing, our generated code can avoid unnecessary allocations in a variety of scenarios, such as:
- Not allocating args when there are only keyword arguments
- Not allocating kwargs when there are only positional arguments
- Not allocating key when there's only one argument
- Not using method_cache when there are no arguments at all
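This kind of introspection can be done with Ruby's Method#parameters, which reports each parameter's kind (:req, :opt, :rest, :keyreq, :key, :keyrest). Here is a sketch of how a code generator might branch on those kinds — the Shapes class, cache_key_strategy helper, and strategy names are illustrative, not MemoWise's internals:

```ruby
class Shapes
  def no_args; end
  def positional_only(a, b); end
  def keywords_only(a:, b: 2); end
  def mixed(a, *rest, k:, **opts); end
end

# Pick a cache-key strategy based on the kinds of parameters the
# original method accepts (helper and strategy names are hypothetical).
def cache_key_strategy(klass, method_name)
  kinds = klass.instance_method(method_name).parameters.map(&:first)
  if kinds.empty?
    :no_key_needed        # cache a single value directly
  elsif kinds.all? { |k| [:req, :opt, :rest].include?(k) }
    :positional_args_only # key on args, skip kwargs
  elsif kinds.all? { |k| [:keyreq, :key, :keyrest].include?(k) }
    :keyword_args_only    # key on kwargs, skip args
  else
    :both                 # fall back to [args, kwargs]
  end
end
```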
Our benchmarks showed that these optimizations yield dramatic performance improvements and help make the gem as fast as it is.
Please feel free to try MemoWise out or take a peek at the code, and if you have any other performance suggestions let us know. (We’re happy to accept contributions and we’d love to hear what you think!)