Is Your Application Running with Ruby

The problem that this post describes has now been fixed in RVM stable (1.17.0). To diagnose performance issues, be sure to check the options that Ruby was compiled with:
ruby -rrbconfig -e 'puts RbConfig::CONFIG["CFLAGS"]'

The Situation

While working with Patrick Bacon, migrating a large Ruby (ruby-1.9.2) web application from an older Solaris system to a new Linux system (Ubuntu 12.04 LTS), we discovered that the migrated web application seemed to be responding rather sluggishly. Comparing the logs, we found that the application was running, generally, about twice as slow on the new system as on the old system.

My first thought was hardware — the new system must be slower. Both systems were cloud servers, but hosted with different companies. Even though the Linux system was hosted on a much newer cloud platform, there was no guarantee that the underlying hardware was new or fast. But, at the same time, I just could not believe that the new system would be twice as slow.

The Diagnosis

I decided to run some ultra-simple benchmarks on both the old Solaris system, and the new Linux system:


jk@linux ~ $ which ruby
/usr/local/rvm/rubies/ruby-1.9.2-p180/bin/ruby
jk@linux ~ $ time ruby -e "count = 0; while(count < 100000000); count = count + 1; end; puts count"                                                                                                         
100000000

real    0m9.019s
user    0m8.933s
sys     0m0.016s

solaris [~]# which ruby
/opt/ao_ruby192/bin/ruby
solaris [~]# time ruby -e "count = 0; while(count < 100000000); count = count + 1; end; puts count"                 
100000000

real    0m2.959s
user    0m2.928s
sys 0m0.011s

Clearly, Ruby was running much slower. However, I still doubted that it was really that much slower because of the hardware. I decided to run some of my simple benchmarks using Bash and Perl. (Note: The versions of Bash and Perl differed slightly between the Linux and Solaris systems, whereas the version of Ruby was the same between the two systems).


jk@linux ~ $ time perl -e '$count = 0; while ($count < 10000000) { $count ++; } print $count;'
10000000
real    0m0.856s
user    0m0.840s
sys     0m0.008s

solaris [~]# time perl -e '$count = 0; while ($count < 10000000) { $count ++; } print $count;'
10000000
real    0m1.052s
user    0m1.035s
sys     0m0.006s

jk@linux ~ $ time bash -c 'i=0;while [ $i -le 1000000 ]; do i=$(($i+1)); done; echo $i;'
1000001

real    0m17.906s
user    0m15.677s
sys     0m2.212s

solaris [~]# time bash -c 'i=0;while [ $i -le 1000000 ]; do i=$(($i+1)); done; echo $i;'
1000001

real    0m18.061s
user    0m16.820s
sys 0m1.177s

Going by the results of my benchmark using Bash and Perl, the new Linux system was actually faster than the old Solaris system. Something was wrong with how Ruby was running. It was most likely compiled differently between the two systems… and it was compiled with RVM.

I decided to download Ruby directly (e.g. svn co), and compile from source by myself. I did so with no extra configuration options, only options necessary to get Ruby to successfully compile.


jk@linux /opt/ruby192 $ time bin/ruby -e "count = 0; while(count < 100000000); count = count + 1; end; puts count"
100000000

real    0m3.839s
user    0m3.816s
sys 0m0.028s

This self-compiled version ran consistently faster than the version compiled by RVM on the same system.

(8.933 – 3.816) / 8.933 = 0.5728

That’s a whopping 57.28% decrease in execution time.

So… what had happened to version of Ruby compiled by RVM? Why was it so slow?

The Solution

After conferring with John Van Enk and checking the make.log file, we found that the RVM version of Ruby had been compiled without any optimization. This explained the major performance difference. The solution was fairly simple — we instructed RVM to re-compile Ruby with some specific optimization flags: namely, -O3.

Unfortunately, RVM doesn’t seem to recognize CFLAGS options on the command line. Instead, I needed to add them to the local .rvmrc file as: rvm_configure_env=(CFLAGS=-O3). After adding this and re-compiling Ruby with RVM, I was able to achieve comparable performance to the Ruby I had compiled myself.


jk@linux ~ $ which ruby
/usr/local/rvm/rubies/ruby-1.9.2-p180/bin/ruby
jk@linux ~ $ time ruby -e "count = 0; while(count < 100000000); count = count + 1; end; puts count"
100000000

real    0m3.922s
user    0m3.908s
sys 0m0.020s

RVM seems to compile Ruby with optimizations on several other systems (including my local machine), but did not for this Linux system. I have no idea why this happens (would someone like to tell me?). But, at least I know that this is something that I can check and correct in the future when deploying production applications.