Crafting Ruby for Performance

Camilo Reyes

Ruby once had a reputation for being a slow programming language, but with the advent of Ruby 2.0+ that reputation is largely a myth. In this article, we’ll investigate Ruby MRI’s performance and what it means for your programs.

First, I would like to start by mentioning a few good practices in OOP:

  • Identify key abstractions to turn into “objects”.
  • Encapsulate each object into separate responsibilities with unique concerns.
  • Assign a single purpose to each encapsulation and further isolate the responsibility.
  • Slap yourself in the forehead when an abstraction gets overburdened with too many responsibilities.
  • Repeat.

Second, I would like to emphasize that these practices have implications for performance. Building a monolithic, do-it-all application can be fun and challenging. It may even run as a single procedure! But if a simple request has to wade through many convoluted responsibilities, your end user will feel it.

I once heard a wise sage say, “If your web app performance sucks, it is your fault.”

Let’s investigate how Ruby performance stacks up against these best practices, shall we?

Local Variables

For my benchmarks, I’m going to be using a “lab rat” with these specs:

describe "A labrat" do
  it "has binding information" do
    assert LabRat.get_binding.kind_of?(Binding)
  end

  it "is a class" do
    assert LabRat.kind_of?(Class)
  end
end

I also need a way to get unique variable names, so try this:

describe "A unique name" do
  it "is in fact unique" do
    names = {}
    20.times{ names[get_unique_name] = 1 }
    assert names.keys.length == 20
  end
end

I’ll let you imagine what it takes to pass these tests.
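That said, here is one minimal sketch that would pass them. The Benchmarks module and the exact helper implementations are my own stand-ins, not necessarily what the original repo does:

# Shared benchmark setup: a constant for the data size, a class that exposes
# its binding, and a helper that generates unique, valid variable names.
require "benchmark"
require "securerandom"

module Benchmarks
  T_DATA = 5_000 # how many variables each benchmark creates
end

class LabRat
  # Expose a Binding so the benchmarks can eval code in this scope
  def self.get_binding
    binding
  end
end

# Returns a fresh, valid Ruby variable name on every call
def get_unique_name
  "var_#{SecureRandom.hex(8)}"
end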

Now for my benchmark. In this case, I’ll be focusing on local variables:

b = LabRat.get_binding
Benchmark.bm do |bench|
  bench.report "create #{Benchmarks::T_DATA} local variables" do
    Benchmarks::T_DATA.times do
      eval("#{get_unique_name} = 1", b)
    end
  end
end

lvarr = eval("local_variables", b)
Benchmark.bm do |bench|
  bench.report "retrieve #{Benchmarks::T_DATA} local variables" do
    Benchmarks::T_DATA.times do
      eval("#{lvarr.sample}", b)
    end
  end
end

There is a lot of magic going on here. I grab my lab rat’s binding and stuff it with a bunch of local variables through eval, then benchmark how long Ruby takes to create and retrieve them.

So here are the results:

user      system    total     real
create 5000 local variables    0.330000  0.000000  0.330000  (0.332978)
                               user      system    total     real
retrieve 5000 local variables  0.450000  0.010000  0.460000  (0.448220)

Creating local variables takes about as long as looking them up (retrieval is actually a touch slower), which could be due to the overhead of going through eval and the binding.

It takes less than a second to create and retrieve 5,000 local variables using my 2013 MacBook Air. Not bad.

So, what about instance variables?

Instance Variables

With all the unit tests and benchmark foundation out of the way, let’s go into the code:

o = LabRat.new
Benchmark.bm do |bench|
  bench.report "create #{Benchmarks::T_DATA} instance variables" do
    Benchmarks::T_DATA.times do
      o.instance_eval("@#{get_unique_name} = 1")
    end
  end
end

ivarr = o.instance_variables
Benchmark.bm do |bench|
  bench.report "retrieve #{Benchmarks::T_DATA} instance variables" do
    Benchmarks::T_DATA.times do
      o.instance_eval("#{ivarr.sample}")
    end
  end
end

I am using instance_eval to stuff my lab rat with a bunch of instance variables. Let’s see those numbers:

user      system    total     real
create 5000 instance variables    0.170000  0.000000  0.170000  (0.165712)
                                  user      system    total     real
retrieve 5000 instance variables  0.030000  0.000000  0.030000  (0.033587)

Fascinating. Instance variables are about twice as fast to create as local variables and more than ten times faster to retrieve. These benchmarks are starting to tell a story.

It will be interesting to see how class variables perform. Just from these two benchmarks, the odds seem to be in favor of encapsulating your code. Encapsulation is an important part of OOP, and the fact that instance variables outperform local variables here suggests Ruby rewards writing code this way.

But, what about class variables?

Class Variables

Moving along, let’s see some of that benchmark code:

Benchmark.bm do |bench|
  bench.report "create #{Benchmarks::T_DATA} class variables" do
    Benchmarks::T_DATA.times do
      LabRat.class_eval("@@#{get_unique_name} = 1")
    end
  end
end

cvarr = LabRat.class_variables
Benchmark.bm do |bench|
  bench.report "retrieve #{Benchmarks::T_DATA} class variables" do
    Benchmarks::T_DATA.times do
      LabRat.class_eval("#{cvarr.sample}")
    end
  end
end

Just like the previous example, I am stuffing my little guy with a bunch of class variables. The magic happens through class_eval to create my variables. Let’s see those results:

user      system    total     real
create 5000 class variables    0.160000  0.010000  0.170000  (0.163653)
                               user      system    total     real
retrieve 5000 class variables  0.030000  0.000000  0.030000  (0.035327)

It looks like class variables perform about the same as instance variables. Now, it is time for me to tell you a story.

In OOP, as mentioned, encapsulation is a big deal. It is the bread and butter of what drives your abstractions. To keep Ruby performance up to snuff, stay away from large methods. A good rule of thumb: if you find yourself piling up local variables, it’s time to break the method up. Isolate a single concern in each method, and let the encapsulating object orchestrate responsibilities across those concerns.
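Here is a hypothetical sketch of that kind of refactoring. The ReportBuilder class and its methods are made up for illustration, not taken from the benchmarks above:

# Instead of one large method juggling many locals, wrap the state in an
# object and split the work into small, single-purpose methods that share
# instance variables.
class ReportBuilder
  def initialize(orders)
    @orders = orders
  end

  def build
    compute_totals
    format_summary
  end

  private

  # Each private method owns one concern and reads shared state from @-variables
  def compute_totals
    @total = @orders.sum { |order| order[:amount] }
    @count = @orders.size
  end

  def format_summary
    "#{@count} orders totalling #{@total}"
  end
end

ReportBuilder.new([{ amount: 10 }, { amount: 15 }]).build # => "2 orders totalling 25"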

Wrapping up, let’s take a quick look at hashes and arrays. Since we’ve already covered variables, we can expect that they, too, follow a similar pattern.

So, let’s verify.

Arrays

The code only differs in minor ways from what we’ve seen, so I’ll summarize it here.

To create local arrays you do:

name = get_unique_name
eval("#{name} = []; #{name} << 1", b)

Local arrays are retrievable with:

laarr = eval("local_variables", b)
name = laarr.sample
eval("#{name}[0]", b)

Having to assign a local variable name does add some overhead. Let’s see those results:

user      system    total     real
create 5000 local arrays    0.360000  0.000000  0.360000  (0.356111)
                            user      system    total     real
retrieve 5000 local arrays  0.500000  0.010000  0.510000  (0.508840)

Not bad, about the same as my local variables.

Now, to create instance arrays you do:

name = get_unique_name
o.instance_eval("@#{name} = []; @#{name} << 1")

Instance arrays are retrievable with:

iaarr = o.instance_variables
o.instance_eval("#{iaarr.sample}[0]")

The results:

user     system     total     real
create 5000 instance arrays    0.230000  0.000000  0.230000  (0.229147)
                               user      system    total     real
retrieve 5000 instance arrays  0.050000  0.000000  0.050000  (0.047157)

Slower than my plain instance variables, but still much faster than my local arrays.

Now, to create class arrays you do:

name = get_unique_name
LabRat.class_eval("@@#{name} = []; @@#{name} << 1")

Class arrays are retrievable with:

caarr = LabRat.class_variables
LabRat.class_eval("#{caarr.sample}[0]")

Now the results:

user      system    total     real
create 5000 class arrays    0.210000  0.000000  0.210000  (0.215799)
                            user      system    total     real
retrieve 5000 class arrays  0.050000  0.000000  0.050000  (0.048175)

With these results, we get a consistent pattern. This is good, as it shows stability in the language. As expected, class arrays perform about the same as instance arrays.

Hashes

The benchmark code for hashes is very similar to the array code, so I won’t show all of it here.
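For reference, the instance hash variant might look something like this, mirroring the array benchmarks above (the code in the original repo may differ slightly):

o = LabRat.new
Benchmark.bm do |bench|
  bench.report "create #{Benchmarks::T_DATA} instance hashes" do
    Benchmarks::T_DATA.times do
      name = get_unique_name
      o.instance_eval("@#{name} = {}; @#{name}[:a] = 1")
    end
  end
end

iharr = o.instance_variables
Benchmark.bm do |bench|
  bench.report "retrieve #{Benchmarks::T_DATA} instance hashes" do
    Benchmarks::T_DATA.times do
      o.instance_eval("#{iharr.sample}[:a]")
    end
  end
end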

The results:

user      system    total     real
create 5000 local hashes       0.390000  0.000000  0.390000  (0.384721)
                               user      system    total     real
retrieve 5000 local hashes     0.550000  0.010000  0.560000  (0.558132)
                               user      system    total     real
create 5000 instance hashes    0.260000  0.000000  0.260000  (0.253513)
                               user      system    total     real
retrieve 5000 instance hashes  0.050000  0.000000  0.050000  (0.048339)
                               user      system    total     real
create 5000 class hashes       0.230000  0.000000  0.230000  (0.236260)
                               user      system    total     real
retrieve 5000 class hashes     0.060000  0.000000  0.060000  (0.052424)

These results tell me the same story. In Ruby, you will get a performance boost when you use instance and class variables instead of local variables.

Conclusion

Ruby performance has improved a great deal over the years. One of the most notable improvements in Ruby MRI 2.0 is the bitmap-marking garbage collector.

With all these results out of the way, keep in mind that app performance is more art than science. I once heard of a project where customers complained the app was “slow”. All performance testing showed it could handle the load; it turned out the confusing UI and workflow were what made things feel slow. In my experience, sound OOP design principles lead to UIs that are intuitive. The relationships between objects become intelligible, and that makes the application feel “fast” from the customer’s perspective. Ruby, as shown, is a perfect candidate for this type of design.

If interested, you may download the code on GitHub.

Happy Hacking!
