Benchmarking doesn't hurt

31 March, 2015

While working on a problem you probably choose between several ways of implementing a solution. How do you decide which implementation to go for? Do you base this on style, lines of code or something else?

As you’re considering your options please also consider performance. Maintainability and readability are very important, but when your project is serving many visitors a minute you’ll want to speed things up.

So what can we do? One of the things we can do is benchmark our options.

Benchmarking sounds like a lot of work and headaches, but it doesn’t have to be. If done right it doesn’t take too much of your time and you can quickly run the next benchmark.

Ruby benchmark

Ruby ships with its own benchmarking library, which is quite useful if you quickly want to compare two implementations. You specify two or more pieces of code and an iteration count, then run it.

# benchmark.rb
require 'benchmark'

n = 10_000_000
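Benchmark.bmbm do |x|
  # Reassign the variable to a new, empty string
  x.report(%(string = '')) do
    n.times do
      foo = 'bar'
      foo = ''
    end
  end
  # Empty the existing string in place
  x.report('string.clear') do
    n.times do
      foo = 'bar'
      foo.clear
    end
  end
end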

The benchmark will give you a report with how long everything took to complete. In this case the lower number is better.

$ ruby ./benchmark.rb
Rehearsal ------------------------------------------------
string = ''    3.150000   0.570000   3.720000 (  3.790554)
string.clear   2.460000   0.060000   2.520000 (  2.534461)
--------------------------------------- total: 6.240000sec

                   user     system      total        real
string = ''    2.910000   1.130000   4.040000 (  4.084821)
string.clear   2.510000   0.070000   2.580000 (  2.609390)
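
If the columns in that report are unfamiliar, a small sketch like the one below may help. It uses Benchmark.measure from the same standard library, which returns a Benchmark::Tms object for a single block. The workload inside the block is only a placeholder for illustration.

```ruby
require 'benchmark'

# Benchmark.measure times one block and returns a Benchmark::Tms object.
tms = Benchmark.measure do
  100_000.times { 'bar'.dup.clear } # placeholder workload
end

puts "user:   #{tms.utime}" # CPU time spent in user mode
puts "system: #{tms.stime}" # CPU time spent in system (kernel) mode
puts "total:  #{tms.total}" # user + system, plus the times of child processes
puts "real:   #{tms.real}"  # elapsed wall-clock time
```

The value in parentheses in the report is the real (wall-clock) time, which is usually the number you compare first.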

Using the iteration count you can quickly test how well your code performs over time and when dealing with more objects. This is important because how often the code is called and how many objects you give it can greatly affect the outcome of your benchmarks.
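
As a rough sketch of what varying the number of objects could look like, the snippet below times two ways of copying an array at a few input sizes with Benchmark.realtime. The sizes and the two snippets being compared are assumptions picked for illustration, and a single run per size is noisy, so treat the numbers as an indication only.

```ruby
require 'benchmark'

# Hypothetical input sizes, chosen only for illustration.
sizes = [10, 1_000, 100_000]

results = sizes.map do |size|
  input = (1..size).to_a # build the input outside the timed blocks

  push_time = Benchmark.realtime do
    array = []
    input.each { |i| array.push i }
  end

  map_time = Benchmark.realtime do
    input.map { |i| i }
  end

  [size, push_time, map_time]
end

results.each do |size, push_time, map_time|
  puts format('%8d elements: each + push %.6fs, map %.6fs', size, push_time, map_time)
end
```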

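Memory allocation and garbage collection can skew the results. Benchmark.bmbm reduces this effect by running a rehearsal first and starting garbage collection before each real measurement. Reading the documentation is a must!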

benchmark-ips gem

There’s also benchmark-ips, a gem which allows you to benchmark iterations per second. This is probably the easiest method for most tests. It’s not the perfect benchmarking solution but it does take away a lot of the benchmarking headaches.

require 'benchmark/ips'

ARRAY = (1..100).to_a

Benchmark.ips do |x|
  x.report('Array#each + push') do
    array = []
    ARRAY.each { |i| array.push i }
  end
  x.report('Array#map') do
    ARRAY.map { |i| i }
  end
  x.compare! # Print the comparison
end

source: JuanitoFatas/fast-ruby

In this report the conclusion is already printed out for you, making it even easier to see what performs better.

Calculating -------------------------------------
   Array#each + push     9.025k i/100ms
           Array#map    13.947k i/100ms
   Array#each + push     99.634k (± 3.2%) i/s -    505.400k
           Array#map    158.091k (± 4.2%) i/s -    794.979k

           Array#map:   158090.9 i/s
   Array#each + push:    99634.2 i/s - 1.59x slower
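
The "1.59x slower" figure appears to be simply the ratio between the two iterations-per-second values. Assuming that is how the comparison is calculated, the arithmetic can be checked by hand with the numbers from the report above:

```ruby
# Iterations per second, copied from the report above.
map_ips  = 158_090.9
push_ips = 99_634.2

# How many times faster the best entry is than the slower one.
ratio = (map_ips / push_ips).round(2)

puts "Array#each + push: #{ratio}x slower"
# prints: Array#each + push: 1.59x slower
```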

What should we benchmark?

The standard benchmarking library in Ruby and the benchmark-ips gem make it really easy to set up a benchmark in minutes. There really is no excuse not to benchmark your code when you have doubts about its speed.

A lot of benchmarking has already been done for Ruby. A very good example is the fast-ruby repository by Juanito Fatas, based on Erik Michaels-Ober’s Baruco 2014 presentation. This repository contains a list of comparisons between standard Ruby implementations, along with the benchmark code.

What should we benchmark then?

Unfortunately most implementation decisions aren’t a matter of Array#each + push vs Array#map. The fast-ruby repository is set up to test code that performs the same kind of logic, not code that solves a specific problem.

Frameworks and libraries also bring in code written by other people. We don’t call one method to solve the entire problem; the program performs multiple operations on one or more objects. All those operations combined determine the actual speed of the program.

Go back to one of your recent projects, find some complex logic and consider alternative implementations. You might be surprised how much faster another solution might be.

Just an example

A problem in a recent project at work was to wrap each non-empty line in a string in a span element. In this case we needed all the speed we could get, because this operation was performed for a lot of translatable records from our database per request.

There are many ways to accomplish this, but I’ll limit the example below to a handful of solutions.

require "benchmark/ips"

STRING = "\nLorem ipsum\ntempor invidunt\n\naliquyam erat\nvero eos\nno sea. Lorem\n\nipsum\n"

# Desired end result:
# <span>Lorem ipsum</span>
# <span>tempor invidunt</span>
# <span>aliquyam erat</span>
# <span>vero eos</span>
# <span>no sea. Lorem</span>
# <span>ipsum</span>

Benchmark.ips do |x|
  x.report("lines + map") do
    STRING.squeeze("\n").strip!.lines.map! { |line| "<span>#{line}</span>" }.join
  end
  x.report("each_line + inject") do
    STRING.squeeze("\n").strip!.each_line.inject("") { |s, line| s << "<span>#{line}</span>" }
  end
  x.report("map with if") do
    STRING.lines.map! { |line| "<span>#{line}</span>" unless line.strip!.empty? }.join
  end
  x.report("reject + map") do
    STRING.lines.reject { |line| line.strip!.empty? }.map! { |line| "<span>#{line}</span>" }.join
  end
  x.report("gsub") do
    STRING.gsub(/^(.+)$/) { |line| "<span>#{line}</span>" }
  end
  x.compare! # Output the comparison
end

Calculating -------------------------------------
         lines + map    11.141k i/100ms
  each_line + inject     9.296k i/100ms
         map with if     9.017k i/100ms
        reject + map    10.100k i/100ms
                gsub     7.146k i/100ms
         lines + map    134.936k (± 3.6%) i/s -    679.601k
  each_line + inject    108.385k (± 3.9%) i/s -    548.464k
         map with if    108.906k (± 3.2%) i/s -    550.037k
        reject + map    120.025k (± 3.3%) i/s -    606.000k
                gsub     83.394k (± 3.5%) i/s -    421.614k

         lines + map:   134935.6 i/s
        reject + map:   120024.8 i/s - 1.12x slower
         map with if:   108905.5 i/s - 1.24x slower
  each_line + inject:   108384.8 i/s - 1.24x slower
                gsub:    83393.9 i/s - 1.62x slower

You see that a combination of calls on different methods and objects can make quite a bit of difference.
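
Before trusting numbers like these, it can be worth checking that the candidates really return the same thing. String#lines keeps each line’s trailing newline and gsub leaves the original line breaks in place, so the results are not necessarily byte-identical. The sketch below compares three of the approaches against the desired result. Treating results that differ only in newlines as equivalent is an assumption made for this example, and the names expected and candidates are introduced purely for illustration.

```ruby
string = "\nLorem ipsum\ntempor invidunt\n\naliquyam erat\nvero eos\nno sea. Lorem\n\nipsum\n"

# The desired end result, written as one string without line breaks.
expected = '<span>Lorem ipsum</span><span>tempor invidunt</span>' \
           '<span>aliquyam erat</span><span>vero eos</span>' \
           '<span>no sea. Lorem</span><span>ipsum</span>'

# Each candidate is wrapped in a lambda so it can be called on demand.
candidates = {
  'lines + map' => -> { string.squeeze("\n").strip.lines.map { |line| "<span>#{line}</span>" }.join },
  'reject + map' => -> { string.lines.reject { |line| line.strip!.empty? }.map { |line| "<span>#{line}</span>" }.join },
  'gsub' => -> { string.gsub(/^(.+)$/) { |line| "<span>#{line}</span>" } }
}

candidates.each do |label, candidate|
  result = candidate.call
  status = if result == expected
             'identical'
           elsif result.delete("\n") == expected
             'same apart from newlines'
           else
             'different'
           end
  puts "#{label}: #{status}"
end
```

If the differences matter for your use case, normalise the output of each candidate first, so the benchmark compares code that does the same job.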

I had actually hoped gsub would be the fastest, because on the surface it performs the fewest operations. It wasn’t really an option, as it becomes even slower with larger strings. (Some of the other solutions, however, create more objects and take up more memory.)
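
The remark about objects and memory can be checked roughly as well. The sketch below counts allocated objects with GC.stat(:total_allocated_objects) for two of the approaches, once on the original string and once on a larger one. The count_allocations helper and the repetition factors are hypothetical, and the exact counts will differ between Ruby versions.

```ruby
string = "\nLorem ipsum\ntempor invidunt\n\naliquyam erat\nvero eos\nno sea. Lorem\n\nipsum\n"

# Hypothetical helper: number of objects allocated while the callable runs.
count_allocations = lambda do |callable|
  before = GC.stat(:total_allocated_objects)
  callable.call
  GC.stat(:total_allocated_objects) - before
end

# Repetition factors chosen only for illustration.
allocations = [1, 100].map do |factor|
  input = string * factor # a larger string made of repeated copies

  reject_map = count_allocations.call(
    -> { input.lines.reject { |line| line.strip!.empty? }.map! { |line| "<span>#{line}</span>" }.join }
  )
  gsub = count_allocations.call(
    -> { input.gsub(/^(.+)$/) { |line| "<span>#{line}</span>" } }
  )

  [input.bytesize, reject_map, gsub]
end

allocations.each do |bytes, reject_map, gsub|
  puts format('%6d bytes: reject + map %d objects, gsub %d objects', bytes, reject_map, gsub)
end
```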


The next time you’re considering your options, set up a quick benchmark and see what’s faster. Only add the code you need to solve the problem. Avoid database queries and large libraries, unless that’s what you’re benchmarking.

Benchmarking does not always give you the complete picture of why your code might be slow, but it does give you a good indication of how your code is performing.

For more details on how to set up benchmarks in Ruby, take a good look at the Ruby docs and the benchmark-ips README.

So benchmark your code and hopefully your applications will respond faster next time.