BS: Giving Rubinius a Try

Contrary to some popular perceptions, "Ruby" is not actually slow. Ruby is a programming language, and a language is not in and of itself fast or slow. It is the language's implementation that is fast or slow, relative to other language implementations.

The best-known Ruby implementation is probably the one generally known as MRI, or Matz's Ruby Interpreter -- where "Matz" is Yukihiro Matsumoto, the creator of the Ruby language. MRI is the reference implementation for Ruby up through version 1.8.x, which means that it is the "standard" by which the Ruby compatibility of other implementations is judged. MRI is, indeed, quite slow for a lot of purposes. Version 1.9+, on the other hand, uses an implementation called YARV (or Yet Another Ruby VM), also known as KRI (or Koichi's Ruby Interpreter). YARV offers a significant performance improvement, placing it on comparable footing with the reference implementations of Perl, PHP, and Python. Both of these implementations are distributed under a dual-license model -- the GPL and the Ruby License (which is unfortunately about as bad as the GPL).

Other implementations exist as well. There is one for the Java VM called JRuby, for instance -- also copyleft licensed. Ironically, an implementation for the .NET Framework called IronRuby is distributed under a copyfree license, the MIT/X11 License. So far, what we have is three copyleft implementations and one copyfree implementation that only runs in a Microsoft-designed environment. There is, however, another implementation that I have started using recently: Rubinius. (There's MacRuby too, but I frankly do not know much about it other than that it is a Ruby 1.9 implementation for Mac OS X distributed under the Ruby License.)

Rubinius is in FreeBSD ports, which made it easy for me to install on the laptop I use as my primary development environment. Rubinius is a copyfree (BSD License) implementation of Ruby, primarily written in Ruby -- plus a little C++. It is, according to the Rubinius Website, 93% compatible with RubySpec, an executable specification for the Ruby language. Even so, I have started using it with my own Ruby software projects and have had 100% success so far. Unfortunately, I do not presently have the option of running blogstrapping on Rubinius, because my Webhost does not support it.

Rubinius currently targets Ruby 1.8.x compatibility, though real progress is apparently being made on Ruby 1.9/2.0 compatibility for an upcoming Rubinius version 2. In addition to being distributed under a better license than MRI and YARV, it is also pretty fast. It is, at least, notably faster than MRI. It is reported to be faster for execution of plain ol' Ruby code than YARV, which is in turn about twice as fast as MRI in general, though the fine folks in #rubinius on freenode tell me it is a bit slower for things like array and hash operations.

One performance issue I have noticed is startup time. Obviously, this will vary depending on the computer hardware and other operating environment details. I have seen references to 0.3 seconds as "normal" for startup time for some users; for me (on my ThinkPad T60), it tends to vary between half a second and one second using the Unix time utility, though I have seen it squeak in under half a second at 0.45 seconds.

As @evan put it in #rubinius, the simple use case for the Ruby Benchmark module is easy:

puts Benchmark.measure { your code }
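For anyone who has not used the Benchmark module before, here is a slightly fuller sketch of that one-liner. The sum_of_squares method is a made-up stand-in for "your code", not anything from dscribe; the only library calls are require 'benchmark' and Benchmark.measure from the Ruby standard library.

```ruby
require 'benchmark'

# Hypothetical workload standing in for "your code": sum the squares of 1..n.
def sum_of_squares(n)
  (1..n).inject(0) { |sum, i| sum + i * i }
end

# Benchmark.measure runs the block once and returns a Benchmark::Tms object.
result = nil
timing = Benchmark.measure { result = sum_of_squares(100_000) }

# Printing a Tms shows user, system, and total CPU time, then real
# (wall-clock) time in parentheses.
puts timing
puts "real time: #{timing.real} seconds"
```

Note that this only times the block itself, after the interpreter is already up and running, so it says nothing about VM startup.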

In the midst of trying to get some benchmarks with that, I discovered two things:

1. I stumbled across a parsing error in MRI. I thought at first that I was misusing the Benchmark module somehow, but after some discussion, the people in #rubinius eventually managed to confirm that it is indeed a parsing bug. The same bug appears to exist in Rubinius and other 1.8 parsers.
2. I stumbled across a bug in dscribe, the program I was trying to benchmark with the Benchmark module. This bug does not affect my most-common use cases for dscribe, but I still need to fix it.

Running "time rbx -v" (checking the Rubinius version number) shows 0.45 seconds, while "time ruby -v" (checking the MRI version number) shows 0.04 seconds. That is some significant overhead for the Rubinius VM as compared with MRI. That overhead all seems to be startup time, something that I have been told has not been much of a priority so far as compared with execution time for Rubinius development. When I talked to the people in #rubinius, though, they told me they would add it to the queue of things to work on, so VM startup time might improve.
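A single run of the time utility is a noisy measurement, so here is a rough sketch of how the same comparison could be repeated a few times from a Ruby script. The launch_time and best_launch_time helpers are names I made up for illustration, and the sketch assumes a reasonably modern Ruby (File::NULL and the out:/err: options to system are not available in 1.8) with executables named ruby and rbx on the PATH.

```ruby
require 'benchmark'

# Time one launch of a command, discarding its output.
# Returns wall-clock seconds, or nil if the command could not be started.
def launch_time(*command)
  started = nil
  elapsed = Benchmark.realtime do
    started = system(*command, out: File::NULL, err: File::NULL)
  end
  started.nil? ? nil : elapsed
end

# Take the best of several runs to reduce noise from things like disk caching.
# Returns nil if the command never started.
def best_launch_time(command, runs = 5)
  Array.new(runs) { launch_time(*command) }.compact.min
end

# "ruby" and "rbx" are the executable names used in this post; adjust as needed.
[%w[ruby -v], %w[rbx -v]].each do |command|
  best = best_launch_time(command)
  if best
    puts format('%-8s %.3f s', command.first, best)
  else
    puts format('%-8s not found', command.first)
  end
end
```

The numbers will include the cost of spawning a process from the script, so they will not match the time utility exactly, but the gap between the two implementations should still be visible.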

Startup times like that result in noticeable hesitation when running simple command line utilities, but that in itself should not have any particularly noticeable effect on long-running processes. Maybe I'll have some meaningful benchmarks to share in the near future, after fixing my dscribe bug. In the meantime, I'm satisfied that my code is running "fast enough" for my purposes with Rubinius, so I'll be using that instead of MRI for a while.