Ruby & the Multicore CPU: Part Two - Ruby up to 2.7
MRI Ruby
When Ruby was conceived, multi-core CPUs weren’t at the forefront of a language designer’s mind. Consequently, the Ruby virtual machine was developed as a single-process, single-threaded executable.
Even so, Ruby supplies two options to support concurrent and parallel processing.
Create a new Ruby process. From within the runtime, the fork method allows you to spin off a child process. This will duplicate the memory of the original process. In terms of threading complexity, the child process can then be treated as an independent process, and you won’t have to worry about mutexes, locks, shared memory, or any of the associated headaches.
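As a rough illustration of that isolation, here is a minimal sketch that runs a block in a forked child and sends the result back over a pipe. The helper name `run_in_child` is made up for this example, and the code assumes a platform where fork is available (MRI on Linux or macOS, for instance).

```ruby
# Hypothetical helper: run a block in a forked child process and
# hand its return value back to the parent through a pipe.
def run_in_child
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    Marshal.dump(yield, writer) # serialise the block's result into the pipe
    writer.close
  end

  writer.close
  result = Marshal.load(reader) # blocks until the child has written its result
  reader.close
  Process.wait(pid)             # reap the child so it does not linger as a zombie
  result
end

counter = 0
child_value = run_in_child do
  counter += 100 # changes the child's copy only
  counter
end

puts "child saw #{child_value}, parent still has #{counter}"
```

The parent's `counter` is untouched because the child works on its own copy of memory. The price is that anything you want back has to be passed explicitly, here via Marshal over a pipe.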
Ruby also ships with the plain old Thread class, which lets you use the standard multi-threaded approach, with locks and all the associated complications and problems.
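For comparison, a small sketch of the Thread approach, with a Mutex guarding a shared counter. The method name `count_with_threads` is purely illustrative.

```ruby
# Hypothetical helper: bump a shared counter from several threads.
def count_with_threads(thread_count, increments)
  total = 0
  lock = Mutex.new

  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Only one thread at a time may run this block.
        lock.synchronize { total += 1 }
      end
    end
  end

  threads.each(&:join) # wait for every thread to finish
  total
end

puts count_with_threads(4, 1_000) # prints 4000
```

Unlike the forked child, every thread sees the same `total`, which is exactly why the lock is needed.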
Managing threads, locks and shared memory is notoriously difficult to do, and even harder to debug. Multi-threading wasn’t originally a huge concern, and much of the standard library and many C extensions could not be guaranteed to be thread safe. To make the whole endeavour a little safer, the Global Interpreter Lock (GIL)[1] was created. This ensures that only one thread executes Ruby code at any one time, so that shared memory is not so easily corrupted by competing threads. Even when MRI moved from green threads to native threads in version 1.9, the GIL remained and continued to prevent threads from running in parallel. This restriction is finally set to change in Ruby 3.0. We’ll get to that later in the series.
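One nuance worth seeing in practice: MRI releases the lock while a thread is waiting on a blocking call such as sleep or I/O, so threads can still overlap their waiting even though they cannot run Ruby code in parallel. The sketch below times a handful of sleeping threads; the helper names are invented for this example, and the exact timing will vary from machine to machine.

```ruby
# Hypothetical helper: measure how long a block takes, in seconds.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Hypothetical helper: start several threads that each sleep, then wait for all of them.
def sleep_in_threads(thread_count, seconds)
  elapsed_seconds do
    Array.new(thread_count) { Thread.new { sleep(seconds) } }.each(&:join)
  end
end

took = sleep_in_threads(4, 0.2)
puts format("4 threads sleeping 0.2s each took %.2fs in total", took)
```

If the threads had to wait one after another, the total would be around 0.8 seconds. On MRI it should come out much closer to 0.2, because a sleeping thread does not hold the lock.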
JRuby
Meanwhile, in the wider Ruby universe, JRuby was born. This took advantage of the Java Virtual Machine and the broad Java ecosystem. Although starting up a new process was even slower than on MRI, once warmed up and making best use of the latest JVM and JIT compiler, long-running Ruby applications were often much faster. More relevant to our topic, however, is that JRuby also made use of Java’s threading model. This meant that the GIL was eliminated, and threads could run in parallel.
JRuby has also implemented a few thread-safe versions of parts of the standard library, and many other thread-safe implementations of useful libraries are available due to JRuby being implemented on top of the JVM.
However, slight differences in the implementations of MRI Ruby and JRuby have meant that code written for one might not always work on the other. For example, JRuby does not support forking.
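Code that needs to run on both can check for fork before relying on it. The following is only a sketch: `run_isolated` is a hypothetical helper, and falling back to a thread is just one possible choice, with the obvious caveat that a thread does not give you the memory isolation a child process does.

```ruby
# Hypothetical helper: run a block in a child process where fork exists,
# otherwise fall back to a plain thread.
def run_isolated(&work)
  if Process.respond_to?(:fork)
    pid = fork(&work) # child runs the block, then exits
    Process.wait(pid)
    :process
  else
    Thread.new(&work).join
    :thread
  end
end

puts run_isolated { 1 + 1 } # prints "process" where fork is available, otherwise "thread"
```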
For more detail, the JRuby wiki has further discussion.
In general, the safest path to writing concurrent code in JRuby is the same as on any other platform:
- Don’t do it, if you can avoid it.
- If you must do it, don’t share data across threads.
- If you must share data across threads, don’t share mutable data.
- If you must share mutable data across threads, synchronize access to that data.
(from the JRuby wiki)
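To make the second and third points a little more concrete, here is a sketch in which each thread works on its own frozen slice of the input and hands back a value, rather than writing to anything shared. The name `parallel_sum` is made up for this example.

```ruby
# Hypothetical helper: sum each slice in its own thread, sharing nothing mutable.
def parallel_sum(numbers, slice_size)
  threads = numbers.each_slice(slice_size).map do |slice|
    # Each thread receives its own frozen slice and returns a result
    # instead of updating shared state.
    Thread.new(slice.freeze) { |own_slice| own_slice.sum }
  end

  # Thread#value waits for the thread and returns the block's result.
  threads.sum(&:value)
end

puts parallel_sum((1..100).to_a, 25) # prints 5050
```

Because no thread ever writes to data another thread can see, there is nothing to lock.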
Rubinius
There soon followed another version of Ruby, Rubinius. This was Ruby written in Ruby (plus a little bit of C++).
Rubinius also has a slightly different approach to its threading model and removed the GIL, allowing for multi-threaded execution.
I’m not sure what happened to Rubinius. The homepage is completely inscrutable now. It seems to still be alive. Let me know in the comments if you know what’s going on.
TruffleRuby
TruffleRuby is an in-development version of Ruby supported by Oracle, which runs on GraalVM. In supported benchmarks, it can be significantly faster than MRI Ruby, and it again supports a parallel threading model. TruffleRuby fibers are implemented as native threads, so they will have slightly different performance characteristics to MRI fibers.
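As a reminder of what a fiber looks like from the Ruby side, here is a minimal sketch. The helper name `make_counter_fiber` is invented for illustration. The code itself is the same on either implementation; what differs is the machinery underneath, and therefore the cost of creating and switching fibers.

```ruby
# Hypothetical helper: a fiber that hands back one number each time it is resumed.
def make_counter_fiber(limit)
  Fiber.new do
    1.upto(limit) { |n| Fiber.yield(n) } # pause here and return n to the caller
    :done                                # value of the final resume
  end
end

fiber = make_counter_fiber(2)
puts fiber.resume # prints 1
puts fiber.resume # prints 2
puts fiber.resume # prints done
```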
- Introduction
- Concurrency vs. Parallelism
- Processes
- Threads
- Fibers
- Synchronicity
- Ruby up to 2.7
- MRI Ruby
- JRuby
- Rubinius
- TruffleRuby
- Current Concurrency Paradigms
- Queues and Jobs
- Communicating Sequential Processes
- Actor Model
- Reactor Model
- Ruby 3.0 Concurrency and Parallelism
[1] Sometimes also called the Giant VM Lock, or GVL.