In Ruby, this code is not thread-safe if array is modified by many threads:
array = []
array << :foo # many threads can run this code
Why is the << operation not thread-safe?
array is your program variable. When you apply an operation like << to it, that single high-level operation is actually performed in three steps. In between these steps, a thread context switch can let another thread read the same (old) value of the variable. That's why it is not an atomic operation.
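To make the interleaving concrete, here is a minimal sketch that simulates it by hand, without real threads, so the outcome is deterministic. It assumes the three steps are read, modify, and write back, and it uses a plain counter instead of an array to keep things short. The names shared, read_by_a, and read_by_b are made up for illustration.

```ruby
# Hand-simulated interleaving of two "threads" A and B (no real threads).
# Assumption: the three steps are read, modify, write back.
shared = 0

read_by_a = shared        # step 1 (A): read the current value
read_by_b = shared        # step 1 (B): context switch, B reads the same old value

new_a = read_by_a + 1     # step 2 (A): compute the new value
new_b = read_by_b + 1     # step 2 (B): compute from the stale value

shared = new_a            # step 3 (A): write back
shared = new_b            # step 3 (B): write back, overwriting A's update

puts shared               # prints 1, although two increments were attempted
```

One of the two updates is lost because B's read happened before A's write.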
If you have multiple threads accessing the same array, use Ruby's built-in Queue class. It nicely handles producers and consumers.
This is the example from the documentation:
require 'thread'
queue = Queue.new
producer = Thread.new do
  5.times do |i|
    sleep rand(i) # simulate expense
    queue << i
    puts "#{i} produced"
  end
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop
    sleep rand(i/2) # simulate expense
    puts "consumed #{value}"
  end
end
consumer.join
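Actually, on MRI (Matz's Ruby Interpreter) the GIL (Global Interpreter Lock) makes any pure C function effectively atomic, because a thread is not switched out in the middle of one unless the function explicitly releases the lock.
Since Array#<< is implemented in pure C in MRI, this operation will be atomic there. But note that this only applies to MRI, and it is an implementation detail rather than a language guarantee. On JRuby this is not the case.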
To completely understand what is going on, I suggest you read these two articles, which explain everything very well:
Nobody Understands the GIL
Nobody Understands the GIL - part 2
This link might be helpful for you:
http://www.jstorimer.com/pages/ruby-core-classes-arent-thread-safe
Also you might be interested in this gem:
https://rubygems.org/gems/thread_safe
Because Ruby is a very high-level language, nothing is really atomic at the OS level. Only very simple assembly operations are atomic at the OS level (and which ones is OS-dependent), and every Ruby operation, even a simple 1 + 1, corresponds to hundreds or thousands of executed assembly instructions for things such as method lookups, garbage collection, object initialization, scope calculations, etc.
If you need to make operations atomic, use Mutexes.
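As a rough sketch of what that can look like for the array from the question, the example below wraps every append in Mutex#synchronize. The names shared, lock, and workers are made up for illustration, and the thread and iteration counts are arbitrary.

```ruby
# Sketch: guard every append to a shared array with a Mutex.
# `shared`, `lock`, and `workers` are hypothetical names.
shared = []
lock = Mutex.new

workers = 10.times.map do |n|
  Thread.new do
    1_000.times do |i|
      # Only one thread at a time can run the block given to synchronize,
      # so the append cannot interleave with another thread's append.
      lock.synchronize { shared << [n, i] }
    end
  end
end

workers.each(&:join)      # wait for all threads to finish
puts shared.size          # prints 10000
```

The lock only helps if every piece of code that touches the array goes through the same Mutex, reads included.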
Just riffing off of @Linuxios and @TheTinMan: high-level language (HLL) operations in general are not atomic. Atomicity is (generally) not an issue in single-threaded programs. In multi-threaded programs, you (the programmer) have to reason about it at a much higher granularity than a single HLL operation, so having individual HLL operations that are atomic doesn't actually help you that much. On the flip side, although making an HLL operation atomic takes only a few machine instructions before and after—at least on modern hardware—the static (binary size) and dynamic (execution time) overheads add up. Even worse, explicit atomicity pretty much disables all optimization because compilers cannot move instructions across atomic operations. No real benefit + significant cost = non-starter.