Returning data from forked processes

Posted 2019-01-17 07:07

Question:

If I do

Process.fork do 
  x 
end 

how can I find out what x returned (e.g. true/false/a string)?

(Writing to a file/database is not an option...)

Answer 1:

We actually just had to handle this problem in Rails isolation testing. I wrote a bit about it on my blog.

Basically, what you want to do is open a pipe in the parent before forking, so that both processes share it, and have the child write its result to the pipe. Here's a simple way to run the contents of a block in a child process and get back the result:

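def do_in_child
  read, write = IO.pipe

  pid = fork do
    read.close                   # the child only writes
    result = yield
    Marshal.dump(result, write)  # serialize the block's return value into the pipe
    exit!(0) # skips exit handlers.
  end

  write.close                    # the parent only reads; needed so read.read can reach EOF
  result = read.read             # blocks until the child's write end is closed (on exit)
  read.close
  Process.wait(pid)              # reap the child so it does not linger as a zombie
  raise "child failed" if result.empty?
  Marshal.load(result)
end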

Then you could run:

do_in_child do
  require "some_polluting_library"
  SomePollutingLibrary.some_operation
end

Note that if you do a require in the child, you will not have access to that library in the parent, so you cannot return an object of that type using this method. However, you could return any type that's available in both.

Also note that a lot of the details here (read.close, write.close, Process.wait(pid)) are mostly housekeeping, so if you use this a lot you should probably move it into a utility library that you can reuse.

Finally, note that this will not work on Windows or JRuby, since they don't support forking.
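
For platforms without fork, one possible fallback is sketched below. It assumes you are willing to lose the isolation and simply run the block in the current process; `run_isolated` is a hypothetical name, not something from Rails or the answer above. The check relies on `Process.respond_to?(:fork)`, which Ruby documents as returning false where fork is unavailable.

```ruby
# Hypothetical helper: fork when possible, otherwise run the block in-process.
def run_isolated
  # Without fork there is no isolation; this fallback is an assumption
  # about what is acceptable, not part of the original answer.
  return yield unless Process.respond_to?(:fork)

  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Marshal.dump(yield, writer)  # send the block's result to the parent
    writer.close
    exit!(0)                     # skip exit handlers in the child
  end

  writer.close
  payload = reader.read
  reader.close
  Process.wait(pid)
  raise "child failed" if payload.empty?
  Marshal.load(payload)
end

answer = run_isolated { 6 * 7 }
puts answer
```

Whether a silent fallback is appropriate depends on why you forked in the first place; if the point was to keep a polluting library out of the parent, raising an error may be the better choice.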



Answer 2:

Thanks for all the answers. I got my solution up and running; I still need to see how to handle non-forking environments, but for now it works :)

read, write = IO.pipe
Process.fork do
  write.puts "test"
end
Process.fork do
  write.puts 'test 2'
end

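write.close      # close the parent's write end so read.read can reach EOF
puts read.read   # drain the pipe first, so a child with a lot of output cannot block on a full pipe
read.close

Process.wait     # then reap both children
Process.wait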

You can see it in action in the parallel_specs Rails plugin.
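
When several children share one pipe, their output arrives interleaved and cannot be matched to the child that produced it. A minimal sketch of one way around that, assuming one pipe per child and Marshal for the payload (the `jobs` array and the squaring are stand-ins for real work):

```ruby
jobs = [1, 2, 3]

# Start one child per job, each with its own pipe.
children = jobs.map do |n|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Marshal.dump(n * n, writer)  # stand-in for the real computation
    writer.close
    exit!(0)
  end
  writer.close                   # the parent keeps only the read end
  [pid, reader]
end

# Collect results in job order.
results = children.map do |pid, reader|
  payload = reader.read          # read before waiting, so a large result cannot block the child
  reader.close
  Process.wait(pid)
  Marshal.load(payload)
end

p results  # prints [1, 4, 9]
```

Closing each write end in the parent right after the fork also keeps later children from inheriting it, which would otherwise delay EOF on the earlier pipes.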



Answer 3:

I wrapped all the solutions I found along the way (plus fixes for some other problems, like the user exiting and pipe buffers) into the Ruby parallel gem. Now it is as easy as:

results = Parallel.map([1,2,3],:in_processes=>4) do |i|
  execute_something(i)
end

or

results = Parallel.map([1,2,3],:in_threads=>4) do |i|
  execute_something(i)
end


Answer 4:

Yes, you can create a subprocess and execute a block inside it.

I recommend the aw gem:

Aw.fork! { 6 * 7 } # => 42

Of course, it also keeps side effects from leaking into the parent:

arr = ['foo']
Aw.fork! { arr << 'FUU' } # => ["foo", "FUU"]
arr # => ["foo"]


Answer 5:

According to the documentation:

If a block is specified, that block is run in the subprocess, and the subprocess terminates with a status of zero.

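So if you call it with a block, the child runs the block and then exits with status 0; the block's return value is discarded, and the parent gets the child's PID back from fork. Without a block, it functions basically the same as the fork() system call on Unix (the parent receives the PID of the new process, the child receives nil).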
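
A small sketch of what the parent can actually observe, assuming a platform with fork. The variable names are illustrative only; `Process.wait2` returns the PID together with a `Process::Status`.

```ruby
# With a block, fork returns the child's PID in the parent.
pid = fork do
  :ignored                        # the block's value never reaches the parent
end
_, default_status = Process.wait2(pid)

# A child can choose its own exit status, which hands the parent one small integer.
other_pid = fork { exit!(3) }
_, custom_status = Process.wait2(other_pid)

puts pid.class                    # Integer
puts default_status.exitstatus    # 0
puts custom_status.exitstatus     # 3
```

An exit status only carries a small integer, so it suits a success/failure flag but not a string or a richer object.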



Answer 6:

By default, the only thing a forked Unix process communicates back to its parent is its exit status. However, you can open a file descriptor shared by the two processes and pass data over it: this is the normal Unix pipe approach.

If you pass values through Marshal.dump() and Marshal.load(), you can easily pass Ruby objects between those Ruby processes.
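
To illustrate what Marshal can and cannot carry, here is a short round trip done in a single process (the bytes would normally travel over the pipe). The sample hash is made up for the example.

```ruby
original = { "name" => "job-1", "ok" => true, "sizes" => [1, 2.5, :large] }

bytes = Marshal.dump(original)        # a binary String, suitable for writing to a pipe
copy  = Marshal.load(bytes)

same_value  = (copy == original)      # true: the data survives the round trip
same_object = copy.equal?(original)   # false: the receiver gets a copy

# Procs (like IO objects) cannot be serialized, so they cannot be sent this way.
dump_error =
  begin
    Marshal.dump(proc { 1 })
    nil
  rescue TypeError => e
    e
  end

puts same_value
puts same_object
puts dump_error.class
```

Because the receiver gets a copy, changes the child makes to an object after dumping it are not visible to the parent.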



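Answer 7:

If the child only needs to run a small chunk of Ruby code, you can use a thread instead of a separate process: threads share memory with the code that started them, whereas a forked process works on its own copy. Something like the following will work: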

str = 'from parent'

Thread.new do
  str = 'from child'
end

sleep(1)

puts str    # outputs "from child"

Concurrency can be pretty tricky, though, and accessing shared memory this way is a big part of the reason: any time you've got a variable that another thread might change out from under you, you should be very wary. Alternatively, you can use a pipe, which is more cumbersome but probably safer for anything but the most trivial code, and can also be used to run an arbitrary command. Here's an example, straight out of the rdoc for IO.popen:

f = IO.popen("uname")
p f.readlines     # prints ["Darwin\n"], at least on my box  :-)
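
For the thread variant, a sketch that avoids both the sleep and the shared variable by using `Thread#value`, which waits for the thread to finish and returns the value of its block:

```ruby
worker = Thread.new do
  "from child"        # the block's last expression becomes the thread's value
end

str = worker.value    # joins the thread, then returns the block's result
puts str              # prints "from child"
```

This only applies to threads; a forked process has no equivalent, which is why the pipe-based approaches are needed there.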