Given the following two pieces of code:
def hello(z)
  "hello".gsub(/(o)/, &z)
end
z = proc {|m| p $1}
hello(z)
# prints: nil
def hello
  z = proc {|m| p $1}
  "hello".gsub(/(o)/, &z)
end
hello
# prints: "o"
Why are the outputs of these two pieces of code different? Is there a way to pass a block to gsub from outside of the method definition so that the variables $1, $2 would be evaluated in the same way as if the block was given inside the method definition?
Why is the output different?
A proc in Ruby has lexical scope. This means that a variable that is not defined inside the proc is resolved in the context where the proc was defined, not where it is called. This explains the behavior of your code.
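To illustrate the lexical-scope point with an ordinary local variable, here is a minimal sketch. The names call_it and closure_demo are made up for this example and do not come from the question.

```ruby
# Hypothetical helper: it has its own local x, then calls the proc.
def call_it(pr)
  x = "inside call_it"   # not visible to the proc
  pr.call
end

def closure_demo
  x = "where the proc was defined"
  pr = proc { x }        # captures x from closure_demo's scope
  call_it(pr)
end

p closure_demo
# prints: "where the proc was defined"
```

The proc keeps looking at the x of the scope it was created in, no matter which method ends up calling it.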
You can see that the block is defined before the regexp is matched, and this can cause confusion. The problem involves a magic Ruby variable, which works quite differently from other variables. Citing @JörgWMittag:
It's rather simple, really: the reason why $SAFE doesn't behave like you would expect from a global variable is because it isn't a global variable. It's a magic unicorn thingamajiggy.
There are quite a few of those magic unicorn thingamajiggies in Ruby, and they are unfortunately not very well documented (not at all documented, in fact), as the developers of the alternative Ruby implementations found out the hard way. These thingamajiggies all behave differently and (seemingly) inconsistently, and pretty much the only two things they have in common is that they look like global variables but don't behave like them.
Some have local scope. Some have thread-local scope. Some magically change without anyone ever assigning to them. Some have magic meaning for the interpreter and change how the language behaves. Some have other weird semantics attached to them.
If you really want to find out exactly how the $1 and $2 variables work, I assume the only "documentation" you will find is RubySpec, a spec for Ruby written the hard way by the Rubinius folks. Happy hacking, but be prepared for the pain.
Is there a way to pass a block to gsub from another context with the $1, $2 variables set up the right way?
You can achieve what you want with the following modification (but I bet you already know that):
require 'pp'
def hello(z)
  #z = proc {|m| pp $1}
  "hello".gsub(/(o)/, &z)
end
z = proc {|m| pp m}
hello(z)
I'm not aware of a way to change the scope of a proc on the fly. But would you really want to do this?
Things like $1 and $2 act like local variables, despite the leading $. You can try the code below to prove this:
def foo
  /(hell)o/ =~ 'hello'
  $1
end
def bar
  $1
end
foo #=> "hell"
bar #=> nil
Your problem is that the proc z is defined outside the method hello, so z accesses the $1 in the context of main, but gsub sets the $1 in the context of the method hello.
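A small sketch that makes this visible, assuming the same hello method as in the question. The names with_prior_match and without_prior_match are invented for the example. If a match is performed in the scope where the proc is defined, the proc sees that match, not the one gsub performs inside hello:

```ruby
def hello(z)
  "hello".gsub(/(o)/, &z)
end

# The proc is defined in a scope that has already matched /(hell)/.
def with_prior_match
  seen = []
  z = proc { |m| seen << $1; m }   # $1 refers to this method's match data
  /(hell)/ =~ "hello"              # sets $~ in this method, not in hello
  hello(z)
  seen
end

# Same thing, but no match has happened where the proc is defined.
def without_prior_match
  seen = []
  z = proc { |m| seen << $1; m }
  hello(z)
  seen
end

p with_prior_match     # prints: ["hell"]
p without_prior_match  # prints: [nil]
```

In both cases gsub did match the o inside hello, but the proc never looks at hello's match data.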
The two versions are different because the $1 variable is thread-local and method-local. In the first example, the $1 that the block reads belongs to the scope outside the hello method, where the proc was defined. In the second example, $1 exists inside the hello method.
There is no way to pass $1 in a block to gsub from outside of the method definition.
Note that gsub passes the match string into the block, so z = proc { |m| pp m } will only work as long as the whole match is exactly the text you want. As soon as your regular expression matches anything other than the group you want, you're out of luck.
For example, "hello".gsub(/l(o)/) { |m| m } => hello, because the whole match string was passed to the block.
Whereas "hello".gsub(/l(o)/) { |m| $1 } => helo, because the l that was matched is discarded by the block; all we are interested in is the captured o.
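These two results can be checked directly. The sketch below wraps them in a method (gsub_examples is a name chosen for this example) and also shows Regexp.last_match(1), which reads the same capture group as $1, for the case where you want both the whole match and the group:

```ruby
def gsub_examples
  {
    # Block returns the whole match "lo", so nothing changes.
    whole: "hello".gsub(/l(o)/) { |m| m },
    # Block returns only the captured "o", so the matched "l" is dropped.
    capture: "hello".gsub(/l(o)/) { |m| $1 },
    # Both at once: the whole match upcased, then the group in angle brackets.
    both: "hello".gsub(/l(o)/) { |m| "#{m.upcase}<#{Regexp.last_match(1)}>" }
  }
end

result = gsub_examples
p result[:whole]    # prints: "hello"
p result[:capture]  # prints: "helo"
p result[:both]     # prints: "helLO<o>"
```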
My solution is to match the regular expression, then pass the MatchData object into the block:
require 'pp'
def hello(z)
  string = "hello"
  regex = /(o)/
  m = string.match(regex)        # MatchData for the first match only
  string.gsub(regex, z.call(m))  # z runs once; its result replaces every match
end
z = proc { |m| pp m[1] }
pp hello(z)
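Since z.call(m) above runs once, before gsub, with the MatchData of the first match only, a possible variant is to keep a small block inside hello and hand each match's MatchData to the proc from there. This is only a sketch: the input string "hello world" and the upcase replacement are picked for illustration.

```ruby
def hello(z)
  # This block is written inside hello, so $~ here is the one gsub sets
  # for each match; it is passed on to the proc as a plain argument.
  "hello world".gsub(/(o)/) { z.call($~) }
end

# The proc takes a MatchData instead of relying on $1.
z = proc { |md| md[1].upcase }
p hello(z)
# prints: "hellO wOrld"
```

The proc no longer depends on where it was defined, because it never touches $1 or $~ itself.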
Here is a workaround (Ruby 2). The given Proc z behaves exactly as the block given to String#gsub.
def hello(z)
  "hello".match(/(o)/)  # Sets $1, $2, ...
  z.binding.tap do |b|
    b.local_variable_set(:_, $~)
    b.eval("$~=_")
  end
  "hello".gsub(/(o)/, &z)
end
z = proc {|m| p $1}
hello(z)
# prints: "o"
The background is explained in detail in this answer to the question "How to pass Regexp.last_match to a block in Ruby" (posted in 2018).