Ruby: Procs, Lambdas and Bindings

Procs

Yielding to blocks from methods is the simplest way to access closures. However, yield is a bit limited, because it can only invoke the block directly. It can’t send the block somewhere else to be invoked. To do that, you need a Proc object.
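As a rough sketch of that limitation, the example below stores a block so that it can be invoked later from a different method, something yield alone cannot do. The Button class and its method names are hypothetical, invented for illustration, and the &handler parameter syntax (which converts the passed block into a Proc object) is not covered in this article.

```ruby
# Hypothetical class for illustration: it stores a block now and runs it later.
class Button
  def on_click(&handler) # & converts the passed block into a Proc object
    @handler = handler   # a Proc can be stored; a bare block cannot
  end

  def click
    @handler.call        # invoked later, from a different method
  end
end

button = Button.new
button.on_click { 'clicked!' }
puts button.click #=> clicked!
```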

The Proc object wraps a block so that it can be treated as a first-class function. First-class functions can be:

  • Passed as an argument
  • Returned as a value
  • Assigned to another variable

The Proc object exposes a call method that invokes its block.
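Here is a minimal sketch of those three properties together. The method names make_greeter and greet_all are assumptions made up for this example.

```ruby
# Returned as a value: the method builds a proc and hands it back.
def make_greeter(greeting)
  Proc.new { |name| "#{greeting}, #{name}!" }
end

# Passed as an argument: the method receives a proc and invokes it with call.
def greet_all(names, greeter)
  names.map { |name| greeter.call(name) }
end

hello = make_greeter('Hello') # assigned to a variable
same_proc = hello             # assigned to another variable

p greet_all(%w[Ann Bob], same_proc)
#=> ["Hello, Ann!", "Hello, Bob!"]
```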

To create a Proc object, you can use either of these two syntax variations:

my_proc = Proc.new do
  puts "Hello, I'm a block."
end

my_proc = proc { puts "Hello, I'm a block." }        

Convention is to use proc (and {}) with single-line blocks, and Proc.new (and do end) with multi-line blocks.

Here’s a bit of code that will show that procs are closures and that we can pass them as an argument:

1  def test_proc(a_proc)
2    if defined?(message).nil?
3      puts "Nope, can't see message variable from here."
4    end
5
6    a_proc.call # Proc has access to message variable, even from inside method.
7  end
8
9  message = 'This is a message from your friendly neighborhood Spiderman.'
10 msg_writer = Proc.new { puts message } # message variable is in scope for this proc.
11
12 test_proc msg_writer
13 #=> Nope, can't see message variable from here.
14 #=> This is a message from your friendly neighborhood Spiderman.        

Line 6 invokes the Proc object's call method. This method invokes the proc’s block, which executes puts message. Although the test_proc method cannot directly see the message variable, the Proc object assigned to the a_proc parameter can. This is because blocks are closures, and the message variable is in the block’s lexical scope. The lexical scope is the scope as determined by where the code is written. Since we initialize message on line 9, the block on line 10 can see it. Since the block as written can see message, message is in the block’s lexical scope.

We make this distinction because we actually invoke the block on line 6, where the interior of the test_proc method’s scope is in force. This is not the same scope as the main area, so when we say that the block carries with it its lexical scope, we are saying that the block has the scope of where we create it rather than the scope of where we invoke it.
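One way to see the difference between the two scopes is to give the invoking method its own variable with the same name. This is only a sketch; the method name call_with_own_message is invented for the example.

```ruby
def call_with_own_message(a_proc)
  # A different variable, in a different scope, that happens to share the name.
  message = 'I belong to the method.'
  a_proc.call # the proc ignores the method's message
end

message = 'I belong to the scope where the proc was created.'
reader = Proc.new { message }

puts call_with_own_message(reader)
#=> I belong to the scope where the proc was created.
```

The proc reads the message from the scope in which it was written, even though a variable of the same name exists at the point of invocation.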

Lambdas

A lambda is a specific type of Proc object. To create a lambda, you can use either of two syntax variations:

my_proc = lambda do
  puts "Hello, I'm a block."
end

my_proc = -> { puts "Hello, I'm a block." }        

Convention is to use -> (and {}) with single-line blocks, and lambda (and do end) with multi-line blocks.

This code shows that lambdas are a specific type of Proc object:

my_proc = Proc.new { puts "Hello, I'm a block." }
puts my_proc

my_proc = -> { puts "Hello, I'm a block." }
puts my_proc 

#=> #<Proc:0x000000010902f0a8 test.rb:1>
#=> #<Proc:0x000000010902eb80 test.rb:4 (lambda)>

Both the proc and the lambda are Proc object instances. The only difference is that in the lambda, the lambda flag is set. The Proc object exposes a lambda? method that returns true if the Proc instance is a lambda.
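A short sketch of the lambda? check, using throwaway variable names:

```ruby
plain_proc = Proc.new { :from_proc }
a_lambda = -> { :from_lambda }

puts plain_proc.class   #=> Proc
puts a_lambda.class     #=> Proc
puts plain_proc.lambda? #=> false
puts a_lambda.lambda?   #=> true
```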

Lambdas behave differently from regular Proc objects in two ways:

  1. A return inside a block wrapped in a proc returns from the context in which the proc was created. A return inside a block wrapped in a lambda returns only from the lambda itself, back to the point where the lambda was called.
  2. Procs don’t check whether the call method passes the right number of arguments to the block. Any missing arguments are assigned the value of nil, and any extra arguments are ignored. Lambdas raise an ArgumentError if call passes the wrong number of arguments.

Let's look at both of these differences in more detail.

Difference One

Here’s how the Proc object behaves when it encounters a return:

1  def test_proc(a_proc)
2    a_proc.call
3    puts 'Ok, proc is all done.'
4  end
5
6  message = 'This is a message from your friendly neighborhood Spiderman.'
7
8  msg_writer = Proc.new do
9    puts message
10   return
11 end
12
13 test_proc(msg_writer)
14 puts 'Goodbye!'
15
16 #=> This is a message from your friendly neighborhood Spiderman.         

Because the proc was created at the top level of the program, the return on line 10 returns from the top level, which ends the entire program. As a result, lines 3 and 14 never run.

Now, let’s look at how the same code behaves when written as a lambda:

1  def test_proc(a_proc)
2    a_proc.call
3    puts 'Ok, proc is all done.'
4  end
5
6  message = 'This is a message from your friendly neighborhood Spiderman.'
7
8  msg_writer = lambda do
9    puts message
10   return
11 end
12
13 test_proc(msg_writer)
14 puts 'Goodbye!'
15
16 #=> This is a message from your friendly neighborhood Spiderman.
17 #=> Ok, proc is all done.
18 #=> Goodbye!        

With a lambda, the return on line 10 returns only from the lambda itself. Control goes back to the test_proc method, just after the call on line 2, and execution continues from line 3. Therefore, lines 3 and 14 both run.
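The same difference can be observed inside a method, where the proc's return exits the method that created it. The sketch below uses invented method names. Its last part shows a related consequence that the examples above don't cover: if the method that created the proc has already finished, there is nothing left to return from, and Ruby raises a LocalJumpError.

```ruby
def return_from_proc
  a_proc = Proc.new { return :returned_by_proc }
  a_proc.call
  :end_of_method # never reached: the proc's return exits this method
end

def return_from_lambda
  a_lambda = lambda { return :returned_by_lambda }
  a_lambda.call
  :end_of_method # reached: the lambda's return only exits the lambda
end

p return_from_proc   #=> :returned_by_proc
p return_from_lambda #=> :end_of_method

# The method that created this proc has already returned by the time we call it.
def make_orphaned_proc
  Proc.new { return :too_late }
end

begin
  make_orphaned_proc.call
rescue LocalJumpError => e
  puts "LocalJumpError: #{e.message}"
end
```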

Difference Two

Consider this code:

msg_writer = Proc.new do |name, msg|
  p "Hello, my name is #{name}."
  p msg
end

msg_writer.call 'Blockhead'
msg_writer.call 'Blockhead', 'I am a block.', 'I hope you like blocks!'
#=>"Hello, my name is Blockhead."
#=>nil
#=>"Hello, my name is Blockhead."
#=>"I am a block."

msg_writer = lambda do |name, msg|
  p "Hello, my name is #{name}."
  p msg
end

msg_writer.call 'Blockhead'
#=>wrong number of arguments (given 1, expected 2) (ArgumentError)        

From this, you can see that Proc objects created with Proc::new assign nil to missing arguments and silently ignore extra ones, while Proc objects created with Kernel::lambda raise an ArgumentError when call passes the wrong number of arguments.
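Because the lambda's error is an ordinary exception, it can be rescued. The following sketch puts the two behaviors side by side; the variable names loose and strict are made up for the example.

```ruby
loose = Proc.new { |name, msg| [name, msg] }
strict = lambda { |name, msg| [name, msg] }

p loose.call('Blockhead')                #=> ["Blockhead", nil]
p loose.call('Blockhead', 'Hi', 'extra') #=> ["Blockhead", "Hi"]

begin
  strict.call('Blockhead')
rescue ArgumentError => e
  puts "Rescued: #{e.message}"
end
#=> Rescued: wrong number of arguments (given 1, expected 2)
```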

Bindings

Consider this code:

def test_proc(a_proc)
  a_proc.call
end

message = 'You can see me!'
msg_writer = Proc.new { puts message }
message = 'No, you really can see me!'

test_proc(msg_writer) #=> No, you really can see me!        

Notice that the closure msg_writer, even after it has been created, keeps track of the change to the enclosed variable message.

How? Well, in Ruby, everything is an object. Perhaps we can just keep track of a reference to the message object? Let’s see:

1  def test_proc(a_proc)
2    a_proc.call
3  end
4
5  message = 'You can see me!'
6  puts message.object_id
7
8  msg_writer = Proc.new { puts "Inside block: #{message.object_id}" }
9
10 message = 'No, you really can see me!'
11 puts message.object_id
12
13 test_proc(msg_writer)
14
15 #=> 60
16 #=> 80
17 #=> Inside block: 80

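Well, no, we can’t. In Ruby, everything is an object, but reassigning a variable doesn’t change the object it referred to; it makes the variable refer to a different object. Here, the string literal on line 10 creates a new String, and message now refers to it. That’s why lines 6 and 11 print different object ids. If the proc had simply kept a reference to the original object, it would print the first id. Instead, it prints the second, so the proc must be tracking the variable itself rather than the object.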
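To make the distinction between changing an object and reassigning a variable concrete, here is a small sketch. It uses String.new so the string is mutable, and equal? to compare object identity; the variable original is introduced only for illustration.

```ruby
message = String.new('You can see me!')
original = message              # a second reference to the same String

message << ' Still me.'         # mutation: same object, new contents
puts message.equal?(original)   #=> true

message = 'No, you really can see me!' # reassignment: a different object
puts message.equal?(original)   #=> false
puts original                   #=> You can see me! Still me.
```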

So, there has to be another mechanism to create closures. That mechanism is a binding.

Binding Internals

At a low level, a binding is an object that wraps a stack frame.

A frame is a C struct that contains the state of a context. Each time a program invokes a method, Ruby pushes a frame on the call stack, often called just the stack. When the method execution is complete, Ruby pops its frame off of the call stack.

For example, in the last code example, we have the state that is visible to test_proc, and we have the state that is visible to main. These two sets of state are in two different frames, both of which reside on the call stack.

We can't use this mechanism to implement closures. One frame on the stack can’t see another one, and the closure’s state has to persist beyond the context of the frame in which it’s created. What will work is storing the frame's state in an object on the heap. An object has a life of its own, so it will persist even when the lifetime of the context that spawned the frame ends.

The object in which we store a closure's state is a binding. This is why all blocks have a Binding object associated with them.
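The sketch below illustrates state that outlives the method call that created it. The make_counter method is hypothetical; it returns two lambdas that share the same captured variable.

```ruby
def make_counter
  count = 0 # local to this invocation of make_counter
  increment = -> { count += 1 }
  read = -> { count }
  [increment, read]
end

# make_counter has returned by now, yet count lives on inside the closures.
increment, read = make_counter
increment.call
increment.call
puts read.call #=> 2
```

Both lambdas see the same count, which suggests they share one stored context rather than each holding a private copy.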

The Binding Class

The Binding class defines the binding. Whenever a block is created, an instance of the Binding class (called, helpfully, binding) is created to go along with it. This binding object keeps track of any changes to the state of the closure. As the doc says, “Objects of class Binding encapsulate the execution context at some particular place in the code and retain this context for future use.” When there is a request for any value contained in a block’s closure, the request is passed to the binding object instance, which locates and accesses the value.

We can use the binding object to get a bit of insight into how binding works.

x = 'You can see me!'
a_proc = Proc.new { puts x }

puts "Local variables: #{a_proc.binding.local_variables}"
puts "Value of x: #{a_proc.binding.local_variable_get(:x)}"
puts

x = 'No, you really can see me!'
puts "Changed value of x: #{a_proc.binding.local_variable_get(:x)}"

#=> Local variables: [:x, :a_proc]
#=> Value of x: You can see me!
#=> 
#=> Changed value of x: No, you really can see me!        

We can see from this code that the Binding class exposes methods (we're using local_variables and local_variable_get) that access the local variables in the captured scope. The binding object stores these variables as key/value pairs. The keys are symbols with the same names as the variables. Note that :a_proc appears in the list because a_proc is itself a local variable in the same scope, not because it is self. The binding keeps its reference to self separately, and you can read it with the receiver method.
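The binding can also be written to. The sketch below, which uses a hypothetical make_reader method, changes a captured variable through local_variable_set and then reads it back by calling the proc.

```ruby
def make_reader
  secret = 'original value'
  Proc.new { secret }
end

reader = make_reader
puts reader.call #=> original value

# Change the captured variable from outside, through the proc's binding.
reader.binding.local_variable_set(:secret, 'changed through the binding')
puts reader.call #=> changed through the binding

puts reader.binding.local_variable_defined?(:secret) #=> true

# The binding also remembers self from the place where the proc was created.
# When this runs as a top-level script, that object is main.
p reader.binding.receiver
```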

Related Articles

This article is one of a series of four. Here are the other three:

Ruby: Blocks

Ruby: Scope and Closures

Ruby: Block Parameters and Return Values
