How are values actually retrieved from an Enumerator object in Ruby?
I am wondering how values are retrieved from an Enumerator object. In the following code snippet, I expected the first call to enum.next to raise an exception, since all values have already been retrieved from enum by the call to enum.to_a.
enum = Enumerator.new do |yielder|
  yielder.yield 1
  yielder.yield 2
  yielder.yield 3
end
p enum.to_a # => [1, 2, 3]
puts enum.next # Expected StopIteration here
puts enum.next
puts enum.next
puts enum.next # => StopIteration exception raised
What is the difference between calling next and calling an iterator-style method such as to_a on an Enumerator instance?
Short answer: to_a always iterates over all elements and does not advance the enumerator's position. This is why Enumerator#next starts from the first element even if you've already called to_a. Calling to_a does not change the enumerator object.
Here are the details:
Terms: internal and external iteration
When discussing iterators in Ruby, two terms come up:
- internal iteration (also called implicit iteration)
- external iteration
In your question, enum.to_a is an example of using enum for internal iteration, and enum.next is an example of external iteration.
External iteration provides more control but is more low-level. Internal iteration is often more elegant. The difference is that external iteration makes the state (the current position) explicit, while internal iteration is implicitly applied to all elements.
Internal iteration: to_a
to_a calls Enumerator#each, which runs the block that this Enumerator was created with.
This is the critical point. Because to_a does not touch the internal state (the position) of the enumerator it is called on, it does not interfere with calls to next (external iteration).
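As a rough illustration of the point above, the sketch below counts how often the enumerator's block is started. The `runs` counter and the variable names are only there for the demonstration and are not part of the original question.

```ruby
# Hypothetical counter: incremented each time the block starts running.
runs = 0

enum = Enumerator.new do |yielder|
  runs += 1
  yielder.yield 1
  yielder.yield 2
  yielder.yield 3
end

first  = enum.to_a                 # internal iteration: runs the whole block
second = enum.map { |x| x * 10 }   # internal iteration again: a fresh run
runs_after_internal = runs

# External iteration has not been touched, so it starts at the first element.
first_from_next = enum.next

p first                # => [1, 2, 3]
p second               # => [10, 20, 30]
p runs_after_internal  # => 2
p first_from_next      # => 1
```

Each internal call walks the block from the start on its own, which is why none of them leaves a position behind for next to pick up.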
External iteration: next
When you create an Enumerator object, its state is initialized to point to the first element. Calling next changes the internal state by advancing the position. After all elements have been consumed, next raises a StopIteration exception.
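A minimal sketch of external iteration, reusing the enumerator from the question. The `values` array and the `exhausted` flag are names introduced only for this example.

```ruby
enum = Enumerator.new do |yielder|
  yielder.yield 1
  yielder.yield 2
  yielder.yield 3
end

# Each call to next returns the current element and advances the position.
values = []
values << enum.next
values << enum.next
values << enum.next

# A fourth call finds nothing left and raises StopIteration.
exhausted = false
begin
  enum.next
rescue StopIteration
  exhausted = true
end

p values     # => [1, 2, 3]
p exhausted  # => true
```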
Note that the state is only meaningful if you are using an enumerator object for external iteration. This explains why you can safely call to_a on an enumerator that has already consumed all its elements and it will still return a list of all elements. None of the internal iteration operations (e.g., each, to_a, map) interfere with external iteration.
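The reverse direction can be checked the same way. In this sketch (variable names are illustrative), the enumerator is first consumed with next, and to_a is called afterwards.

```ruby
enum = Enumerator.new do |yielder|
  yielder.yield 1
  yielder.yield 2
  yielder.yield 3
end

# Consume every element through external iteration.
3.times { enum.next }

# Internal iteration ignores the external position and returns everything.
all = enum.to_a

# The external position is unchanged by to_a, so next is still at the end.
still_exhausted =
  begin
    enum.next
    false
  rescue StopIteration
    true
  end

p all              # => [1, 2, 3]
p still_exhausted  # => true
```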
Implementation in Rubinius
I have looked at the source code of Rubinius to see how it is implemented there. While this is not a language specification, it should be relatively close to the truth. Entry points:
- Enumerable#to_a (arg will be empty)
- Enumerator#next
- Enumerator#each
Note that Enumerator includes Enumerable as a mixin.
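The mixin relationship can be inspected from Ruby itself. This is only a small reflection sketch and says nothing about how a particular implementation wires the methods internally; the variable names are made up for the example.

```ruby
# Enumerator mixes in Enumerable, which is where to_a comes from.
includes_enumerable = Enumerator.include?(Enumerable)

# to_a is part of Enumerable's interface; next is not.
enumerable_has_to_a = Enumerable.instance_methods.include?(:to_a)
enumerable_has_next = Enumerable.instance_methods.include?(:next)

# next is available on Enumerator instances.
enumerator_has_next = Enumerator.instance_methods.include?(:next)

p includes_enumerable  # => true
p enumerable_has_to_a  # => true
p enumerable_has_next  # => false
p enumerator_has_next  # => true
```

This matches the split described above: to_a belongs to the internal iteration interface shared by all Enumerable classes, while next is specific to Enumerator.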
Calling #next moves the internal position forward, whereas #to_a does not consider the internal position at all. Try calling next once, then to_a, then next again to experiment.
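The suggested experiment might look like this, using the enumerator from the question (the variable names a, b and c are arbitrary):

```ruby
enum = Enumerator.new do |yielder|
  yielder.yield 1
  yielder.yield 2
  yielder.yield 3
end

a = enum.next  # advances the position past the first element
b = enum.to_a  # full pass over all elements, position untouched
c = enum.next  # continues where the first next left off

p a  # => 1
p b  # => [1, 2, 3]
p c  # => 2
```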
https://ruby-doc.org/core-2.4.0/Enumerator.html#method-i-next
https://ruby-doc.org/core-2.4.0/Enumerable.html#method-i-to_a