Python iterators. Initialize state variables in __init__ or __iter__?
I'm new to Python and just looking at some examples of defining iterator objects.
The example I was looking at was:
class fibit:  # iterate through fibonacci sequence from 0,1...n<=max
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.a = 0
        self.b = 1
        return self

    def next(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib
But if I move the initialization of self.a and self.b from __iter__ to __init__, it seems (in my simple understanding) to work exactly the same.
class fibit:  # iterate through fibonacci sequence from 0,1...n<=max
    def __init__(self, max):
        self.a = 0
        self.b = 1
        self.max = max

    def __iter__(self):
        return self

    def next(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib
So which is the recommended way to do this?
Thanks. :)
Initialization should be done in __init__. That's what it's there for.
Iterator objects in Python are canonically "one-time use": once you've iterated over an iterator, you aren't expected to be able to iterate over it again.
So it doesn't make sense to reinitialize the values if you try to iterate over the object again. To illustrate this, I've expanded your code a bit:
class fibit_iter:  # iterate through fibonacci sequence from 0,1...n<=max
    def __init__(self, max):
        self.max = max

    def __iter__(self):
        self.a = 0
        self.b = 1
        return self

    def next(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib


class fibit_init:  # iterate through fibonacci sequence from 0,1...n<=max
    def __init__(self, max):
        self.a = 0
        self.b = 1
        self.max = max

    def __iter__(self):
        return self

    def next(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib
iter_iter = fibit_iter(10)
iter_init = fibit_init(10)

print "iter_iter"
for item in iter_iter:
    print item
    break
for item in iter_iter:
    print item
    break

print "iter_init"
for item in iter_init:
    print item
    break
for item in iter_init:
    print item
    break
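Essentially, I create one object from your __init__ version and one from your __iter__ version, then partially iterate over each of them twice. Every for loop calls __iter__ on the object, so the __iter__ version resets its state and starts over, while the __init__ version carries on from where it left off. Notice how you get different results: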
iter_iter
0
0
iter_init
0
1
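The two results above come from what a for loop does behind the scenes: it calls iter() on the object once, then calls next() on the result until StopIteration is raised. Here is a rough sketch of that, written for Python 3, where the method is spelled __next__ rather than next. The names FibInit and manual_for are made up for this illustration.

```python
class FibInit:
    """Fibonacci iterator with its state set up in __init__ (Python 3 spelling)."""

    def __init__(self, max):
        self.a = 0
        self.b = 1
        self.max = max

    def __iter__(self):
        # An iterator returns itself; no state is touched here.
        return self

    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib


def manual_for(obj, limit):
    """Roughly what `for item in obj` does, stopping after `limit` items."""
    it = iter(obj)                  # calls obj.__iter__()
    seen = []
    while len(seen) < limit:
        try:
            seen.append(next(it))   # calls it.__next__()
        except StopIteration:
            break
    return seen


fib = FibInit(10)
first = manual_for(fib, 3)
second = manual_for(fib, 3)
print(first)    # [0, 1, 1]
print(second)   # [2, 3, 5], the second pass continues where the first stopped
```

If __iter__ reset self.a and self.b instead, the second call would start again from 0, which is the behavior seen in the iter_iter output.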
__init__ is used when initializing an instance of a class. It sets the instance's attributes.
__iter__ is used to define the behavior of a class that is iterable (it returns an iterator).
@mgilson - __iter__ usually returns an iterator, which differs from a list (its type() is actually different) in that iterating over it destructively consumes its values.
Try calling the __iter__() method of an xrange instance and looping over the result multiple times.
>>> foo = xrange(5)
>>> bar = foo.__iter__()
>>> bar.next()
0
>>> bar.next()
1
>>> list(bar)
[2, 3, 4]
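For anyone on Python 3, here is a hedged sketch of the same experiment. It assumes range in place of xrange and the built-in next() in place of the .next() method, and the variable names are only illustrative. It also shows the "loop multiple times" part: the iterator is used up, while the range object itself can be iterated afresh.

```python
numbers = range(5)          # Python 3's range plays the role of xrange
it = iter(numbers)          # same as numbers.__iter__()

first_two = [next(it), next(it)]
rest = list(it)             # consumes whatever is left in the iterator
again = list(it)            # the iterator is exhausted, so nothing comes out

print(first_two)            # [0, 1]
print(rest)                 # [2, 3, 4]
print(again)                # []
print(list(numbers))        # [0, 1, 2, 3, 4], a fresh iterator is created each time
```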
It really depends on what you want your iterator to do. If you have a good reason to iterate through your iterator multiple times, getting the same result each time, then initialize it in __iter__; however, this should be the exception, not the norm, since well-behaved iterators should keep raising StopIteration once exhausted rather than restarting the sequence. In fact, an iterator that returns more values after raising StopIteration is considered broken (although this is more a warning to be careful how you use it).
So, I repeat:
- for a standard iterator, put your initialization in __init__
- for custom / reusable behavior, put your initialization in __iter__
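To make the two options concrete, here is a minimal sketch in Python 3 spelling (__next__ instead of next). The class names OneShotFib and RestartingFib are hypothetical and not from the question. It runs each iterator to exhaustion twice.

```python
class OneShotFib:
    """State is set in __init__, so the iterator stays exhausted once it ends."""

    def __init__(self, max):
        self.a, self.b = 0, 1
        self.max = max

    def __iter__(self):
        return self

    def __next__(self):
        fib = self.a
        if fib > self.max:
            raise StopIteration
        self.a, self.b = self.b, self.a + self.b
        return fib


class RestartingFib(OneShotFib):
    """State is reset in __iter__, so each new loop starts from 0 again."""

    def __iter__(self):
        self.a, self.b = 0, 1
        return self


one_shot = OneShotFib(10)
one_shot_runs = [list(one_shot), list(one_shot)]

restarting = RestartingFib(10)
restarting_runs = [list(restarting), list(restarting)]

print(one_shot_runs)     # [[0, 1, 1, 2, 3, 5, 8], []]
print(restarting_runs)   # [[0, 1, 1, 2, 3, 5, 8], [0, 1, 1, 2, 3, 5, 8]]
```

The second list(one_shot) is empty because the iterator keeps raising StopIteration, which is the standard behavior described above. The restarting version produces values again after it has already raised StopIteration, so use it only when that is really what you want.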