Iterators¶
The Iteration Protocol¶
The for
statement or list comprehensions are usually used to iterate through
sequences:
l = [1, 2, 3, 4]
for i in l:
print i
But it is also possible to manually iterate through a sequence. The code above is equivalent to:
l = [1, 2, 3, 4]
it = iter(l)
while True:
try:
print next(it)
except StopIteration:
break
The function iter()
returns an iterator object. Calling next()
on
this object returns a new element from the sequence. When the sequence is
exhausted, next()
raises a exceptions.StopIteration
exception.
iter(a)
calls a.__iter__()
and next(a)
calls a.next()
:
l = [1, 2, 3, 4]
it = l.__iter__()
while True:
try:
print it.next()
except StopIteration:
break
To force an iterator to return the rest of the elements you can use
list()
:
l = [1, 2, 3, 4]
it = iter(l)
assert next(it) == 1
assert list(it) == [2, 3, 4]
The itertools
module contains some useful tools for creating and
manipulating iterators.
Iterator Classes¶
Let’s write an iterable class which returns the first end - 1 natural numbers:
class SimpleRange:
def __init__(self, end):
self.end = end
def __iter__(self):
return SimpleRangeIterator(self)
Now we can define the iterator:
class SimpleRangeIterator:
def __init__(self, sr):
self.sr = sr
self.current = 0
def next(self):
if self.current < self.sr.end:
value = self.current
self.current += 1
return value
else:
raise StopIteration
If we make the iterator itself iterable, we can combine the two classes:
class SimpleRange:
def __init__(self, end):
self.current = 0
self.end = end
def __iter__(self):
return self
def next(self):
if self.current < self.end:
value = self.current
self.current += 1
return value
else:
raise StopIteration
for i in SimpleRange(3):
print i
assert list(SimpleRange(10)) == range(10)
The __iter__() method must return self to make the iterator iterable.
Note that range()
returns a list, not an iterator.
Exercise¶
Write an infinite iterator (never raises exceptions.StopIteration
) for
Fibonacci numbers.
Unit test:
import itertools
def test_fibonacci():
assert list(itertools.islice(Fibonacci(), 13)) == \
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
test_fibonacci()
Generator Functions¶
There’s a more convenient way of writing iterators:
def gen_colors():
print 'starting gen_colors'
yield 'red'
yield 'green'
yield 'blue'
print 'gen_colors finished'
for color in gen_colors():
print color
Generator functions use yield
instead of return
. There are two
important differences between the two keywords:
yield
can be used multiple timesyield
suspends the execution of the function, allowing it to be resumed when the next element is requested.
Generator functions do not start running when they are called. Instead they return a generator object, which implements the iteration protocol. The body of the function starts executing when the first value is requested via the next() method:
g = gen_colors()
print 'got generator object', g
print next(g)
print next(g)
Exercise¶
Write a generator function for Fibonacci numbers.
Unit test:
import itertools
def test_fibonacci():
assert list(itertools.islice(fibonacci(), 13)) == \
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
test_fibonacci()
Generator Expressions¶
Generator expressions are like list comprehensions except they are wrapped in
()
instead of []
and return a generator object instead of a list:
print [x*x for x in range(10)]
print (x*x for x in range(10))
assert list(x*x for x in range(10)) == [x*x for x in range(10)]
for i in (x*x for x in range(10)):
print i
Note that the parenthesis can be omitted if the generator expression is the only argument passed to a function.