Why Generators?
Before we learn about generators, let's try to understand why we needed them in the first place.
Here's the kind of code we have used so far to create a custom iterator in Python.
class Even:
    def __init__(self, max):
        self.n = 2
        self.max = max

    def __iter__(self):
        return self

    # customize the next() method to return only even numbers
    def __next__(self):
        if self.n <= self.max:
            result = self.n
            self.n += 2
            return result
        else:
            raise StopIteration

numbers = Even(10)
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
Output
2
4
6
8
In the above example, we have created a custom iterator to generate a sequence of even numbers.
We know that for an object to be an iterator, it should implement:
- the __iter__() method, which returns an iterator object
- the __next__() method, which returns the next element in the stream and raises the StopIteration exception when there are no values left to return
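As a rough illustration of these two requirements, the sketch below steps through a built-in list by hand, roughly the way a for loop does internally. The helper name collect is hypothetical and exists only for this example.

```python
def collect(iterable):
    """Consume an iterable by hand, roughly the way a for loop does."""
    iterator = iter(iterable)       # calls iterable.__iter__()
    items = []
    while True:
        try:
            item = next(iterator)   # calls iterator.__next__()
        except StopIteration:       # raised when no values are left
            break
        items.append(item)
    return items

print(collect([2, 4, 6, 8]))  # [2, 4, 6, 8]
```

Any object that supports these two calls, including the Even class above, can be consumed this way.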
As you can see, this process is both lengthy and counterintuitive. Here's where generators come to the rescue.
In Python, generators provide an easier way to create custom iterators.
Python Generators
A generator is a function that uses the yield keyword to produce the next item of an iterator. Let's now implement the same iterator from the previous example using a generator.
def even_generator():
    n = 0
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n

numbers = even_generator()
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
Output
2
4
6
8
In the above example, we have created a generator function with 4 yield statements.
The yield statement is similar to the return statement, with one major difference. The return statement terminates the function completely, while the yield statement pauses the function and saves all of its state, so that the next call resumes from where it left off.
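To make the pausing behavior visible, here is a small sketch. The generator two_steps and its log parameter are hypothetical names used only for illustration; log is a plain list that records which parts of the function body have run so far.

```python
def two_steps(log):
    # log is a plain list used to record which parts of the body have run
    log.append("before first yield")
    yield 1
    log.append("between yields")   # runs only when the generator is resumed
    yield 2

log = []
steps = two_steps(log)   # calling the function runs none of its body yet
print(log)               # []
print(next(steps))       # 1
print(log)               # ['before first yield']
print(next(steps))       # 2
print(log)               # ['before first yield', 'between yields']
```

Each call to next() runs the body only up to the next yield, and the local state is kept in between.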
Here's how the above code works:
- Python automatically implements the __iter__() and __next__() methods for the generator.
- When we call the generator function, it returns an iterator object.
- Then, we call next() on that object, which invokes its __next__() method, to retrieve elements from the iterator.
- During the first next() call, the first yield returns the value n = 0 + 2 = 2.
- Similarly, during the 2nd, 3rd, and 4th next() calls, the consecutive yield statements return 4, 6, and 8, respectively.
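The points above can be checked directly on a generator object. The following is a minimal sketch using a shortened, two-yield version of even_generator() to keep it brief.

```python
def even_generator():
    n = 0
    n += 2
    yield n
    n += 2
    yield n

numbers = even_generator()

print(type(numbers).__name__)     # generator
print(iter(numbers) is numbers)   # True: __iter__() returns the generator itself
print(numbers.__next__())         # 2, the same as calling next(numbers)
print(next(numbers))              # 4
```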
StopIteration Exception with Generators
At this point, all the yield statements have been executed. If we now make another call to next() on the generator, it will raise the StopIteration exception. For example,
def even_generator():
    n = 0
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n

numbers = even_generator()
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
Output
2
4
6
8
Traceback (most recent call last):
  File "<string>", line 22, in <module>
StopIteration
As you can see, we get the StopIteration exception. That's because the generator has 4 yield statements, but we call the next() function 5 times.
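If an exhausted generator should not crash the program, there are two common ways to deal with it. The sketch below reuses the four-yield even_generator() and shows both: passing a default value to next(), and catching the exception explicitly.

```python
def even_generator():
    n = 0
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n
    n += 2
    yield n

numbers = even_generator()
for _ in range(4):
    next(numbers)              # consume 2, 4, 6 and 8

# Option 1: give next() a default to return instead of raising
print(next(numbers, None))     # None

# Option 2: catch the exception explicitly
try:
    next(numbers)
except StopIteration:
    print("generator is exhausted")
```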
while Loop Inside Generator
In even_generator() above, we are repeating the same two lines again and again:
n += 2
yield n
Instead of this, we can simply use a while loop in Python. For example,
def even_generator(max):
    n = 2
    # yield n before incrementing it, so no value above max is ever produced
    while n <= max:
        yield n
        n += 2

numbers = even_generator(8)
print(next(numbers))
print(next(numbers))
print(next(numbers))
print(next(numbers))
Output
2
4
6
8
Note that we never defined the __iter__() method or the __next__() method, and we never raised a StopIteration exception. Generators handle all of this automatically, which makes our program much simpler and easier to understand.
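Because the generator ends on its own once the loop condition fails, it can be handed directly to a for loop or to built-ins such as list() and sum(). The sketch below assumes the intended behavior is to stop at max, so this variant yields n before incrementing it.

```python
def even_generator(max):
    n = 2
    # assumption: no value greater than max should be produced
    while n <= max:
        yield n
        n += 2

for number in even_generator(8):
    print(number)                  # 2, 4, 6, 8 on separate lines

print(list(even_generator(8)))     # [2, 4, 6, 8]
print(sum(even_generator(8)))      # 20
```

Each call to even_generator(8) creates a fresh generator, which is why the same sequence can be consumed three times here.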
Infinite Stream of Data with Generators
Python iterators and generators are generally used to handle a large stream of data, theoretically even an infinite stream of data. These large streams of data cannot be stored in memory at once.
To handle this, we can use generators that will only work with one item at a time.
Let's look at the Fibonacci series, where each element is the sum of the previous two elements:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, ...
We will build a generator to produce an infinite stream of Fibonacci numbers.
def generate_fibonacci():
    n1 = 0
    yield n1
    n2 = 1
    yield n2
    # generate an infinite fibonacci series
    while True:
        n1, n2 = n2, n1 + n2
        yield n2

seq = generate_fibonacci()
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
Output
0
1
1
2
3
Here, you can call next() on the generator as many times as you like to keep producing Fibonacci numbers.
If we had used a for loop and a list to store this infinite series, we would have run out of memory.
However, with generators, we deal with only one item at a time, so we can keep accessing items for as long as we want.
Also, notice that we are repeating the same next() call. So, we can replace the code
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
with
for num in range(5):
    print(next(seq))
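Another option, not used in this article, is itertools.islice from the standard library, which takes only the first few items of an infinite generator. The sketch below uses a more compact rewrite of generate_fibonacci() that should produce the same sequence.

```python
from itertools import islice

def generate_fibonacci():
    # compact variant: yield the current number, then advance the pair
    n1, n2 = 0, 1
    while True:
        yield n1
        n1, n2 = n2, n1 + n2

for num in islice(generate_fibonacci(), 5):
    print(num)                                  # 0, 1, 1, 2, 3 on separate lines

print(list(islice(generate_fibonacci(), 10)))   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

With islice, the loop iterates over the values themselves, so there is no unused counter variable and no explicit next() call.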
Infinite Stream of Odd Numbers using Generator
Let's see one more example. This time we will generate a stream of odd numbers using the generator.
def generate_odd():
    n = 1
    while True:
        yield n
        n += 2

odd_numbers = generate_odd()
for num in range(5):
    print(next(odd_numbers))
Output
1
3
5
7
9
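Instead of counting how many items to take, we may want to stop an infinite stream once a condition fails. One possible approach, shown here as a sketch, is itertools.takewhile from the standard library; the cutoff of 10 is an arbitrary value chosen for this example.

```python
from itertools import takewhile

def generate_odd():
    n = 1
    while True:
        yield n
        n += 2

# keep taking odd numbers while they are below 10 (arbitrary cutoff)
small_odds = takewhile(lambda n: n < 10, generate_odd())
print(list(small_odds))   # [1, 3, 5, 7, 9]
```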
Generator Expression
Similar to a list comprehension, we can use a generator expression to create generators easily.
Let's see an example.
# create a list using list comprehension
first_list = [numbers**2 for numbers in range(5)]
print(first_list)
# create a generator using generator expression
generator_num = (numbers**2 for numbers in range(5))
print(generator_num)
Output
[0, 1, 4, 9, 16]
<generator object <genexpr> at 0x7fa4ecc325f0>
From the syntax, you can see that the list comprehension uses square brackets, whereas the generator expression uses parentheses.
Also, when we print the list, we get the newly created list. However, when we print the generator, we get a generator object. This is because, unlike a list comprehension, a generator expression doesn't produce all of its items at once.
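One rough way to see this difference is to compare the sizes of the two objects with sys.getsizeof. This is only a sketch: the exact numbers depend on the Python version and platform, so it compares the sizes rather than printing them, and the range of 100,000 is an arbitrary choice.

```python
import sys

squares_list = [n**2 for n in range(100_000)]
squares_gen = (n**2 for n in range(100_000))

# getsizeof reports each object's own size, not the integers it refers to
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))   # True

# the generator computes each square only when it is asked for one
print(next(squares_gen))   # 0
print(next(squares_gen))   # 1
```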
If we want to access the items of the generator, we have to call the next() function on it. For example,
# create a list using list comprehension
first_list = [numbers**2 for numbers in range(5)]
print(first_list)
# create a generator using generator expression
generator_num = (numbers**2 for numbers in range(5))
# access item of generator
for num in range(5):
    print(next(generator_num))
Output
[0, 1, 4, 9, 16]
0
1
4
9
16
As you can see, we have successfully created a generator using a generator expression and accessed its elements one at a time.
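Generator expressions are often passed straight to a function that consumes an iterable, such as sum(). The short sketch below also shows that a generator can be consumed only once.

```python
squares = (n**2 for n in range(5))

print(sum(squares))    # 30, which is 0 + 1 + 4 + 9 + 16
print(list(squares))   # [], because the generator is now exhausted

# when a generator expression is the only argument, the extra parentheses are optional
print(sum(n**2 for n in range(5)))   # 30
```

To go through the values again, we have to create a new generator.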