A Python generator is a simple way of creating an iterator.
There is a lot of overhead in building an iterator in Python. We have to implement a class with __iter__() and __next__() methods, keep track of internal state, raise StopIteration when there are no more values to return, and so on.
An iterator in Python is an object that can be iterated upon: it returns its data one element at a time. Any object that can be used with a for ... in loop is called an iterable, and the loop obtains an iterator from it behind the scenes.
Python lists, tuples, dicts and sets are all examples of built-in iterables; calling iter() on one of them returns an iterator.
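A Python iterator object must implement two special methods, __iter__() and __next__(), collectively called the iterator protocol.
The __iter__() method returns the iterator object itself, and the __next__() method returns the next item. We use the next() function to manually iterate through all the items of an iterator. When we reach the end and there is no more data to be returned, it will raise StopIteration.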
# define a list
my_list = [4, 7, 0, 3]

# get an iterator using iter()
my_iter = iter(my_list)

# iterate through it using next()
# prints 4
print(next(my_iter))
# prints 7
print(next(my_iter))

# next(obj) is same as obj.__next__()
# prints 0
print(my_iter.__next__())
# prints 3
print(my_iter.__next__())

# This will raise StopIteration, no items left
next(my_iter)
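A for loop performs these same steps automatically. As a rough sketch (the helper name manual_for is hypothetical and exists only for this illustration), the function below does approximately what a for loop does: it calls iter() once, then calls next() repeatedly until StopIteration is raised.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # fetch the next element
        except StopIteration:
            break                  # no items left: end the loop quietly
        collected.append(item)
    return collected


print(manual_for([4, 7, 0, 3]))
```

The key point is that StopIteration is caught inside the loop, which is why a for loop ends cleanly instead of raising an error.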
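In the next example we will implement an iterator that gives us the next power of 2 in each iteration. The exponent starts at zero and goes up to a user-set number.
class PowTwo:
    """Class to implement an iterator of powers of two"""

    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        # reset the exponent and return the iterator object itself
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max:
            result = 2 ** self.n
            self.n += 1
            return result
        # exponent has passed the limit: signal the end of iteration
        raise StopIteration

obj = PowTwo(4)
it = iter(obj)   # named "it" so the built-in iter() is not shadowed
print(next(it))  # prints 1
print(next(it))  # prints 2
print(next(it))  # prints 4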
All the overhead we mentioned above is automatically handled by generators in Python. Simply speaking, a generator is a function that returns an object (an iterator) which we can iterate over, one value at a time.
It is fairly simple to create a generator in Python. It is as easy as defining a normal function with a yield statement instead of a return statement. If a function contains at least one yield statement, it becomes a generator function.
Both yield and return will return some value from a function. The difference is that, while a return statement terminates a function entirely, a yield statement pauses the function, saving all its state, and later continues from there on successive calls.
Here we have a generator function named my_gen() with several yield statements.
# Generator function
def my_gen():
    n = 1
    # Generator function contains yield statements
    yield n

    n += 1
    yield n

    n += 1
    print('print 3 - last')
    yield n

# create iterator object
obj = my_gen()

# iterate through the items using next() function
print(next(obj))
print(next(obj))
print(next(obj))
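To make the pause-and-resume behaviour concrete, here is a minimal sketch using a hypothetical generator named count_to_three (not part of the example above). It checks two things: a generator object is its own iterator, and once the function body runs past its last yield, the next call raises StopIteration just as a hand-written iterator would.

```python
def count_to_three():
    """Hypothetical generator used only for this illustration."""
    yield 1
    yield 2
    yield 3


gen = count_to_three()

# A generator object is its own iterator.
same_object = iter(gen) is gen

# Each next() call resumes the function right after the previous yield.
values = [next(gen), next(gen), next(gen)]

# A fourth call finds no more yield statements, so StopIteration is raised.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(same_object, values, exhausted)
```

A generator that has been exhausted stays exhausted; to iterate again, call the generator function again to get a fresh object.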
Normally, generator functions are implemented with a loop having a suitable terminating condition.
def rev_str(my_str):
    length = len(my_str)
    for i in range(length-1, -1, -1):
        yield my_str[i]
for char in rev_str("hello"):
    print(char)
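For comparison with the PowTwo iterator class shown earlier, the same sequence can be sketched as a generator function. The name pow_two_gen and its parameter max_exponent are assumptions made for this illustration; the loop condition mirrors the `self.n <= self.max` check in the class.

```python
def pow_two_gen(max_exponent=0):
    """Yield 2**0, 2**1, ..., 2**max_exponent."""
    n = 0
    while n <= max_exponent:
        yield 2 ** n   # pause here and hand back the current power
        n += 1         # resume from this line on the next request


print(list(pow_two_gen(4)))
```

There is no __iter__(), no __next__() and no explicit StopIteration: the local variable n holds the state, and the generator ends when the loop does.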
- Easy to Implement
Since generators keep track of details automatically, they can be implemented in a clear and concise way as compared to their iterator class counterpart.
- Memory Efficient
A normal function that returns a sequence will create the entire sequence in memory before returning the result. This is overkill if the number of items in the sequence is very large. A generator implementation of such a sequence is memory friendly and is preferred, since it produces only one item at a time.
- Represent Infinite Stream
Generators are an excellent medium to represent an infinite stream of data. An infinite stream cannot be stored in memory, but since a generator produces only one item at a time, it can represent such a stream.
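The last two points can be illustrated with a short sketch. The generator names all_even and squares are hypothetical, chosen for this example. The first part takes a few values from an endless generator using itertools.islice; the second compares the size of a fully built list with the size of a generator object using sys.getsizeof, which measures only the container itself and not the integers it refers to.

```python
import sys
from itertools import islice


def all_even():
    """Yield 0, 2, 4, ... without end."""
    n = 0
    while True:
        yield n
        n += 2


def squares(limit):
    """Yield the squares of 0 .. limit-1, one at a time."""
    for x in range(limit):
        yield x * x


# Take only the first five values; the rest are never computed.
first_five = list(islice(all_even(), 5))
print(first_five)

# The list holds 100,000 references at once; the generator holds
# only its paused state, regardless of how many items it will yield.
squares_list = list(squares(100_000))
squares_gen = squares(100_000)
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))
```

The exact byte counts vary between Python versions and platforms, so the sketch compares the two sizes rather than printing fixed numbers.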