Iterator Pattern Basics in Python

Last Updated: 18th December 2018

I discovered the value of iterators for the first time many years ago when learning C++. Once grasping the concept, programming in C++ because a lot more enjoyable, and code became much more easy to understand. In this post, I will discuss the value of, and implementation of iterators using the Python programming language to decouple algorithms from containers.

The iterator pattern exists to provide a standard interface for a class that contains a collection of items, in order to easily iterate over the sequence in a way that does not expose the representation of the items being iterated over. This means that if you ever need to change the representation in the future for optimisation purposes for example, you won’t need to change the code that performs the iteration. In other words, it allows you to program in a functional style by stating that you want to iterate over the container rather than describing how to do it.

The iterator pattern is a very useful concept and technique to be aware of and I’m very pleased that it is supported in Python. It is independent of the Python programming language, but helps to write much more readable and maintainable code.

Consider the following class that defines a data structure with a very simple API that purely allows a client to store items in no particular order:

class Bag:
    def __init__(self):
        self.contents = []

    def add(self, item):
        self.contents.append(item)

    def is_empty(self):
        return len(self.contents) == 0

    def size(self):
        return len(self.contents)

Now if I want to loop over the contents of an instance of this Bag class, then the naive worst case scenario is this:

myBag = Bag()

# code to add items to myBag omitted

for i in range(0, len(myBag.contents)):
    print(myBag.contents[i])

This requires the developer to know the details of the implementation of the Bag class. They need to know about the existence of the contents variable, and that it is an array. If I want to switch the array in the Bag class with a linked list for example, I will then have to change code in multiple places, in the class definition and everywhere else that performs the loop like above.

This also violates the Liskov Substitution Principle which is nice to have if you ever want to substitute the instance of the Bag class with any other container object without any unexpected side effects.

I therefore want everyone to do it this instead:

for item in myBag:
    print(item)

I also want people to do:

if item in myBag:
    do_something()

According to the Python documentation we need to allow our container class to return an iterator object. This is done by implementing the the __iter__() method. An iterator object is decoupled from the container and supports the iterator protocol. It has the responsibility of performing the iteration.

In order to support the iterator protocol, the iterator class ArrayForwardIterator in the example below needs to support the __iter__() and __next__() methods. The first returns the iterator object itself, and the second returns the next item in the container. If no more items are left, the StopIteration exception is raised.

To put this into practice:

class Bag:
    # existing code omitted

    def __iter__(self):
        return ArrayForwardIterator(self.contents)

class ArrayForwardIterator:
    def __init__(self, arr):
        self.arr = arr
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == len(self.arr):
            raise StopIteration()

        item = self.arr[self.index]
        self.index += 1
        return item

This is the minimum necessary to do what we want to do. It will be nicer to be able to pass the iterator object as an argument to the Bag constructor as per the dependency inversion principle. This will make it much easier to create unit tests for the individual classes as well as add the flexibility to choose what kind of algorithm I would like to use for iteration at the point of instantiation of the Bag class.

View source code.