Comprehensions and Generator Expressions

A comprehension is a special bit of syntax that can appear almost anywhere an expression list is allowed. You can recognize it by the keyword for appearing inside an expression.

There are several types of comprehensions, and we’ll cover each one as we cover the associated type. For now, we will only cover the generator expression, but by doing so, we’ll learn the full syntax for the comprehension, which carries over unchanged to the other types.

Simple Syntax

The simple syntax of the comprehension is as follows:

<expr> for <target list> in <expr list>

This mirrors the syntax of the for statement, except that it doesn’t have a full-blown suite. Instead, the entire body of the for loop is the expression on the left hand side.
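
To make the parallel concrete, here is a minimal sketch that sets a comprehension next to the for statement it mirrors. The comprehension is wrapped in parentheses so that it can stand on its own, and the name doubled_loop is made up for illustration.

```python
# Comprehension form: <expr> for <target list> in <expr list>
doubled_expr = (i * 2 for i in range(5))

# Hypothetical equivalent written as a for statement.
# The <expr> on the left becomes the entire body of the loop.
def doubled_loop():
    for i in range(5):
        yield i * 2

expr_values = list(doubled_expr)
loop_values = list(doubled_loop())
print(expr_values)  # [0, 2, 4, 6, 8]
print(loop_values)  # [0, 2, 4, 6, 8]
```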

What, specifically, this means depends on the context of the comprehension. I’m going to show you two contexts: the generator expression and the function call.

Generator Expression

We have already covered generators. They are produced by functions with at least one yield statement or expression in them.

You can create generators on the fly if you surround the comprehension with parentheses:

>>> g = (i**2 for i in range(10))
>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16

Remember that generators are iterators, so anywhere you can use an iterable, you can use the generator expression above.
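
As a small illustration of that point, the sketch below hands the same generator expression first to a for statement and then to list(). The variable names are only for illustration. Notice that a generator can be consumed only once, so list() picks up where the loop stopped.

```python
g = (i**2 for i in range(10))

# A for statement accepts any iterable, including a generator.
seen = []
for value in g:
    if value > 20:
        break  # 25 has already been taken from g at this point
    seen.append(value)

# The generator remembers its position, so only the remaining values are left.
rest = list(g)

print(seen)  # [0, 1, 4, 9, 16]
print(rest)  # [36, 49, 64, 81]
```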

Using a Comprehension in a Function Call

Many functions take an iterable as their only argument. In that case, you can write a comprehension directly inside the call’s parentheses, in place of the argument list.

min(i%5 for i in range(1,8,3))

Note that no other arguments are allowed when you use this bare form.

If you want to pass other arguments, wrap the comprehension in its own parentheses so that it becomes a generator expression:

sum((i**2 for i in range(10)), 8)
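
Here is a quick sketch of both call styles with their results. The compile() check is just one way to show, without stopping the example, that the bare form is rejected once a second argument is present; the variable names are illustrative.

```python
# Bare comprehension as the only argument: i takes the values 1, 4, 7.
smallest = min(i % 5 for i in range(1, 8, 3))

# Parenthesized generator expression plus a second argument (the start value).
total = sum((i**2 for i in range(10)), 8)

# Without the inner parentheses, Python refuses to compile the call.
try:
    compile("sum(i**2 for i in range(10), 8)", "<example>", "eval")
    bare_form_allowed = True
except SyntaxError:
    bare_form_allowed = False

print(smallest)           # 1
print(total)              # 293
print(bare_form_allowed)  # False
```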

Nesting for Blocks

You can chain for blocks in a comprehension. Keep in mind that if your comprehension gets too complicated, it will be hard to understand. It may be easier to write a full for statement.

<expr> for <target list> in <expr list> for <target list> in <expr list>

Typically, we use this to iterate across two indexes:

(i,j) for i in range(3) for j in range(3)

The later for blocks are nested inside the earlier ones, so the last for block varies fastest:

>>> g = ((i,j) for i in range(3) for j in range(3))
>>> next(g)
(0, 0)
>>> next(g)
(0, 1)
>>> next(g)
(0, 2)
>>> next(g)
(1, 0)
>>> next(g)
(1, 1)
>>> next(g)
(1, 2)
>>> next(g)
(2, 0)
>>> next(g)
(2, 1)
>>> next(g)
(2, 2)
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
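
The same nesting can be spelled out as a generator function. This is a minimal sketch; the function name pairs is made up for illustration. Because the later for block sits inside the earlier one, it can also refer to the earlier loop variable, as the second example shows.

```python
def pairs():
    # Hypothetical expansion of: (i, j) for i in range(3) for j in range(3)
    for i in range(3):
        for j in range(3):
            yield (i, j)

from_expr = list((i, j) for i in range(3) for j in range(3))
from_loop = list(pairs())
print(from_expr == from_loop)  # True

# The inner block may use the outer target: j only runs up to i.
below_diagonal = list((i, j) for i in range(3) for j in range(i))
print(below_diagonal)  # [(1, 0), (2, 0), (2, 1)]
```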

Nesting if Blocks

After the first for block, you can have one or more if blocks. These nest much like for blocks do.

<expr> for <target list> in <expr list> if <cond>

The expression is only evaluated and yielded for iterations where the condition is true. All other iterations are skipped.

>>> g = (i for i in range(10) if i%5==0)
>>> next(g)
0
>>> next(g)
5
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
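
To see how if blocks nest, here is a minimal sketch with two of them in a row. The second condition is only checked when the first one passes, which gives the same result as joining the two conditions with and. The variable names are illustrative.

```python
# Two nested if blocks: keep i only when it is divisible by 2 and by 3.
nested = list(i for i in range(30) if i % 2 == 0 if i % 3 == 0)

# The same filter written as a single condition.
combined = list(i for i in range(30) if i % 2 == 0 and i % 3 == 0)

print(nested)              # [0, 6, 12, 18, 24]
print(nested == combined)  # True
```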

Combining Multiple Blocks

If you visualize a comprehension as a set of nested statements, you can easily see what is happening when you meet a complicated one. To expand a comprehension:

  • Nest the if and for blocks according to their order, from left to right.

  • Each if block only lets the iteration continue when its condition is true. Otherwise, it skips ahead to the next iteration.

  • yield the <expr> in the center.

For example, let’s write out a complicated comprehension, followed by its expansion into nested statements:

<expr> for <target list 1> in <expr list 1> \
     if <cond 1> \
     for <target list 2> in <expr list 2> \
     for <target list 3> in <expr list 3> \
     if <cond 2>

for <target list 1> in <expr list 1>:
  if <cond 1>:
    for <target list 2> in <expr list 2>:
      for <target list 3> in <expr list 3>:
        if <cond 2>:
          yield <expr>
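
To check this expansion against real code, here is a sketch that fills in the placeholders with made-up ranges and conditions (they do not come from any particular program) and compares the comprehension with its expanded form. The function name expanded is hypothetical.

```python
# Comprehension with the same shape: for, if, for, for, if.
expr_result = list(
    (a, b, c)
    for a in range(4)
    if a % 2 == 1
    for b in range(2)
    for c in range(3)
    if b + c == a
)

def expanded():
    # The blocks are nested in the order they appear above.
    for a in range(4):
        if a % 2 == 1:
            for b in range(2):
                for c in range(3):
                    if b + c == a:
                        yield (a, b, c)

loop_result = list(expanded())
print(expr_result)                 # [(1, 0, 1), (1, 1, 0), (3, 1, 2)]
print(expr_result == loop_result)  # True
```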

Of course, if you’re using more than a few blocks in your comprehension, you’re probably better off rewriting it as a full-blown for loop, for the sake of legibility.

Example: Matrix Indexes

In this example, we want to generate all of the (i,j) pairs of matrix coordinates for a 3x3 matrix.

(i,j) for i in range(3) for j in range(3)

This could be used to rewrite our matrix addition routine, but using a comprehension as the expression list of a for statement is a little weird. Compare the comprehension version of the loop header with the plain nested loops:

for i,j in ((i,j) for i in range(3) for j in range(3)):

for i in range(3):
    for j in range(3):
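
As a rough sketch of what that might look like, the code below adds two 3x3 matrices both ways. The matrices a and b and the function names are assumptions made up for this example; the matrix addition routine from earlier may look different.

```python
# Hypothetical 3x3 matrices stored as lists of rows.
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

def add_with_comprehension(a, b):
    result = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    # One loop header, fed by a generator expression of (i, j) pairs.
    for i, j in ((i, j) for i in range(3) for j in range(3)):
        result[i][j] = a[i][j] + b[i][j]
    return result

def add_with_nested_loops(a, b):
    result = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    # Two plain loop headers doing the same walk over the indexes.
    for i in range(3):
        for j in range(3):
            result[i][j] = a[i][j] + b[i][j]
    return result

print(add_with_comprehension(a, b))  # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
print(add_with_comprehension(a, b) == add_with_nested_loops(a, b))  # True
```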

Example: Simple Transformation

The most common way I use comprehensions is when I want to do something simple to every item in a sequence, which is the job of map(). Let’s compare these two equivalent expressions:

(x**2 for x in range(100))

map(lambda x: x**2, range(100))
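
A quick way to check the equivalence is to collect both into lists and compare them. This sketch assumes Python 3, where map() returns a lazy iterator much like the generator expression does; the variable names are illustrative.

```python
from_expr = list(x**2 for x in range(100))
from_map = list(map(lambda x: x**2, range(100)))

print(from_expr[:5])          # [0, 1, 4, 9, 16]
print(from_expr == from_map)  # True
```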

Example: Simple Filtering

We can also replace filter() with a comprehension:

(i for i in range(100) if i%7 == 1)

filter(lambda i: i%7 == 1, range(100))
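
The same kind of check works here. Again, this is a sketch that assumes Python 3, where filter() returns a lazy iterator; the variable names are illustrative.

```python
from_expr = list(i for i in range(100) if i % 7 == 1)
from_filter = list(filter(lambda i: i % 7 == 1, range(100)))

print(from_expr[:4])             # [1, 8, 15, 22]
print(from_expr == from_filter)  # True
```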

Analysis

Comprehensions are a neat little trick in Python. They are fairly powerful, and if you are familiar with SQL, you should be getting a SQL vibe from them.

However, I recommend using them sparingly, if at all. Generally, when you have occasion to use a comprehension, either the case is so simple that you can do without one, or it will grow so complex that you wouldn’t want one anyway.

Also note that comprehensions are a feature of Python that isn’t shared by many languages. People coming from a C/C++ or Java background will find them entirely new, and will have to read the Python documentation to understand what they do. So they have a bit of a “Not for Noobs” vibe.

Beginning Python programmers will be utterly confused by comprehensions and the like, so if you are working on software with a team of people who aren’t very good at Python yet, it’s best to avoid them for the sake of the team’s efficiency.

We will mention them as they arise in the Python syntax, but again, use them sparingly, if at all, and only for the simplest cases.