Flattening List Comprehensions in Python

List Comprehension

Python has list comprehensions, syntactic sugar for building lists from an expression.

>>> [2 * i for i in (2, 3, 5, 7, 11)]
[4, 6, 10, 14, 22]

This doesn't work so well when the comprehension expression is itself a list: you end up with a list of lists.

>>> def gen():
...     for l in [['a', 'b'], ['c'], ['d', 'e', 'f']]:
...         yield l
...
>>> [l for l in gen()]
[['a', 'b'], ['c'], ['d', 'e', 'f']]

This is ugly. Here's one way to build a flattened list, but it's less elegant than the comprehension.

>>> x = []
>>> for l in gen():
...     x.extend(l)
...
>>> x
['a', 'b', 'c', 'd', 'e', 'f']

It took me a while to find a readable one-liner, with a little help from Google. Call sum() on the outer list and seed it with an empty list, [], as the start value. Python then concatenates the inner lists with +, producing a flattened list.

>>> sum([l for l in gen()], [])
['a', 'b', 'c', 'd', 'e', 'f']
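
To make it clearer why seeding sum() with [] flattens the list, here is a rough sketch of what the call amounts to. The name flatten_by_addition is made up for illustration; it is not something Python provides.

```python
def gen():
    # Same generator as in the examples above.
    for l in [['a', 'b'], ['c'], ['d', 'e', 'f']]:
        yield l

def flatten_by_addition(lists):
    # Hypothetical helper: roughly what sum(lists, []) does.
    total = []                  # the start value
    for inner in lists:
        total = total + inner   # list + list builds a brand-new list each time
    return total

flat = flatten_by_addition(gen())
same = sum(gen(), [])           # sum() accepts the generator directly
```

Because every + copies the list built so far, the work grows roughly quadratically with the number of elements, which only matters for long inputs.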

Alternatively, you can use itertools.chain().

>>> import itertools
>>> list(itertools.chain(*gen()))
['a', 'b', 'c', 'd', 'e', 'f']

That is more efficient on long inputs, since chain() walks each inner list once while sum() copies the accumulated list at every step, though I find the sum() version a little more readable.
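
A related variant, sketched here on the same gen() as above: itertools.chain.from_iterable() takes the outer iterable itself, so nothing has to be unpacked with *. The variable names are only for illustration.

```python
import itertools

def gen():
    # Same generator as in the examples above.
    for l in [['a', 'b'], ['c'], ['d', 'e', 'f']]:
        yield l

# chain(*gen()) must exhaust gen() up front to build the argument list.
# chain.from_iterable(gen()) pulls one inner list at a time instead.
lazy = itertools.chain.from_iterable(gen())
first = next(lazy)   # consumes only the first element
rest = list(lazy)    # consumes whatever is left
```

This matters mostly when the outer generator is large or expensive to produce, since elements are yielded as they are needed.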

Edit: I forgot about nested comprehensions:

>>> [inner
...     for outer in gen()
...         for inner in outer]
['a', 'b', 'c', 'd', 'e', 'f']

It's somewhat cryptic on one line, however:

>>> [j for i in gen() for j in i]
['a', 'b', 'c', 'd', 'e', 'f']
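
One way to read the nested comprehension is as a pair of ordinary nested loops: the for clauses appear in the same order, left to right, as the loops appear top to bottom. A small sketch, with variable names chosen just for this example:

```python
def gen():
    # Same generator as in the examples above.
    for l in [['a', 'b'], ['c'], ['d', 'e', 'f']]:
        yield l

# The loop form: outer loop first, inner loop second.
flat_loop = []
for outer in gen():
    for inner in outer:
        flat_loop.append(inner)

# The comprehension form: same clauses, same order.
flat_comp = [inner for outer in gen() for inner in outer]
```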

Comments

  • This is great info on all of the possible flattening solutions. The 'sum' version is certainly less verbose than the 'chain' approach. However, it always throws me off that sum can be used to concatenate (nested) lists not just numbers. After all, the word "sum" commonly denotes the use of numbers.

    I would not mind 'sum' being used for lists if it could also be used for strings. For example, while the following is permissible:

    >>> sum([['b'], ['c'], ['d']], ['a'])
    ['a', 'b', 'c', 'd']

    this is not:

    >>> sum(['b', 'c', 'd'], 'a')

    Traceback (most recent call last):
    File "", line 1, in
    sum(['b', 'c', 'd'], 'a')
    TypeError: sum() can't sum strings [use ''.join(seq) instead]

    Isn't a string a special type of list, anyway? For example, list comprehensions certainly support manipulating strings. Also, the '+' operator can be used on strings, so why not sum?

    Python is built on the premise of clarity and terseness, but the recommended syntax of ''.join(seq) is somewhat less intuitive than sum would be. For the same reason that flattening the nested lists using sum is more readable to you, sum with strings could be a more readable alternative to join.

  • I'm used to join -- similar idioms are in C# and JavaScript -- so it hadn't really struck me as a wart.

    Here's a sum function that sort of does what you want:

    >>> def sum_(*l):
    ...     return ''.join(l)
    ...
    >>> sum_('b', 'c', 'd', 'a')
    'bcda'

    Though it won't work with the example you gave, sum_(['b', 'c', 'd'], 'a'), as that's concatenating a list of strings with a string. That in turn could be fixed by flattening any arguments that are lists, then joining the result.

  • >> I'm used to join -- similar idioms are in C# and JavaScript -- so it hadn't really struck me as a wart.

    Yes, that is true. But if I were more comfortable with the functional idioms found in Python (map, reduce, filter), I might use those over list comprehensions, although LCs can be viewed as far more elegant.

    For some reason, ''.join just feels more awkward to me for simple concatenation than sum would. Maybe it's just me. It's just nicer to have alternatives built into the language than to have to roll my own.

    It is all subjective I guess. :-)

  • Sorry for replying to such an old post, but I just stumbled across it while googling for something else. Anyway, I must be really dense, but I don't see the point of the generator.

    In Haskell, given that my original list is xss, I would flatten it with:

    [x | xs <- xss, x <- xs]

    In Python, given:

    >>> ll = [['a', 'b'], ['c'], ['d', 'e', 'f']]

    You can simply flatten it with:

    >>> [x for l in ll for x in l]
    ['a', 'b', 'c', 'd', 'e', 'f']

Comments have been disabled for this content.
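
A follow-up on the sum_ idea from the comments: one of the replies suggests flattening any list arguments and then joining the result. A minimal sketch of that suggestion might look like the following. The decision to treat tuples the same as lists is an assumption of this sketch, not something stated in the thread.

```python
def sum_(*args):
    # Hypothetical helper: accepts strings and lists (or tuples) of strings,
    # flattens one level, then joins everything in argument order.
    parts = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            parts.extend(arg)    # flatten a list argument
        else:
            parts.append(arg)    # keep a plain string as a single piece
    return ''.join(parts)

joined = sum_(['b', 'c', 'd'], 'a')
```

Note that this joins in argument order, giving 'bcda'. The built-in sum() puts its start value first, so a string-friendly sum(['b', 'c', 'd'], 'a') would presumably have produced 'abcd' instead.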