Generators, Comprehensions and Iterable data structures

Kalil de Lima - Python Meetup

https://github.com/kaozdl

kalil.de.lima@fing.edu.uy

kalil@rootstrap.com

What's an iterable?

An iterable is an object that represents a collection of data and can provide its elements one at a time.

Properties

  • Implements the __iter__() method, which returns an iterator
  • The iterator implements the __next__() method
  • Can be looped over with for ... in ... :
  • A list can be built from such an object
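
To make these properties concrete, here is a minimal sketch that drives the protocol by hand on a plain list. The names numbers and iterator are only illustrative.

```python
# A list is an iterable: iter() asks it for a fresh iterator.
numbers = [10, 20, 30]
iterator = iter(numbers)

# next() pulls one element at a time from the iterator.
first = next(iterator)
second = next(iterator)
third = next(iterator)

# Once the elements run out, next() raises StopIteration.
try:
    next(iterator)
    exhausted = False
except StopIteration:
    exhausted = True

# A for loop performs these same steps behind the scenes.
print(first, second, third, exhausted)
```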

A quick example

class MyNumbers:
  def __iter__(self):
    self.a = 1
    return self

  def __next__(self):
    if self.a <= 20:
      x = self.a
      self.a += 1
      return x
    else:
      raise StopIteration

myclass = MyNumbers()
myiter = iter(myclass)

#Now we can iterate with this
for x in myiter:
  print(x)

#Or with
for x in MyNumbers():
  print(x)

Generators

A generator is a function that contains a yield statement. Calling it returns a generator object instead of running the body right away.

Unlike normal functions, the values of the local variables are not destroyed after the function yields, preserving the state of execution.
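
A small sketch of that preserved state, using a hypothetical countdown generator (not part of the talk's repository). Each call to next() resumes the body exactly where the previous yield left it.

```python
def countdown(start):
    # `remaining` is a local variable; it survives between yields.
    remaining = start
    while remaining > 0:
        yield remaining
        remaining -= 1


counter = countdown(3)  # nothing runs yet; we only get a generator object
a = next(counter)       # runs the body up to the first yield
b = next(counter)       # resumes right after the yield, `remaining` intact
c = next(counter)
print(a, b, c)
```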

This defines an iterator with the following advantages:

  • Implements __iter__ and __next__ automatically
  • Stops the iteration automatically when the function returns
  • Raises StopIteration for you, with no explicit raise needed
  • Nicer syntax in most common cases
  • Does not require defining a class
Some examples

#All the even natural numbers
def all_even():
    n = 0
    while True:
        yield n
        n += 2

#Circular lists
def circular(start, end):
    current = start
    while True:
        yield current
        if current == end:
            current = start
        else:
            current += 1
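
Both generators above never finish, so calling list() on them would not return. One way to take a finite slice is itertools.islice from the standard library; the sketch below repeats the two definitions so it runs on its own.

```python
from itertools import islice


def all_even():
    n = 0
    while True:
        yield n
        n += 2


def circular(start, end):
    current = start
    while True:
        yield current
        if current == end:
            current = start
        else:
            current += 1


# islice stops after the requested number of elements,
# so the infinite loop inside each generator is never a problem.
first_evens = list(islice(all_even(), 5))
cycle_sample = list(islice(circular(1, 3), 7))
print(first_evens)
print(cycle_sample)
```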

Generator Expressions

  • Quick way of defining simple generators
  • No need to declare a function
  • Return an iterator, so they can be looped over immediately

Some Examples

#A nice alternative to map
squares = (x*x for x in values)

#Useful for filtering
even_squares = (x*x for x in values if x % 2 == 0)

#Can be iterated right away
for address in (a for a in addresses if 'wes' in a):
    print(address)

#Can be fed to builtins right away
squares_by_value = dict((x, x*x) for x in values)
#Same result as the dict comprehension { x: x*x for x in values }
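
The snippets above assume that values and addresses already exist. Here is a self-contained sketch with made-up sample data showing that a generator expression computes nothing until something consumes it.

```python
# Sample data, invented for this sketch.
values = [1, 2, 3, 4, 5, 6]
addresses = ["west street 12", "north avenue 3", "wesley road 7"]

# Creating the generator expression computes nothing yet.
squares = (x * x for x in values)
first_square = next(squares)  # only now is the first square computed

# Passed as the only argument, the extra parentheses can be dropped.
total = sum(x * x for x in values)
even_squares = list(x * x for x in values if x % 2 == 0)
matching = [a for a in addresses if "wes" in a]

print(first_square, total, even_squares, matching)
```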

Why should I use all this?

  • Reduces memory footprint
  • Improves readability for simple situations
  • Allows quick filtering thanks to the if clause
  • Similar to map for quick transformations
  • Easy to read notation
  • Accepted as arguments by most builtins that take an iterable, such as dict, list and str.join
  • Looks Cool
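
A rough way to see the memory point is to compare sys.getsizeof for a list and for the equivalent generator expression. Exact numbers depend on the interpreter, and getsizeof on a list only counts the list object itself, not the integers it holds, so treat this as an indication only.

```python
import sys

n = 100_000

# The list stores a reference to every element up front.
squares_list = [x * x for x in range(n)]

# The generator only stores its own execution state.
squares_gen = (x * x for x in range(n))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")
```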

When should I use this?

  • Whenever you need to process elements one by one
  • When you can't load a whole list of elements in memory for processing
  • When you need lazy loading
  • When each element is computed from the previous one
  • When you only want to work with a subset of an iterable that you already have
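
As an illustration of two of these cases together (each element derived from the previous ones, and working with only a subset), here is a sketch using a hypothetical fibonacci generator and itertools.takewhile.

```python
from itertools import takewhile


def fibonacci():
    # Each new element is built from the two before it.
    previous, current = 0, 1
    while True:
        yield previous
        previous, current = current, previous + current


# Lazily take elements while they stay under 50, then stop.
small = list(takewhile(lambda n: n < 50, fibonacci()))
print(small)
```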

When shouldn't I?

  • When you need to iterate back and forth
  • When you need to go through the same iterable several times
  • When you can solve it with a list
  • When you can't solve it with one line
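
The first two points come from the fact that a generator can only be consumed once and cannot be indexed. A minimal sketch with sample data:

```python
values = [1, 2, 3]
squares = (x * x for x in values)

first_pass = list(squares)   # consumes every element
second_pass = list(squares)  # the generator is already exhausted

# Generators support neither indexing nor going backwards.
try:
    squares[0]
    indexable = True
except TypeError:
    indexable = False

print(first_pass, second_pass, indexable)
```

If several passes or random access are needed, building a list once is usually the simpler choice.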

A quick Bonus

Even though it is not released yet, Python 3.8 introduces assignment expressions, which can be used inside comprehensions and generator expressions to improve readability and avoid repeating expensive function calls.

#Now: generate_expensive_element runs twice for every element that is kept
some_list = [
    generate_expensive_element(value)
    for value in some_iterable
        if meets_condition(generate_expensive_element(value))
    ]


#In Python 3.8
some_list = [
    elem
    for value in some_iterable
        if meets_condition(elem := generate_expensive_element(value)) #Computed once, reused as the element
    ]
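
A runnable sketch of the pattern, assuming Python 3.8 or later. The two helper functions and the sample data are stand-ins invented for this example. Since the if clause is evaluated before the element expression, the := binding has to live in the condition.

```python
calls = []


def generate_expensive_element(value):
    # Stand-in for a costly computation; records every call.
    calls.append(value)
    return value * 10


def meets_condition(element):
    # Stand-in filter.
    return element > 20


some_iterable = [1, 2, 3, 4]

some_list = [
    elem
    for value in some_iterable
    # The condition runs first, so the name is bound here and reused above.
    if meets_condition(elem := generate_expensive_element(value))
]
print(some_list, len(calls))
```

Each value triggers exactly one call to the expensive function, whether or not it passes the filter.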

Thanks for listening!

All code examples are available at:

https://github.com/kaozdl/python_meetup
