Learning Parseltongue


Wizardry in python!




Gaurav Dadhania
@GVRV
 

well... not really :(


magic methods



metaprogramming

Magic Methods?!


Surrounded by double underscores

Add "magic" to your classes

Not really magical... but still quite cool!

aka special methods -- you, THE PROGRAMMER, get to define the logic!


construction and initialization

  • __new__(cls, ...)

    Class as the first argument along with other arguments. Should return an instance. Useful for subclassing immutable types like strings and numbers.

  • __init__(self, ...)

    Instance as the first argument, should initialize the instance. Doesn't need to return anything.

  • __del__(self)

    The finalizer, often called the destructor. It is not called by del x itself, which only removes one reference, but when x is garbage collected. Use with caution; it is no substitute for good coding practices!

CONSTRUCTION AND INITIALIZATION


from os.path import expanduser, join

class FileObject:
    '''Wrapper for file objects to make sure the file gets closed on deletion.'''

    def __init__(self, filepath='~', filename='sample.txt'):
        # open a file filename in filepath in read and write mode;
        # expanduser() is needed because open() does not expand '~' itself
        self.file = open(expanduser(join(filepath, filename)), 'r+')

    def __del__(self):
        self.file.close()
        del self.file

Operator overloading on custom classes


Make custom objects behave like built-in types


instance1 == instance2

instead of 

instance1.equals(instance2)

Comparison magic methods

__cmp__(self, other)

The most basic of the comparison magic methods. Defines the logic for all comparisons at once. Should return a positive integer if self is greater than other, zero if they are equal, and a negative integer if self is less than other. Saves repetition, but it is usually best to define the individual comparison magic methods instead. Removed in Python 3.


__eq__(self, other) # == operator

__ne__(self, other) # != operator

__lt__(self, other) # <  operator

__gt__(self, other) # >  operator

__le__(self, other) # <= operator 

__ge__(self, other) # >= operator 

comparison magic methods

class Word(str):
    '''Class for words, defining comparison based on word length.'''

    def __new__(cls, word):
        # Note that we have to use __new__. This is because str is an immutable
        # type, so we have to initialize it early (at creation)
        if ' ' in word:
            print("Value contains spaces. Truncating to first space.")
            word = word[:word.index(' ')] # Word is now all chars before first space
        return str.__new__(cls, word)

    def __gt__(self, other):
        return len(self) > len(other)
    def __lt__(self, other):
        return len(self) < len(other)
    def __ge__(self, other):
        return len(self) >= len(other)
    def __le__(self, other):
        return len(self) <= len(other)

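__eq__ and __ne__ are deliberately not defined, to avoid weird results: comparing by length would make Word('foo') == Word('bar') true, so equality is inherited from str.


Use the @total_ordering decorator from the 'functools' module to get a complete ordering by defining only __eq__ and one of __lt__, __le__, __gt__ or __ge__.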
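A minimal sketch of the @total_ordering approach, written for Python 3. The Version class and its _key helper are hypothetical names made up for this illustration; only functools.total_ordering comes from the standard library.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical example class ordered by a (major, minor) tuple."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


# __le__, __gt__ and __ge__ are filled in by the decorator.
older, newer = Version(1, 2), Version(1, 10)
print(older <= newer)
print(newer > older)
print(older >= newer)
```

The derived methods are slower than hand-written ones, so this trades a little speed for less repetition.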

UNARY OPERATORS AND FUNCTIONS

__pos__(self)         # +some_object 

__neg__(self)         # -some_object

__abs__(self)         # abs() function

__invert__(self)      # ~some_object 

__round__(self, n)    # round() function, n decimal places 

__floor__(self)       # math.floor() function 

__ceil__(self)        # math.ceil() function 

__trunc__(self)       # math.trunc() function 

NORMAL ARITHMETIC OPERATORS

__add__(self, other)          # + operator 

__sub__(self, other)          # - operator 

__mul__(self, other)          # * operator

__floordiv__(self, other)     # // operator eg: 1.0 // 2.0 = 0.0 

__div__(self, other)          # / operator in Python 2 eg: 1 / 2 = 0

__truediv__(self, other)      # true division. In Python 2, needs 'from
                              # __future__ import division' to work
                              # eg: 1/2 = 0.5. Always used for / in Python 3

__mod__(self, other)          # % operator 

__divmod__(self, other)       # long division using divmod() function

__pow__(self, other[, modulo])          # ** operator 

__lshift__(self, other)       # << operator 

__rshift__(self, other)       # >> operator 

__and__(self, other)          # & operator 

__or__(self, other)           # | operator 

__xor__(self, other)          # ^ operator

reflected arithmetic operators

my_object + some_other_object

If my_object.__add__(some_other_object) is not defined or returns NotImplemented (a special value, not an exception), some_other_object.__radd__(my_object) is called. 

We call them reflected arithmetic magic methods.

__radd__(self, other)
__rsub__(self, other)
__rmul__(self, other)
__rfloordiv__(self, other)
__rdiv__(self, other)
__rtruediv__(self, other)
__rmod__(self, other)
__rdivmod__(self, other)
__rpow__(self, other[, modulo])
__rlshift__(self, other)
__rrshift__(self, other)
__rand__(self, other)
__ror__(self, other)
__rxor__(self, other)
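A small sketch of the fallback from __add__ to __radd__, written for Python 3. The Metres class is a hypothetical type invented for this example.

```python
class Metres:
    """Hypothetical length type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        # NotImplemented is returned, not raised; it tells Python to try
        # the other operand's reflected method.
        return NotImplemented

    def __radd__(self, other):
        # Reached for `5 + Metres(2)`: int.__add__ returns NotImplemented,
        # so Python falls back to the right operand.
        return self.__add__(other)


left = Metres(2) + 5   # handled by Metres.__add__
right = 5 + Metres(2)  # handled by Metres.__radd__
print(left.value, right.value)
```

If both __add__ and __radd__ return NotImplemented, Python raises TypeError on your behalf.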

augmented assignment

__iadd__(self, other)              # += operator

__isub__(self, other)              # -= operator

__imul__(self, other)              # *= operator

__ifloordiv__(self, other)         # //= operator

__idiv__(self, other)              # /= operator (Python 2)

__itruediv__(self, other)          # /= when true division is in effect

__imod__(self, other)              # %= operator

__ipow__(self, other[, modulo])    # **= operator

__ilshift__(self, other)           # <<= operator

__irshift__(self, other)           # >>= operator

__iand__(self, other)              # &= operator

__ior__(self, other)               # |= operator

__ixor__(self, other)              # ^= operator
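A sketch of how an in-place method differs from its normal counterpart, written for Python 3. Basket is a hypothetical class; the point to notice is that __iadd__ must return the object that the name should be rebound to, usually self.

```python
class Basket:
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __add__(self, other):
        # `a + b` builds a new object and leaves `a` untouched.
        return Basket(self.items + list(other))

    def __iadd__(self, other):
        # `a += b` mutates in place. The return value is rebound to `a`,
        # so forgetting `return self` would leave `a` set to None.
        self.items.extend(other)
        return self


basket = Basket(["apple"])
alias = basket
basket += ["pear"]
print(alias is basket)
print(alias.items)

combined = basket + ["plum"]
print(combined is basket)
```

If __iadd__ is not defined, Python falls back to __add__ and rebinds the name to the new object.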

type conversion magic methods

__int__(self)       # conversion to int

__long__(self)      # conversion to long

__float__(self)     # conversion to float

__complex__(self)   # conversion to complex

__oct__(self)       # conversion to octal

__hex__(self)       # conversion to hexadecimal 

__index__(self)     # if custom numeric type used in slicing

__trunc__(self)     # truncate decimal places

__coerce__(self, other) # method for mixed-mode arithmetic. Returns
                        # None if type conversion is impossible. Else
                        # returns a 2-tuple of (self, other) such that 
                        # both objects are of the same type. Removed in
                        # Python 3 for confusing behaviour. 

REPRESENTING CUSTOM CLASSES

__str__(self)      # When str() is called. Intended audience: humans
                   # Does not fall back to __unicode__(). 

__repr__(self)     # When repr() is called. Intended audience: machines

__unicode__(self)  # When unicode() is called. 

__format__(self, formatstr) # "{0:bc}".format(a) -> a.__format__("bc")

__hash__(self)     # Returns an integer for quick key comparison in 
                   # dicts. Rule: if a == b, then hash(a) == hash(b).

__nonzero__(self)  # When bool is called. Should return True/False.
                   # Renamed to __bool__ in Python 3

__dir__(self)      # When dir() is called. Should return list of
                   # attributes for the object. Only necessary when 
                   # dynamically adding/removing attributes.

__sizeof__(self)   # When sys.getsizeof() called. Should return answer
                   # in bytes. Mostly necessary for classes implemented
                   # in C extensions
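A sketch that puts several of these representation methods on one class, written for Python 3 (so __bool__ rather than __nonzero__). Point is a hypothetical class used only for illustration.

```python
class Point:
    """Hypothetical 2D point used only for illustration."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Aimed at developers: unambiguous, ideally valid Python.
        return "Point({!r}, {!r})".format(self.x, self.y)

    def __str__(self):
        # Aimed at humans: short and readable.
        return "({}, {})".format(self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same tuple that __eq__ compares, so equal points
        # always hash equally.
        return hash((self.x, self.y))

    def __bool__(self):  # spelled __nonzero__ in Python 2
        return (self.x, self.y) != (0, 0)


p = Point(1, 2)
print(repr(p))
print(str(p))
print(len({Point(1, 2), Point(1, 2)}))
print(bool(Point(0, 0)))
```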

attribute access

__getattr__(self, name)

Called when an attribute that doesn't exist is requested. 


__setattr__(self, name, value)

Called for setting an attribute, whether or not it exists. 


__delattr__(self, name)

Called for deleting an attribute.


__getattribute__(self, name)

Available in new-style classes only. Invoked for every attribute access, whether or not the attribute exists. Hard to implement bug-free. If it is implemented, __getattr__ is only called when __getattribute__ raises AttributeError or calls it explicitly.

Attribute Access

Infinite recursion caused by improper usage of __setattr__:

def __setattr__(self, name, value):
    self.name = value
    # Every attribute assignment calls __setattr__(), so the line above
    # really means self.__setattr__('name', value). The method keeps
    # calling itself until Python raises a recursion error.

We fix it by storing the attribute directly in the instance's attribute dictionary. Alternatively, call __setattr__ on the superclass, since not all objects are guaranteed to have a __dict__ attribute.

def __setattr__(self, name, value):
    self.__dict__[name] = value # assigning to the instance's attribute dict
    # define custom behavior here
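A sketch of the superclass variant together with __getattr__ as a fallback, written for Python 3. The Settings class, its _defaults table and the validation rule are all hypothetical and exist only to give the hooks something to do.

```python
class Settings:
    """Hypothetical class: rejects negative numbers, supplies defaults."""

    _defaults = {"colour": "black"}

    def __setattr__(self, name, value):
        if isinstance(value, (int, float)) and value < 0:
            raise ValueError("%s must not be negative" % name)
        # Delegate storage to the parent class instead of writing
        # `self.name = value`, which would call this method again.
        super().__setattr__(name, value)

    def __getattr__(self, name):
        # Only reached when normal lookup has already failed.
        try:
            return self._defaults[name]
        except KeyError:
            raise AttributeError(name)


settings = Settings()
settings.width = 3
print(settings.width)   # found normally, __getattr__ is not involved
print(settings.colour)  # missing on the instance, so __getattr__ answers
```

Raising AttributeError for unknown names keeps hasattr() and getattr() with a default working as expected.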


custom sequences - protocols

To make custom containers that behave like the built-in ones (dict, tuple, list, str, etc.) in Python, you need to follow certain protocols. 

  • Immutable container: Need to define __len__ and __getitem__
  • Mutable container: Need to define __setitem__ and __delitem__ as well
  • Iterable container: Need to define __iter__ which returns an iterator
  • Iterator: Need to define __iter__ which returns itself, and next (renamed __next__ in Python 3). 

Python protocols are informal and require no explicit declarations to implement.
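The protocols above can be sketched as follows, in Python 3. Countdown and CountdownIterator are hypothetical classes; neither inherits from anything special, which is what "informal" means here.

```python
class Countdown:
    """Hypothetical immutable sequence: start, start-1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("index must be an integer")
        if index < 0:
            index += self.start
        if not 0 <= index < self.start:
            raise IndexError("Countdown index out of range")
        return self.start - index


class CountdownIterator:
    """Hypothetical iterator over the same values."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):  # named `next` in Python 2
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# __len__ and __getitem__ alone are enough for iteration, `in`
# and reversed() to work.
print(list(Countdown(3)))
print(2 in Countdown(3))
print(list(reversed(Countdown(3))))
print(list(CountdownIterator(3)))
```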

container magic methods

__len__(self) # Returns length of the container

__getitem__(self, key) # When self[key] is used

__setitem__(self, key, value) # When self[key] = value is used 

__delitem__(self, key) # When del self[key] is used

__iter__(self) # When iter() is called or in 'for x in' loops 

__reversed__(self) # When reversed() is called. For ordered containers.

__contains__(self, item) # When 'in' and 'not in' are used

__missing__(self, key) # For dict subclasses, when missing keys are 
                       # accessed


Care should be taken to raise appropriate exceptions as well. KeyError is raised when a key is missing, IndexError when a sequence index is out of range, and TypeError when the type of the key is not compatible. 
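A short sketch of __missing__ on a dict subclass, in Python 3. Tally is a hypothetical class; note that __missing__ is consulted only by self[key], not by get() or the in operator.

```python
class Tally(dict):
    """Hypothetical counting dict: unknown keys read as zero."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        # Returning a value here does not store it.
        return 0


tally = Tally()
for word in ["spam", "eggs", "spam"]:
    tally[word] += 1

print(tally["spam"], tally["eggs"], tally["ham"])
print("ham" in tally)
```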

reflection

__instancecheck__(cls, instance)

Called when isinstance(instance, class) is used. 


__subclasscheck__(cls, subclass)

Called when issubclass(subclass, class) is used.
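A sketch of both hooks, in Python 3. They have to live on the metaclass of the class being checked against, not on the class itself. QuacksMeta, Quacker and Duck are hypothetical names, and the "has a callable quack attribute" rule is only an example policy.

```python
class QuacksMeta(type):
    """Hypothetical metaclass: anything with a callable `quack` qualifies."""

    def __instancecheck__(cls, instance):
        return callable(getattr(instance, "quack", None))

    def __subclasscheck__(cls, subclass):
        return callable(getattr(subclass, "quack", None))


class Quacker(metaclass=QuacksMeta):
    pass


class Duck:  # note: does not inherit from Quacker
    def quack(self):
        return "Quack!"


print(isinstance(Duck(), Quacker))
print(issubclass(Duck, Quacker))
print(isinstance(42, Quacker))
```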



callable objects

Functions are first-class objects in Python. Your classes can be called like functions as well. 


__call__(self[, args])

Allows an instance of a class to be called as a function. Useful for objects that change "state" very often. 
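A minimal sketch of a callable object that carries state between calls, in Python 3. CallCounter is a hypothetical class.

```python
class CallCounter:
    """Hypothetical callable that remembers a running total."""

    def __init__(self):
        self.total = 0

    def __call__(self, step=1):
        # counter(...) is shorthand for counter.__call__(...)
        self.total += step
        return self.total


counter = CallCounter()
print(counter())
print(counter(2))
print(callable(counter))
```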

context managers

Context managers allow setup and cleanup actions to be taken for objects when their use is wrapped in a 'with' statement.

__enter__(self)

Determines what to do at the beginning of 'with' block


__exit__(self, exception_type, exception_value, traceback)

What to do after the 'with' block finishes: cleanup, handling exceptions, etc. The three arguments will be None if no exception was raised. Return True to suppress an exception that you have handled; otherwise it propagates. 

context managers

Useful for classes with well-defined and common behaviour for setup and cleanup. 


class Closer:
    '''A context manager to automatically close an object with a close method
    in a with statement.'''

    def __init__(self, obj):
        self.obj = obj

    def __enter__(self):
        return self.obj # bound to target

    def __exit__(self, exception_type, exception_val, trace):
        try:
            self.obj.close()
        except AttributeError: # obj isn't closable
            print('Not closable.')
            return True # exception handled successfully

DESCRIPTOR OBJECTS

__get__(self, instance, owner) # When descriptor's value is retrieved

__set__(self, instance, value) # When descriptor's value is changed 

__delete__(self, instance) # When descriptor's value is deleted

class Meter(object):
    '''Descriptor for a meter.'''

    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)

class Foot(object):
    '''Descriptor for a foot.'''

    def __get__(self, instance, owner):
        return instance.meter * 3.2808
    def __set__(self, instance, value):
        instance.meter = float(value) / 3.2808

class Distance(object):
    '''Class to represent distance holding two descriptors for feet and
    meters.'''
    meter = Meter()
    foot = Foot()
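One thing worth noticing in the example above: Meter keeps its value on the descriptor object, and there is only one Meter per class, so every Distance instance would share the same number. The sketch below is a variant, in Python 3, that stores the value on each instance instead. The names PerInstanceMeter, FootView and Length are hypothetical and not part of the original slides.

```python
class PerInstanceMeter:
    """Hypothetical descriptor that stores its value on the instance."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:  # accessed on the class itself
            return self
        return instance.__dict__.get(self.name, 0.0)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = float(value)


class FootView:
    """Hypothetical descriptor computing feet from the stored meters."""

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.meter * 3.2808

    def __set__(self, instance, value):
        instance.meter = float(value) / 3.2808


class Length:
    meter = PerInstanceMeter("meter")
    foot = FootView()


a, b = Length(), Length()
a.meter = 1
print(a.meter, b.meter)  # b keeps its own default
print(a.foot)
```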

copying objects

__copy__(self)

Defines behaviour for copy.copy() - only returns a shallow copy i.e. object is copied, but data is only referenced. 


__deepcopy__(self, memodict={})

Defines behaviour for copy.deepcopy() - completely copies data as well. Need to be careful with recursive data structures. memodict is cache of previously copied objects - when you want to deep copy an attribute, call deepcopy() on it and pass it memodict. 


These methods give you fine grained control over how you want to copy your objects for better performance. 
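A sketch of both hooks, in Python 3. Playlist is a hypothetical class; the standard copy module is the only real API used.

```python
import copy


class Playlist:
    """Hypothetical class holding a name and a list of tracks."""

    def __init__(self, name, tracks):
        self.name = name
        self.tracks = tracks

    def __copy__(self):
        # Shallow: a new Playlist that shares the same track list.
        return Playlist(self.name, self.tracks)

    def __deepcopy__(self, memodict):
        clone = Playlist(self.name, [])
        # Register the clone before recursing so that cyclic
        # references back to this object reuse it.
        memodict[id(self)] = clone
        clone.tracks = copy.deepcopy(self.tracks, memodict)
        return clone


original = Playlist("mix", [["intro"], ["outro"]])
shallow = copy.copy(original)
deep = copy.deepcopy(original)

print(shallow.tracks is original.tracks)
print(deep.tracks is original.tracks)
print(deep.tracks == original.tracks)
```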

pickling

Pickling is the serialization process for Python data structures - so they can be stored and retrieved later. Custom objects can be pickled by following protocol. 

__getinitargs__(self)  # Returns a tuple of args sent to __init__ 
                       # for old-style classes when unpickling

__getnewargs__(self)   # Returns a tuple of args sent to __new__ for 
                       # new style classes when unpickling

__getstate__(self)     # Instead of object's __dict__ attribute being
                       # stored, you can provide a custom state

__setstate__(self, state) # If defined, when unpickled, object's state
                          # will be passed to it instead of setting
                          # object's __dict__ attribute to it

__reduce__(self)       # For extension types 

__reduce_ex__(self, protocol)  # Like __reduce__, but takes the protocol
                               # version. Exists for compatibility reasons

pickling

import time

class Slate:
    '''Class to store a string and a changelog, and forget its value when
    pickled.'''

    def __init__(self, value):
        self.value = value
        self.last_change = time.asctime()
        self.history = {}

    def change(self, new_value):
        # Change the value. Commit last value to history
        self.history[self.last_change] = self.value
        self.value = new_value
        self.last_change = time.asctime()

    def print_changes(self):
        print('Changelog for Slate object:')
        for k, v in self.history.items():
            print('%s\t %s' % (k, v))

    def __getstate__(self):
        # Deliberately do not return self.value or self.last_change.
        # We want to have a "blank slate" when we unpickle.
        return self.history

    def __setstate__(self, state):
        # Make self.history = state and last_change and value undefined
        self.history = state
        self.value, self.last_change = None, None

abstract base classes

class Foo(object):
    def __getitem__(self, index):
        ...
    def __len__(self):
        ...
    def get_iterator(self):
        return iter(self)

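from abc import ABCMeta, abstractmethod

class MyIterable:
    __metaclass__ = ABCMeta  # Python 2; in Python 3: class MyIterable(metaclass=ABCMeta)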

    @abstractmethod
    def __iter__(self):
        while False:
            yield None

    def get_iterator(self):
        return self.__iter__()

    @classmethod
    def __subclasshook__(cls, C):
        if cls is MyIterable:
            if any("__iter__" in B.__dict__ for B in C.__mro__):
                return True
        return NotImplemented

MyIterable.register(Foo)


metaprogramming


Meta-pro-gramm-ing


Factories that build factories that build cars

Programs that manipulate other programs

classes are just objects

>>> class Foo: pass
...
>>> Foo.field = 42
>>> x = Foo()
>>> x.field
42
>>> Foo.field2 = 99
>>> x.field2
99
>>> Foo.method = lambda self: "Hi!"
>>> x.method()
'Hi!'

You can modify them the same way you modify any other object

Classes are just objects

You can add and remove fields and methods, for example. The difference is that any change you make to a class affects all the objects of that class, even the ones that have already been instantiated.


What creates these special “class” objects? Other special objects, called metaclasses.


The default metaclass is called type

metaclasses create classes

Classes create instances

class C: pass

is

C = type('C', (), {})

creating classes

def howdy(self, you):
    print("Howdy, " + you)

MyList = type('MyList', (list,), dict(x=42, howdy=howdy))

ml = MyList()
ml.append("Camembert")
print(ml) # ["Camembert"]
print(ml.x) # 42
ml.howdy("John") # Howdy, John

print(ml.__class__.__class__) # Prints the metaclass, "type"

Using this, you can dynamically generate classes.

custom metaclasses

class SimpleMeta(type):
    def __init__(cls, name, bases, nmspc):
        super(SimpleMeta, cls).__init__(name, bases, nmspc)
        cls.uses_metaclass = lambda self : "Yes!"

class Simple(object):
    __metaclass__ = SimpleMeta
    def foo(self): pass
    @staticmethod
    def bar(): pass

simple = Simple()

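In Python 2, specify the __metaclass__ callable in the class body. It should accept the same arguments as type. In Python 3, pass it as a keyword argument in the class statement instead: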

class Simple(object, metaclass=SimpleMeta):

By convention, when defining metaclasses cls is used rather than self as the first argument to all methods except __new__() (which uses mcl). cls is the class object that is being modified.

metaclass is just a callable

class Simple2(object):
    class __metaclass__(type):
        def __init__(cls, name, bases, nmspc):
            # This won't work:
            # super(__metaclass__, cls).__init__(name, bases, nmspc)
            # Less-flexible specific call:
            type.__init__(cls, name, bases, nmspc)
            cls.uses_metaclass = lambda self : "Yes!"

class Simple4(object):
    def __metaclass__(name, bases, nmspc):
        cls = type(name, bases, nmspc)
        cls.uses_metaclass = lambda self : "Yes!"
        return cls

registering subclasses

class RegisterLeafClasses(type):
    def __init__(cls, name, bases, nmspc):
        super(RegisterLeafClasses, cls).__init__(name, bases, nmspc)
        if not hasattr(cls, 'registry'):
            cls.registry = set()
        cls.registry.add(cls)
        cls.registry -= set(bases) # Remove base classes

    # Metamethods, called on class objects:
    def __iter__(cls):
        return iter(cls.registry)

    def __str__(cls):
        if cls in cls.registry:
            return cls.__name__
        return cls.__name__ + ": " + ", ".join([sc.__name__ for sc in cls])

final classes

class final(type):
    def __init__(cls, name, bases, namespace):
        super(final, cls).__init__(name, bases, namespace)
        for klass in bases:
            if isinstance(klass, final):
                raise TypeError(str(klass.__name__) + " is final")

init vs. new in metaclasses

__new__ is called to create the new class, while __init__ is called after the class has been created, to perform additional initialization before the class is handed to the caller.


When overriding __new__() you can change the ‘name’, ‘bases’ and ‘namespace’ arguments before calling the super constructor, and the change takes effect. Doing the same thing in __init__() has no effect, because the class has already been built by then.
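A sketch of that difference, in Python 3 syntax. RenameInNew and RenameInInit are hypothetical metaclasses that both try to prefix the class name; only the first succeeds.

```python
class RenameInNew(type):
    def __new__(mcl, name, bases, namespace):
        # The class does not exist yet, so the changed name is used.
        return super().__new__(mcl, "Renamed" + name, bases, namespace)


class RenameInInit(type):
    def __init__(cls, name, bases, namespace):
        # The class was already built by type.__new__; passing a
        # different name here changes nothing.
        super().__init__("Renamed" + name, bases, namespace)


class A(metaclass=RenameInNew):
    pass


class B(metaclass=RenameInInit):
    pass


print(A.__name__)  # RenamedA
print(B.__name__)  # B
```

Inside __init__ you can still modify the finished class through cls, for example by setting attributes on it.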

class methods and meta methods

A metamethod can be called from either the metaclass or from the class, but not from an instance. A classmethod can be called from either a class or its instances, but is not part of the metaclass.

class Singleton(type):
    instance = None
    def __call__(cls, *args, **kw):
        if not cls.instance:
             cls.instance = super(Singleton, cls).__call__(*args, **kw)
        return cls.instance

class ASingleton(object):
    __metaclass__ = Singleton

a = ASingleton()
b = ASingleton()
assert a is b
print(a.__class__.__name__, b.__class__.__name__)
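The metamethod versus classmethod distinction described above can be sketched like this, in Python 3 syntax. Meta, Widget, describe and create are hypothetical names.

```python
class Meta(type):
    def describe(cls):
        # A metamethod: an ordinary method of the metaclass,
        # so its "instance" is a class.
        return "metamethod on " + cls.__name__


class Widget(metaclass=Meta):
    @classmethod
    def create(cls):
        return cls()


print(Widget.describe())               # via the class
print(Meta.describe(Widget))           # via the metaclass
print(hasattr(Widget(), "describe"))   # not visible on instances

print(type(Widget.create()).__name__)    # classmethod via the class
print(type(Widget().create()).__name__)  # classmethod via an instance
print(hasattr(Meta, "create"))           # not part of the metaclass
```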

Prepare metamethod

In Python 3, you can use the __prepare__ metamethod to replace the default dict used as the class namespace with a custom mapping object, such as an ordered dictionary.

from collections import OrderedDict

@classmethod
def __prepare__(mcl, name, bases, **kwargs):
    return OrderedDict()
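A sketch of __prepare__ in a complete metaclass, for Python 3. LoggingDict and RecordAssignments are hypothetical names; the idea is to record every assignment made in the class body, including rebindings that a plain dict would silently overwrite.

```python
class LoggingDict(dict):
    """Hypothetical namespace that logs every name assigned to it."""

    def __init__(self):
        super().__init__()
        self.log = []

    def __setitem__(self, key, value):
        self.log.append(key)
        super().__setitem__(key, value)


class RecordAssignments(type):
    @classmethod
    def __prepare__(mcl, name, bases, **kwargs):
        # The returned object is used as the namespace while the
        # class body executes.
        return LoggingDict()

    def __new__(mcl, name, bases, namespace, **kwargs):
        cls = super().__new__(mcl, name, bases, dict(namespace), **kwargs)
        cls.assignments = namespace.log
        return cls


class Example(metaclass=RecordAssignments):
    first = 1
    second = 2
    first = 3  # the rebinding is recorded too


# Python itself also assigns names such as __module__, so filter those out.
public = [n for n in Example.assignments if not n.startswith("__")]
print(public)
```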



use the magic


don't overuse the magic

questions?


Resources



Thank you!

Learning parseltongue

By gvrv


PyCon AU talk on Learning Parseltongue - exploring magic methods and metaprogramming patterns in Python
