Python's New-Style Inheritance Algotrithm

A slightly abbreviated version of this article also appeared on O'Reilly's programming blog in July 2013.
New: see also this related excerpt that summarizes inheritance and super() rules, posted July 2014.

Python's New-Style Inheritance Algorithm

This article takes a brief look at the inheritance search mechanism in the Python programming language. Like some other aspects of Python today, this mechanism varies per line: inheritance has grown much more convoluted in 3.X, though 2.X users still have a choice in the matter. To truly understand the current state of affairs, then, we need to begin our story in simpler times.

Classic Inheritance

Once upon a time (well, in 2.X's default and still widely-used classic classes), Python attribute inheritance—the lookup at the heart of object-oriented code—was fairly simple. It essentially boiled down to this:

Attribute name references search the instance, its class, and the class's superclasses depth-first and left-to-right, and use the first occurrence found along the way. Attribute assignments normally store in the target object itself.

And that's it. The reference search may be kicked off from either an instance or a class, and there are special cases for __getattr__ (run if the lookup failed to find a name) and __setattr__ (run for all attribute assignments), but the procedure is by and large straightforward.

New-Style Inheritance

In new-style classes—an option in 2.X and mandated in 3.X—inheritance is richer but substantially more complex, potentially requiring knowledge of advanced topics to accurately resolve an attribute name's meaning, including descriptors, metaclasses, and the linearized class-tree paths known as MROs. We won't delve into those prerequisite topics here, but the following is a cursory overview of the algorithm used, taken from the newly-released Learning Python, 5th Edition, where you'll find new and more complete coverage.

To look up an attribute name:

  1. From an instance I, search the instance, its class, and its superclasses, as follows:
    1. Search the __dict__ of all classes on the __mro__ found at Iís __class__
    2. If a data descriptor was found in step a, call its __get__ and exit
    3. Else, return a value in the __dict__ of the instance I
    4. Else, call a nondata descriptor or return a value found in step a

  2. From a class C, search the class, its superclasses, and its metaclasses tree, as follows:
    1. Search the __dict__ of all metaclasses on the __mro__ found at Cís __class__
    2. If a data descriptor was found in step a, call its __get__ and exit
    3. Else, call a descriptor or return a value in the __dict__ of a class on Cís own __mro__
    4. Else, call a nondata descriptor or return a value found in step a

  3. In both rule 1 and 2, built-in operations essentially use just step a sources for their implicit name look up (described further in the book)

Name sources in this procedure are attempted in order, either as numbered or per their left-to-right order in "or" conjunctions. On top of all this, method __getattr__ may be run if defined when an attribute is not found; method __getattribute__ may be run for every attribute fetch; and the implied object superclass provides some defaults at the top of every class and metaclass tree (that is, at the end of every MRO).

The assignment subset

A subset of the lookup procedure is also run for attribute assignments:

The __setattr__ method still catches all attribute assignments as before, though it becomes less useful for this method to use the instance __dict__ to assign names, as some new-style extensions such as slots, properties, and descriptors implement attributes at the class level—a sort of "virtual" instance data mechanism. In fact, some instances might not have a __dict__ at all when slots are used, an optimization with significant potential to break introspection-based tools.

Precedence and context

The net effect of these procedures is to impose arguably subtle and obscure precedence rules on the foundational operation of name resolution. For example, you can also think of explicitly-referenced name look-up search as being ordered this way, as implied by the algorithm above (the corresponding steps of which are given in italics here):

Some of this may be too subtle to fathom in full fidelity without a tour through Python's object.c and typeobject.c source files (which house the implementations of instances and classes, respectively). Case in point: Python runs just one or two tree searches per name lookup, despite the presence of four or five name sources.

And all this may happen in addition to selecting a context-specific starting point per the MRO, for code brave enough to use the new-style super built-in function—a tool with new coverage in the book, but omitted here for the sake of space (if not humanity). Technically, super result objects first attempt a custom scan of a context-specific tail portion of a class tree's MRO, looking for an attribute by descriptor or value, and run full inheritance on themselves as a last resort only if this custom scan fails—a special language case for a relatively rare use case.

Famous Last Words?

The new-style inheritance algorithm applies only to programs that use classes and OOP, of course, and grows more elaborate as your code or the libraries it uses adopt more advanced language tools. As usual in Python, some simpler scripts may still not need to care. But there's little denying that new-style name lookup is radically more complex.

To judge for yourself whether its added utility justifies its added complexity, see the aforementioned book for the full story on this topic. And while you ponder such things, don't forget what Python's own "import this" motto seems to counsel in this department:

    If the implementation is hard to explain, it's a bad idea.

[Python Logo] Home Books Programs Blog Python Author Training Email Search ©M.Lutz