Object Inheritance¶
Python, like many Object-Oriented programming languages, allows you to create new types based off of old ones. There are a number of reasons why you might want to do this. In this video lecture, I’ll cover how it works, why you might want to do it, and give you some warnings and cautions.
Why Inheritance¶
If you spend any time creating classes, you’ll quickly note some patterns that you want to make use of.
Many of your objects will have very similar, if not identical, methods.
Many of your objects will share the same attributes, with the exact same behavior for the attributes.
Sometimes you want to create a new class by just extending, that is, adding new things to an existing class, but you don’t want to touch the existing class.
Occasionally you want to modify a method or an attribute for the new class.
All of these things are handled by inheritance.
Terminology¶
The fundamental idea of inheritance is that you can create classes that inherit from or derive from another class. This means that unless overridden, the attributes and methods of that class will appear as-is in the derived class.
We call the class that is inherited the base, parent, or super class. I prefer “super” because that is the name of a special function in Python that allows you to access the class’s methods.
The derived classes are called derived, child, or sub classes.
Note that you can inherit a class that inherits from another class, and so this relationship can be many levels deep. That is, a single class can have a super class that itself has a super class which has a super class.
You can also have a subclass that has multiple superclasses. We’ll talk about Method Resolution Order (MRO) later, which will explain how Python figures out which class to look at next to determine where an attribute or method lives.
Ultimately, all classes should have the object
class as its final super
class. We don’t talk a lot about object
because there isn’t much to do
with it. It has some of the most important special methods defined (like
__getattribute__
) so we generally don’t want to mess with it.
Function¶
The core element of inheritance is the following:
Subclasses get all of the methods and attributes of the superclass.
Subclasses can specify entirely new attributes or methods. This does not affect the superclass.
Subclasses can override attributes or methods from the superclass. Even methods specified in the superclass will see these overriding attributes and methods.
Even if you override a method, you can still access that method through the
super()
function.
Alternatives¶
Inheritance might not be the right solution. There are a number of alternatives you may want to consider.
Rather than inheriting from a class, can you just have an instance of the class as an attribute? This is useful if you don’t really share many attributes with the inherited class.
Maybe a shared method or attribute is really a global function or variable.
Rather than relying on superclasses to derive information about taxonomy (how you organize your objects) consider using an attribute with the taxonomy. Typically, things will fall into multiple categories, and it’s not always clear which categories your users will be interested in. IE, a man is an animal, and accountants are a man, but some accountants are female which makes them a woman, so not all men are a man? This sort of nonsense can be resolved by remembering that the object is of species “Homo Sapien” and has the job of “Accountant” and has the sex “Female”, rather than relying on the class structure.
How to Create Derived Types¶
First, when you declare a new class, you can specify one or more types as parameters to the class name. This is the syntax:
class NAME(bases): SUITE
The bases
can be one or more types, separated by commas.
This is equivalent to creating a new type with the type
function:
NAME = type('NAME', bases, dict)
(The dict, as you recall, is the namespace that results from the SUITE being executed exactly once.)
The bases appear as the __bases__
attribute in the class, in the order
they were specified.
At the time of the class creation, Python figures out and assigns the Method Resolution Order (MRO). We’ll talk more about that later.
The way that I think of it is that just like an instance of a class falls back to the class in determining what an attribute or method is, a subclass falls back to its superclass if it doesn’t directly specify an attribute or method. Let’s walk through a simple example.
class A: a = "a"
class B(A): b = "b"
class C(B): c = "c"
c = C()
c.c # -> "c", from the C class.
c.b # -> "b", from the C's B class.
c.a # -> "a", from the C's B's A class.
c.d = "d"
c.d # -> "d", from the c instance.
And as you might expect, you can override attributes and methods just by specifying your own attribute at the appropriate class. Here’s an example of that.
class A: name = "A"
class B: name = "B"
class C: name = "C"
a = A(); a.name # -> "A" from A.name
b = B(); b.name # -> "B" from B.name
c = C(); c.name # -> "C" from C.name
del C.name
c.name # -> "B" from C's B.name
In my head, I have an image of a stack of attribute namespaces. Python will look through the stack from the top, and then return the first attribute it sees.
(The instance is on top, with its class right below it. Below the class are its super classes, in MRO.)
Method Resolution Order¶
There is a bit of a problem, however, when you consider that types may depend on other types, and those types may depend on the same types.
If we were to draw shapes on how types can inherit or derive from each other, the following shapes are possible:
Spreads out into a tree. No two classes derive from the same classes.
A tree but with diamonds. Some of the classes derive from the same classes.
Of note, circular paths are not possible. IE, A -> B -> C -> A This is because you can’t define a class as a superclass until it is actually created, and you can’t change its inheritance after it has been created.
The question arises which classes do we look at first, and in what order? This is the problem of Method Resolution Order, abbreviated MRO. In short, we are turning an arbitrary graph that follows the rules above into a straight line that visits all of the nodes on the graph in a particular order.
Python uses the C3 MRO algorithm, which isn’t very easy to describe but gives more-or-less obvious results. The general rules it tries to satisfy are:
Visit classes in the order they appear in the bases. IE, if your bases are
(A, B, C)
it will visitA
beforeB
and thenC
.If two classes inherit from the same superclasses, visit the subclasses before visiting any superclasses. This is a little bit more complicated to describe, but works as follows. Say you have
A
which inherits fromB
andC
which both inherit fromD
. Rather than visitingA
thenB
thenD
and thenC
, it will visitA
thenB
thenC
. and thenD
. That’s because it will visit the superclasses after it has visited all the subclasses.If two classes inherit from the same superclasses but in a different order, this is an error. You cannot define such a class. A simple example of this is you have
A
which derives fromB
andC
. However,B
derives fromD
and thenE
, whileC
derives fromE
andD
. Since it would violate the first rule no matter whether it visitsD
thenE
or the other way around, this is illegal. (It raises a
You can check out the method resolution order by lookin at the __mro__
attribute of the class.
Let’s look at some examples.
This shows that it will visit subclasses before their superclasses.
>>> class D: pass
>>> class B(D): pass
>>> class C(D): pass
>>> class A(B, C): pass
>>> A.__mro__
(<class '__main__.A'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.D'>, <class 'object'>)
This example will raise a TypeError
with a message about consistent MRO.
>>> class D: pass
>>> class E: pass
>>> class B(D,E): pass
>>> class C(E,D): pass
>>> class A(B,C): pass
Traceback (most recent call last):
File "<pyshell#57>", line 1, in <module>
class A(B,C): pass
TypeError: Cannot create a consistent method resolution
order (MRO) for bases D, E
The key takeaways from this is that Python chooses a really good MRO that shouldn’t surprise anyone. The key is that you must choose a consistent ordering of superclasses if you use multiple inheritance.
Some people don’t like to use multiple inheritance, but I don’t see a reason why you shouldn’t use it. If it fits, it fits.
object¶
All classes derive from the object
type. This is always included and
contains some rather useful methods.
isinstance¶
isinstance
will walk the bases to find any base that matches the type you
pass in. So in essence, you’re asking, “Is this object of type X or any type
that derives from X?”
super¶
Sometimes you override a method, but you want to fall back to the superclass’s method in specific circumstances. A simple example might be you want to convert the value to a float or something before calling the superclass’s method.
super()
does exactly that for you.
class B:
def foo(self, param):
return param
class A(B):
def foo(self, param):
param = float(param)
return super().foo(param)
How does super() work? It’s a bit magical, but it looks up the next class in
the MRO that has the foo
method, and calls that.
Before we had a fully functional super()
, we had to remember what the base
classes were and call the methods directly, or rely on super(instance,
class)
to figure it out or us. Nowadays things are much nicer than they
were!
Typically, you’ll be using super()
with the __init__
method. That’s
because you almost always need to call the __init__
of the superclasses to
make sure everything is initialized properly.
You might also call super()
for the special methods so that it falls back
to previous implementations. For instance, if I overrode
__getattr__
, I might want to fall back to the superclass’s version.
class B:
def __getattr__(self, name):
if name == 'b': return 'b'
else: raise AttributeError("No such attribute {!r}".format(name))
class A(B):
def __getattr__(self, name):
if name == 'a': return 'a'
else: return super().__getattr__(name)
A().a # -> 'a'
A().b # -> 'b'
A().c # -> AttributeError
super() and Descriptors¶
There is a special edge case that shouldn’t be too surprising when you invoke
super()
in relation to descriptors.
Suppose we have Owner
, a class with a descriptor in the desc
attribute. If you have Owner
as a superclass and super()
is looking
for a method with the name desc
, then it will invoke __get__
on the
descriptor. However, when it invokes __get__
, the class parameter will not
be Owner
– it will be the class of the instance.
If you don’t understand what that means, it’s probably not important. When you
actually run into this, super()
will do the least surprising thing.
Law of Demeter¶
There is a useful rule of thumb that I have used and it has greatly simplified
how my code reads. A problem arises when you have six or seven things in an
attribute access / write, For instance, a.b.thing.name.title.language
.
This can get really confusing, because the person reading and writing the code
needs to know what an a.b
is, what an a.b.thing
is, and what an
a.b.thing.name
is, and so on.
In order to simplify the code, as well as to simplify the interface, the Law of Demeter was introduced and I find it is quite useful.
The Law of Demeter, sometimes called “The principle of least knowledge”, says that you should never dig so deep in an attribute chain. In terms of Python, it says that in a class, you should not access the attributes of other instances of other classes. You may only access their methods.
Let’s look at a concrete example:
class MyClass:
def __init__(self):
self.a = SomeOtherClass()
def method(self, p1, p2):
self.a # yes
self.a.some_method() # yes
self.a.some_attribute # no
p1 # yes
p1.some_method() # yes
p1.some_attribute # no
r1 = p2.some_method() # yes
r1.some_method() # no
r1.some_attribute # no
The general rule is “use only one dot”, except for self.
, in which case you
can use 2 dots – one of them being part of self.
.
The benefit of the Law of Demeter is that there are fewer bugs, because the interfaces between objects is simpler. However, it also tends to increase the number of methods, and the more methods there are the more likely there is to be a bug. So it’s a tradeoff.
In practice, I try to keep the Law of Demeter strictly, but I always find some cases where violating it a little bit reduces the complexity of the code. Specifically, in Python attributes aren’t hidden and can sometimes be reliably used.
I like to document the attributes and then only carefully change that if need be – finding every place it is used and updating it. Or just avoiding it altogether!