Methods

Written by Alex Guyer | guyera@oregonstate.edu

This lecture is about methods. We'll cover basic methods, constructors (__init__() methods), and objects.

Basic methods

Attributes allow us to establish "has-a" relationships. For example, every city has a name, and every city has a population. See the POD types lecture notes for a reminder about attributes.

But, as I mentioned in the POD types lecture, POD types represent the most limited use case of classes. Indeed, classes can be used to create much more complicated data types than just POD types.

For one, classes can have methods. A method is a function that exists within instances of a class, just as an attribute is a variable that exists within instances of a class. In some sense, if an attribute represents a "has-a" relationship, then a method represents a "can" relationship. For example, a Dog class might have a bark() method. In such a case, barking would be something that all Dog instances can do (dogs can bark, hence the "can" relationship).

Let's start simple. In our modules and packages lecture, we created a Dog class and a print_dog() function. It looked like this:

dog.py
class Dog:
    name: str
    birth_year: int

def print_dog(dog: Dog) -> None:
    print(f'{dog.name} was born in {dog.birth_year}')

If we wanted to print a Dog instance, we would pass it as an argument to the print_dog() function. For example:

main.py
from animals.dog import Dog, print_dog

def main() -> None:
    spot = Dog()

    spot.name = 'Spot'
    spot.birth_year = 2022

    print_dog(spot)

if __name__ == '__main__':
    main()

But suppose we want to change our representation a bit. Rather than having a global function that prints a given Dog instance, suppose we want Dog instances to be able to print their own information to the terminal. That's to say, suppose we want to establish the "can" relationship: dogs can print their information to the terminal.

To do this, we might create a method named print() within the Dog class. The syntax for defining a method is as follows:

def <name>(self, <parameter 1>, ..., <parameter N>) -> <return type>:
    <method body>

Importantly, method definitions (as above) must be placed inside class definitions, similar to attribute declarations.

You might have noticed that the above syntax looks very similar to a regular function definition. That shouldn't be surprising; methods are just functions that exist inside class instances. Anyways, replace <name> with the method's name, replace each <parameter X> with a valid function parameter declaration (e.g., x: int, if you want the method to have an integer parameter named x), replace <return type> with the method's return type, and replace <method body> with the block of code that you want to be executed when the method is called.

You might be wondering what self means in the above syntax. Well, in order to explain that (and before I can provide a fully functional example of a method), I first have to explain how methods are called. Suppose that we define a print() method within the Dog class. First, remember that the dot operator can be used to access things that exist inside class instances. For example, to access the name attribute within the Dog instance spot, we would write spot.name (as in main.py above). Second, remember that methods are functions that exist inside class instances. Putting these two facts together, it stands to reason that, in order to access and call the print() method within the Dog instance spot, we would write something like spot.print() (possibly passing in some arguments, if applicable). And indeed, that's precisely how you access methods.

This often confuses students, so I'll try to explain it in another way: methods exist inside class instances, so if you don't have a class instance, then you don't have a method. Indeed, if the Dog class defined a print() method, we couldn't simply call it via print(). That wouldn't make any sense; print(), being a method of the Dog class, exists inside Dog instances. It does not exist as a standalone function. So, in order to use it, we first have to create a Dog instance, such as spot. Just as spot has a name attribute and a birth_year attribute, spot also has a print() method (or, equivalently, "spot can print"). To access spot's name, we write spot.name. By the same token, to call spot's print() method, we write spot.print() (again, possibly passing in some arguments, if applicable).

The implication here is that every time you call a method, you must access it from within some particular class instance using the dot operator. Something like spot.print(), or fluffy.print(), or bella.print(), but never just print().

How does this all relate to self? Well, when you call a method, you're always calling it "on" a specific class instance. That instance is sometimes, in the context of some programming languages, referred to as the calling object (this is not common terminology in the context of Python, but it's a useful term, so I'm going to use it anyways). For example, if we call spot.print(), then spot is the calling object. If we call fluffy.print(), then fluffy is the calling object. And so on. Here's the kicker: whenever you call a method, the calling object is implicitly and automatically passed in as a sort of "hidden argument" to the method. Hence, the method must have a corresponding parameter to serve as a placeholder for that implicit argument. This is what self is. It's just a parameter that, in the context of the method body, refers to the calling object.

It doesn't even technically need to be named self. You can actually name it whatever you want. However, it's unconventional to name it anything other than self, so please don't do that. Moreover, understand that the calling object is always implicitly passed as the first argument to the method call, so the first parameter will always refer to the calling object. This means that every method you ever create (in Python) should always have a self parameter, and it should always be the first parameter of the method (and sometimes the only parameter of the method). Hence the above syntax.

This is also why the self parameter does not need a type annotation (though you can, technically, type-annotate it if you'd like). Since it's always a reference to the calling object, Mypy already knows what its type is.

Finally, that's enough information to show you an actual example. Let's get rid of the print_dog() function and replace it with a print() method that's defined inside the Dog class:

dog.py
class Dog:
    name: str
    birth_year: int

    # Methods are defined inside classes, so we define the Dog
    # print() method here. Make sure that it's indented over so that
    # it's recognized as belonging to the class.
    def print(self) -> None:
        # self refers to the calling object. For example, if we call
        # spot.print(), then self refers to spot (spot is implicitly
        # passed in as an argument to the self parameter). Hence,
        # if we print self.name, that will print the name attribute
        # of the calling object. And if we print self.birth_year,
        # that will print the birth_year attribute of the calling
        # object. Therefore, spot.print() will print spot's information
        # to the terminal, whereas fluffy.print() will print fluffy's
        # information to the terminal, and so on.
        print(f'{self.name} was born in {self.birth_year}')

Now, suppose we have a Dog instance named spot, such as in main.py. Previously, if we wanted to print spot's information to the terminal, we would write print_dog(spot), thereby passing spot as an argument to the print_dog() function. Now, we would instead write spot.print(). This executes the print() method of the Dog class, but specifically the one that exists inside spot. This, in turn, passes in spot as the argument to the self parameter of the print() method. When it then prints self.name and self.birth_year (see the above code), it will be printing spot's name and birth year.

Let's update main.py accordingly:

main.py
from animals.dog import Dog

def main() -> None:
    spot = Dog()

    spot.name = 'Spot'
    spot.birth_year = 2022

    # Instead of print_dog(spot), we now write spot.print()
    spot.print()

if __name__ == '__main__':
    main()

Running the above program produces the following output:

(env) $ python main.py 
Spot was born in 2022
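
To make the role of self a bit more concrete, here is a minimal, self-contained sketch. It is not part of the animals package above: the describe() and age_in() methods and the fluffy instance are hypothetical names introduced only for illustration. It shows that self refers to whichever instance the method is called on, that calling a method on an instance behaves the same as passing that instance explicitly as the first argument, and that a method can accept additional parameters after self.

```python
class Dog:
    name: str
    birth_year: int

    # Hypothetical method: returns the description instead of printing it
    def describe(self) -> str:
        return f'{self.name} was born in {self.birth_year}'

    # Hypothetical method with one extra parameter after self
    def age_in(self, year: int) -> int:
        return year - self.birth_year


spot = Dog()
spot.name = 'Spot'
spot.birth_year = 2022

fluffy = Dog()
fluffy.name = 'Fluffy'
fluffy.birth_year = 2019

# self refers to whichever instance the method is called on
spot_description = spot.describe()
fluffy_description = fluffy.describe()

# Equivalent but unconventional: look the function up on the class
# and pass the calling object explicitly as the self argument
explicit_description = Dog.describe(spot)

# Only the year is passed explicitly; spot is passed implicitly as self
spot_age = spot.age_in(2025)

print(spot_description)
print(fluffy_description)
print(explicit_description == spot_description)
print(spot_age)
```

In practice, you would almost always write spot.describe() rather than Dog.describe(spot). The explicit form is shown only to make the "hidden argument" visible.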

Constructors (__init__() methods)

So, what the heck is the point? What was wrong with print_dog(spot)? Why did we have to introduce a whole new language feature just so that we could instead write spot.print()? Well, there are many very good reasons that methods exist: things that they allow you to do that otherwise wouldn't be possible. We'll be covering many of these things incrementally throughout the course, but we'll start with a simple one: constructors.

A constructor is a special method that's automatically called on a class instance the moment it's first created. The purpose of a constructor is to "set up" the instance, initializing its attributes to a valid state.

For example, in main.py above, we create a Dog instance via spot = Dog(). But as we saw in the POD types lecture, spot is not immediately initialized to a valid state. Rather, from the moment spot is created within main(), its attributes are undefined until we explicitly define them (spot.name = 'Spot', and spot.birth_year = 2022). If we forget to initialize spot's attributes, they'll remain undefined. Suppose we then call spot.print(). That would result in a runtime error when the print() method attempts to print an undefined attribute:

main.py
from animals.dog import Dog

def main() -> None:
    spot = Dog()

    # Suppose we forget to initialize spot's attributes

    # Then spot.print() will throw an AttributeError when it tries
    # to print undefined attributes
    spot.print()

if __name__ == '__main__':
    main()

Running the above program produces the following output:

(env) methods $ python main.py 
Traceback (most recent call last):
  File "/home/alex/instructor/static-content/guyera.github.io/code-samples/methods/main.py", line 13, in <module>
    main()
    ~~~~^^
  File "/home/alex/instructor/static-content/guyera.github.io/code-samples/methods/main.py", line 10, in main
    spot.print()
    ~~~~~~~~~~^^
  File "/home/alex/instructor/static-content/guyera.github.io/code-samples/methods/animals/dog.py", line 18, in print
    print(f'{self.name} was born in {self.birth_year}')
             ^^^^^^^^^
AttributeError: 'Dog' object has no attribute 'name'

"So just don't forget to initialize your attributes," I hear you say. But I often say: a good interface serves to prevent mistakes. Forgetting to initialize an attribute is a particularly common mistake. If we define a constructor for the Dog class that initializes each Dog instance's attributes to some specified values, then such mistakes become impossible. That would be a much better interface.

In Python, a constructor is defined as a method with the name __init__ and a return type of None (it must have that exact name, including the double underscores at the beginning and end). Besides that, it's defined just like any other method; it must have a self parameter, and it can optionally have additional parameters.

In our case, the constructor's job is to initialize the name and birth_year attributes of each newly created Dog instance, so it needs to know what values it should initialize those attributes to. This is the purpose of parameters in a constructor: to give the constructor some information about the instance that's being created.

Let's add a constructor to our Dog class:

dog.py
class Dog:
    name: str
    birth_year: int

    def __init__(self, n: str, b: int) -> None:
        # n is the dog's name, and b is the dog's birth year.
        # Store them in self.name and self.birth_year
        self.name = n
        self.birth_year = b

    def print(self) -> None:
        print(f'{self.name} was born in {self.birth_year}')

The moment a Dog instance is created, the Dog constructor will automatically be called on that instance. This means that, in the context of a constructor, self refers to the instance that's currently being created. The n parameter specifies the newly created dog's name, and the b parameter specifies its birth year. The constructor then stores those values into the respective name and birth_year attributes of the Dog instance (accessible via self.name and self.birth_year).

But how do we call this method? Well, I mentioned that a class's constructor is automatically called on an instance of the class the moment it's created. In main.py, we create spot via spot = Dog(). The right-hand side of that assignment, Dog(), is in fact a constructor call. You might remember that I left a comment in the POD types lecture notes stating that class instantiation syntax looks similar to a function call. Indeed, it is a function call. Dog() calls the Dog class's constructor, and that's exactly how Dog instances are created. (Technically, even if you don't define a constructor for a class, the class still has a constructor; it just accepts no arguments and does nothing.)
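
As a small illustration of that default behavior, consider the sketch below. The Cat class and the whiskers variable are hypothetical and exist only for this example. The class defines no __init__() method, so it can be instantiated with no arguments, but passing an argument fails at runtime.

```python
class Cat:
    # Hypothetical class that does not define an __init__() method
    name: str


# The default constructor accepts no arguments and initializes nothing
whiskers = Cat()
whiskers.name = 'Whiskers'

# Passing an argument to the default constructor raises a TypeError
# at runtime (Mypy would also report this line as an error)
try:
    Cat('Whiskers')
    default_rejects_arguments = False
except TypeError:
    default_rejects_arguments = True

print(whiskers.name)
print(default_rejects_arguments)
```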

Currently, we're not passing in any arguments to the Dog constructor when we construct spot. But now, we've defined the Dog constructor to have two parameters (in addition to self): 1) n, the dog's name, and 2) b, the dog's birth year. Hence, when we call the Dog constructor to construct spot, we must pass in two arguments corresponding to these parameters. Let's update main() accordingly:

main.py
from animals.dog import Dog

def main() -> None:
    # Call the Dog constructor to create spot, passing in "Spot"
    # as the name and 2022 as the birth year. The constructor will
    # then store these values within spot.name and spot.birth_year
    spot = Dog('Spot', 2022)

    # Print spot's information to the terminal
    spot.print()

if __name__ == '__main__':
    main()

Now, it's essentially impossible to create Dog instances that aren't immediately initialized to a valid state: in order to create a Dog instance, we must specify its name and birth year, and the constructor automatically stores those values in the instance's appropriate attributes. And if we made a mistake in the constructor call (e.g., spot = Dog(), leaving out the arguments), that would be detected immediately by Mypy, making such mistakes extremely easy to locate and diagnose.
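
Even without Mypy, such a mistake can't go unnoticed for long. The following sketch repeats the Dog class in a single file (rather than in the animals package) so that it can be run on its own, and shows that leaving out the constructor arguments fails at the constructor call itself rather than later, inside print(). The construction_failed variable is a hypothetical name used only to record the outcome.

```python
class Dog:
    name: str
    birth_year: int

    def __init__(self, n: str, b: int) -> None:
        self.name = n
        self.birth_year = b


# A correct constructor call initializes both attributes immediately
spot = Dog('Spot', 2022)

# Leaving out the arguments raises a TypeError right at the call
# (Mypy would also report this line before the program ever runs)
try:
    broken = Dog()
    construction_failed = False
except TypeError:
    construction_failed = True

print(spot.name, spot.birth_year)
print(construction_failed)
```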

Objects

To describe spot, I've been using the terms "class instance" and "variable" a lot. But there's actually another term that's commonly used to describe things like spot: objects.

The exact definition of an object depends on the context and who you ask. One possible definition simply states that an object is an instance of a class (i.e., it's synonymous with "class instance"). A slightly more rigorous definition requires objects to have both state (attributes, or "has-a" relationships) and behavior (methods, or "can" relationships). This would mean that spot refers to an object since it has attributes and methods, but primitives (e.g., integers) and instances of POD types are not objects because they don't have both state and methods. When we discuss object-oriented programming, this is the definition that we'll go with: something with both state and behavior.

But there's another much looser definition that states that objects are simply values, regardless of the types of those values. Under this definition, even primitives like integers, floats, and doubles are considered to be objects. This is actually the definition that's used by the Python language specification. In Python, technically, everything is an object. This is a somewhat uncommon definition, though, and the reason that Python takes this stance has to do with how Python stores, identifies, and references values under the hood. But we'll save that for a future lecture.
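
A quick way to see this looser definition at work in Python is sketched below. The variable names are hypothetical; the calls themselves (isinstance(), int.bit_length(), str.upper(), and float.is_integer()) are standard built-ins. Even plain integers, strings, and floats are instances of object and have methods that can be called with the dot operator.

```python
number = 255
word = 'spot'
ratio = 3.5

# Every value is an instance of the built-in object type
all_are_objects = (
    isinstance(number, object)
    and isinstance(word, object)
    and isinstance(ratio, object)
)

# Even "primitive" values have methods
bits_needed = number.bit_length()  # 255 is 11111111 in binary
shouted = word.upper()
is_whole = ratio.is_integer()

print(all_are_objects)
print(bits_needed)
print(shouted)
print(is_whole)
```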