August 15, 2022

Common Python mistakes #1: mixing up instance and class members

By Natan Yellin, Robusta.dev co-founder

Don't mix up Python class variables and instance variables

Here is a common Python mistake we see in PRs for robusta.dev on GitHub:

class Person:
  name: str

  def __init__(self, name):
    self.name = name

Don't do this!

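There are two declarations in the above code.

First, there is a class-level annotation name: str on line 2.

Second, there is an instance (member) variable self.name on line 5.

The two have nothing to do with one another.

They happen to share the same name, but that's all. Strictly speaking, a bare annotation like name: str only records a type hint in Person.__annotations__ and creates no class attribute. Give it a value (for example name: str = "unknown") and it becomes a real class variable, which the instance variable on line 5 then shadows.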
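A quick way to see what each declaration actually creates is to inspect the class and an instance. The sketch below reuses Person from above; PersonWithDefault is a hypothetical variant added only for comparison, so treat it as an illustration rather than code from the Robusta repository.

```python
class Person:
    name: str  # annotation only: no class attribute is created

    def __init__(self, name):
        self.name = name  # instance attribute, stored on each object


class PersonWithDefault:
    name: str = "unknown"  # annotation plus value: a real class attribute

    def __init__(self, name):
        self.name = name  # shadows the class attribute on this instance


p = Person("Ada")
q = PersonWithDefault("Grace")

print(hasattr(Person, "name"))  # False
print(PersonWithDefault.name)   # unknown
print(q.name)                   # Grace
print(vars(q))                  # {'name': 'Grace'}
```

Attribute lookup checks the instance first and only then falls back to the class, which is why q.name returns the instance value while PersonWithDefault.name is left untouched.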

Python Class Variables vs Instance Variables

The above example will run fine. The danger is with more complicated code like this:

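from typing import List

class CoffeeMachine:
  # class variable: a single list shared by every CoffeeMachine
  machine_logs: List[str] = []
  bean_type: str

  def __init__(self, bean_type="Robusta"):
    self.bean_type = bean_type  # instance variable

  def pour_cup(self):
    # appends to the shared class-level list, not a per-instance one
    self.machine_logs.append(f"pouring one cup of {self.bean_type}")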

In the above code, I forgot to initialize machine_logs as an instance variable in __init__. Normally, using self.machine_logs in pour_cup would raise an AttributeError, but because machine_logs is a class variable, the lookup falls back to the class and everything appears to work.

If you use many CoffeeMachines, constantly creating and destroying them, you'll have a memory leak. The list lives on the class, so it is shared by all coffee machines and keeps growing even after individual machines are destroyed. Maybe this is what you wanted, maybe not. It's hard to tell when you're confusing class variables and instance variables.
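To make the sharing visible, here is a minimal sketch that compares the class-variable version with one that initializes the list in __init__. The class names SharedLogMachine and PerInstanceLogMachine are made up for this comparison and don't appear in the original code.

```python
from typing import List


class SharedLogMachine:
    machine_logs: List[str] = []  # one list, shared by every instance

    def pour_cup(self):
        self.machine_logs.append("pouring one cup")


class PerInstanceLogMachine:
    def __init__(self):
        self.machine_logs: List[str] = []  # fresh list for each instance

    def pour_cup(self):
        self.machine_logs.append("pouring one cup")


a, b = SharedLogMachine(), SharedLogMachine()
a.pour_cup()
b.pour_cup()
print(len(a.machine_logs))               # 2: both cups went into the same list
print(a.machine_logs is b.machine_logs)  # True

c, d = PerInstanceLogMachine(), PerInstanceLogMachine()
c.pour_cup()
print(len(c.machine_logs), len(d.machine_logs))  # 1 0
```

If a shared log really is what you want, keeping it as a class variable is fine. Just make that choice deliberately rather than by forgetting a line in __init__.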

By the way, if you want to debug Python memory leaks on Kubernetes, robusta.dev is the best way to do so and it's open source.

When are class members not class members?

There is a huge exception to the above, which confuses things even more.

In Robusta we make heavy use of the Pydantic library. What is Pydantic? It's an incredibly useful Python library that autogenerates constructors and validators for your classes based on type annotations:

from pydantic import BaseModel

class CoffeeMachine(BaseModel):
  bean_type: str = "Robusta"

cm = CoffeeMachine(bean_type="arabica")
cm2 = CoffeeMachine(bean_type=1) # raises a ValidationError in Pydantic v2 because 1 is not a string (Pydantic v1 coerces it to "1")

Now that we've explained what Pydantic is, let's get back to discussing Python class variables. In the above example, bean_type appears to be a class variable. So what does the following print?

>>> a = CoffeeMachine(bean_type="arabica")
>>> b = CoffeeMachine()
>>> print(a.bean_type, b.bean_type)
arabica Robusta

Huh. It's not a class variable after all. Each instance of CoffeeMachine has its own copy of bean_type.

The obvious reason for this is Pydantic. With Pydantic you use class members to define a schema for your class. Then Pydantic generates the actual class with instance members that have the same name as your class members.

Pydantic does this with metaclasses, but that's a topic for another day.

If you use Python dataclasses then much the same thing happens, although dataclasses generate the constructor with a class decorator rather than a metaclass.
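Here is a small standard-library sketch of the dataclass case. It reuses the CoffeeMachine name from above; adding machine_logs with field(default_factory=list) is my assumption about how you would want the log to behave, not something taken from Robusta's code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoffeeMachine:
    bean_type: str = "Robusta"
    # A bare `= []` default is rejected by dataclasses with a ValueError;
    # default_factory builds a new list for each instance instead.
    machine_logs: List[str] = field(default_factory=list)

    def pour_cup(self):
        self.machine_logs.append(f"pouring one cup of {self.bean_type}")


a = CoffeeMachine(bean_type="arabica")
b = CoffeeMachine()
a.pour_cup()

print(a.bean_type, b.bean_type)  # arabica Robusta
print(a.machine_logs)            # ['pouring one cup of arabica']
print(b.machine_logs)            # []
```

Each instance gets its own bean_type and its own log list. Unlike Pydantic, dataclasses don't validate types at runtime, so CoffeeMachine(bean_type=1) would be accepted without complaint.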

So ignore the first half of this post if you use Pydantic or dataclasses.

Closing thoughts

I love Pydantic, but mixing Pydantic and non-Pydantic classes in the same codebase can confuse developers who are new to Python. I hope this post clears things up.

Lastly, if you need to debug Python applications running on Kubernetes then you should be using Robusta.dev.

With Robusta.dev you can troubleshoot any Python application without restarting it or setting anything up in advance.

Specifically, you can:

1. Run CPU profilers and see what functions are using CPU

2. Debug memory leaks by seeing what was allocated and not freed

3. Attach non-breaking debuggers to code in production


See the docs for details.
