Common Python mistakes #1: mixing up instance and class members

updated on 12 April 2022

By Natan Yellin, Robusta.dev co-founder

Here is a common Python mistake we see in PRs for robusta.dev on GitHub:

class Person:
  name: str

  def __init__(self, name):
    self.name = name

Don't do this!

There are two variables defined in the above code.

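First, there is a class variable name: str on line 2. (Strictly speaking, an annotation without a value only records the name in the class's __annotations__; it becomes a real class attribute once a value is assigned to it.)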

Second, there is an instance (member) variable self.name on line 5.

Those two variables have nothing to do with one another.

They happen to share the same name, but that's all. The instance variable on line 5 shadows the class variable on line 2; it does not modify it.
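To see that the two names live in different places, here is a minimal sketch. The Robot class and its species attribute are hypothetical, added only for contrast with Person; they are not part of the original example. It also shows that a bare annotation such as name: str does not create a class attribute on its own.

```python
class Person:
    name: str  # class-level annotation only; no value is assigned

    def __init__(self, name):
        self.name = name  # instance variable, stored on this object


class Robot:  # hypothetical class, for contrast
    species: str = "robot"  # class variable: annotation plus a value

    def __init__(self, species=None):
        if species is not None:
            self.species = species  # instance variable shadows the class one


alice = Person("Alice")
print("name" in Person.__annotations__)  # True
print(hasattr(Person, "name"))           # False
print(vars(alice))                       # {'name': 'Alice'}

plain = Robot()
custom = Robot("android")
print(vars(plain))     # {}  (nothing stored on the instance)
print(plain.species)   # robot  (lookup falls back to the class)
print(custom.species)  # android
print(Robot.species)   # robot  (the class variable is unchanged)
```

Attribute lookup checks the instance first and the class second, which is why the two variables can share a name without affecting each other.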

Why does this matter?

The above example will run fine. The danger is with more complicated code like this:

from typing import List

class CoffeeMachine:

  machine_logs: List[str] = []  # class variable: one list shared by all instances
  bean_type: str

  def __init__(self, bean_type="Robusta"):
    self.bean_type = bean_type

  def pour_cup(self):
    self.machine_logs.append(f"pouring one cup of {self.bean_type}")

In the above code, I forgot to initialize machine_logs as an instance variable in __init__. Normally, using self.machine_logs in pour_cup would then raise an AttributeError, but because machine_logs is also a class variable, the lookup falls back to the class and everything appears to work.

If you use many CoffeeMachines, constantly creating and destroying them, you'll have a memory leak. machine_logs is a class variable, so it is shared between all coffee machines and keeps growing even after individual machines are destroyed. Maybe this is what you wanted, maybe not. It's hard to tell when you're confusing class variables and instance variables.
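The sharing is easy to demonstrate. The sketch below repeats the class from above and adds one possible fix; the name FixedCoffeeMachine is hypothetical and assumes that a separate log per machine is what was intended.

```python
from typing import List


class CoffeeMachine:
    machine_logs: List[str] = []  # one list, shared by every instance

    def __init__(self, bean_type="Robusta"):
        self.bean_type = bean_type

    def pour_cup(self):
        self.machine_logs.append(f"pouring one cup of {self.bean_type}")


class FixedCoffeeMachine:  # hypothetical name for the corrected version
    def __init__(self, bean_type="Robusta"):
        self.bean_type = bean_type
        self.machine_logs: List[str] = []  # a fresh list per instance

    def pour_cup(self):
        self.machine_logs.append(f"pouring one cup of {self.bean_type}")


a = CoffeeMachine()
b = CoffeeMachine("Arabica")
a.pour_cup()
b.pour_cup()
print(a.machine_logs is b.machine_logs)  # True: same list object
print(len(CoffeeMachine.machine_logs))   # 2: both entries landed on the class

c = FixedCoffeeMachine()
d = FixedCoffeeMachine("Arabica")
c.pour_cup()
print(c.machine_logs)  # ['pouring one cup of Robusta']
print(d.machine_logs)  # []
```

In the fixed version each log is released together with its machine, so the list can no longer grow without bound.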

By the way, if you want to debug Python memory leaks on Kubernetes, robusta.dev is the best way to do so and it's open source.

When are class members not class members?

There is a huge exception to the above, which confuses things even more.

In Robusta we make heavy use of the Pydantic library. We often have code like this:

from pydantic import BaseModel

class CoffeeMachine(BaseModel):
  bean_type: str = "Robusta"

bean_type is a class member, right? So what does the following print?

>>> a = CoffeeMachine(bean_type="arabica")
>>> b = CoffeeMachine()
>>> print(a.bean_type, b.bean_type)
arabica Robusta

Huh. It's not a class member after all. Each instance of CoffeeMachine has its own copy of bean_type.

The obvious reason for this is Pydantic. With Pydantic you use class members to define a schema for your class. Then Pydantic generates the actual class with instance members that have the same name as your class members.

Pydantic does this with metaclasses, but that's a topic for another day.

If you use Python dataclasses then the exact same thing happens.
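Here is a minimal sketch of the dataclass equivalent, using only the standard library. The machine_logs field is carried over from the earlier example as an assumption about what you might want to store. Note that dataclasses reject a plain [] default with a ValueError when the class is defined, so the list is declared with field(default_factory=list) instead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoffeeMachine:
    bean_type: str = "Robusta"
    # default_factory builds a new list for every instance
    machine_logs: List[str] = field(default_factory=list)


a = CoffeeMachine(bean_type="arabica")
b = CoffeeMachine()
a.machine_logs.append("pouring one cup of arabica")

print(a.bean_type, b.bean_type)  # arabica Robusta
print(a.machine_logs)            # ['pouring one cup of arabica']
print(b.machine_logs)            # []
```

The generated __init__ assigns both fields on self, so each machine gets its own values even though they are declared at class level.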

So ignore the first half of this post for fields you declare on Pydantic models or dataclasses.

Closing thoughts

I love Pydantic, but mixing Pydantic and non-Pydantic classes in the same codebase can confuse developers who are new to Python. I hope this post clears things up.

Lastly, if you need to debug Python applications running on Kubernetes then you should be using Robusta.dev.

With Robusta.dev you can troubleshoot any Python application without restarting it or setting anything up in advance.

Specifically, you can:

1. Run CPU profilers and see what functions are using CPU

2. Debug memory leaks by seeing what was allocated and not freed

3. Attach non-breaking debuggers to code in production

Screenshots of all three are below:

[Images: Python memory analysis, Python debugger, Python profiler]

See the docs for details.
