A Strong Reference to Weak References in Python
Understanding Python’s memory management with weak references
When working with Python (and many other languages), you often rely on the runtime to manage memory for you. Most of the time this works invisibly, but certain patterns, such as objects that reference each other in cycles, long-lived caches, or subscriber lists, can create memory leaks if not handled carefully.
This happens because references in Python are strong by default: an object is kept alive as long as at least one strong reference to it exists in the program. In cyclic data structures or in caches, these strong references can unnecessarily delay the deallocation of the objects they point to.
Weak references provide a way to refer to objects without preventing them from being garbage collected. They let you build caches that automatically empty, subscriber lists that clean themselves up, and other data structures that will not accidentally extend object lifetimes.
In this article we will explore what weak references are, why they matter, and how to use them in Python. We will start with a review of reference counting, look at its limitations, and then dive into weak references and their practical uses.
A Review of Reference Counting
Many languages either use reference counting to manage runtime memory or provide first-class primitives for it.
In this scheme, every object has an associated reference count, which is the number of places it is being used. For example, when you create an object and assign it to a variable, it has a reference count of 1. When you assign it to another variable or pass it to a function, its reference count goes up by 1.
Similarly, when a variable goes out of scope or a function call returns, the reference count of each object it referred to is decremented. If the reference count of an object reaches 0, the object gets deallocated or garbage collected.
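To make the counting concrete, here is a minimal sketch using sys.getrefcount. The Thing class and the refcount_demo function are illustrative names, not part of the article’s example. The absolute numbers reported by sys.getrefcount can vary between Python versions, so the sketch compares differences rather than raw values.

```python
import sys


class Thing:
    """A trivial class, used only so there is an object to count references to."""


def refcount_demo():
    obj = Thing()
    base = sys.getrefcount(obj)         # baseline; absolute value depends on the Python version

    alias = obj                         # a second name for the same object
    after_alias = sys.getrefcount(obj)

    holder = [obj]                      # a container entry is a reference too
    after_list = sys.getrefcount(obj)

    del alias                           # drop the extra name
    holder.clear()                      # drop the container's reference
    after_release = sys.getrefcount(obj)

    return base, after_alias, after_list, after_release


base, after_alias, after_list, after_release = refcount_demo()
print("one extra name:       ", after_alias - base)     # +1
print("plus a list entry:    ", after_list - base)      # +2
print("after releasing both: ", after_release - base)   # back to 0
```

Each new name or container entry raises the count by one, and releasing them brings it back to the baseline.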
CPython uses reference counting to manage memory in its runtime, and other languages offer it as well. For example, shared-ownership smart pointers such as std::shared_ptr in C++ or Rc and Arc in Rust use reference counting under the hood: copying or cloning the pointer increments the count, and destroying a copy decrements it.
If you want to understand how CPython implements reference counting internally, you can check out my article on that topic:
How CPython Implements Reference Counting: Dissecting CPython Internals
Limitations of Reference Counting
Reference counting works well for most cases, but it is not a complete solution. Its simplicity comes with trade‑offs, and understanding these limitations helps motivate why Python also offers weak references.
One of those limitations is cyclic references. Cyclic references exist when objects hold references to each other in a cycle, e.g. in a graph data structure. But you can also end up creating cyclic references accidentally in complex systems. In such cases, reference counting alone can never free the objects in the cycle, because their counts do not drop to 0 until the cycle is broken. This is why CPython also implements a cycle-breaking garbage collector (GC) that runs periodically, scans objects for cycles and, if it finds cycles that are no longer referenced from anywhere else, breaks them so that those objects can be freed.
Cyclic references can be problematic for performance because memory usage remains high until the GC runs, and the GC scan itself can be expensive (depending on the number of objects it needs to scan).
We can understand this with the help of an example program, cycles.py.
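A sketch of the program at this stage, pieced together from the fuller listing later in the article and limited to the parts described next (small details may differ from the author’s original file):

```python
#!/usr/bin/env python
import gc
import sys


class MyNode:
    def __init__(self, name: str):
        self.name: str = name
        self.next = None

    def __del__(self):
        # Runs when the object is about to be destroyed
        print(f"{self.name} is being deleted")


def print_node_objects():
    # Scan the objects tracked by the garbage collector for live MyNode instances
    obj_count = 0
    for o in gc.get_objects():
        if type(o) is MyNode:
            obj_count += 1
            print(f"{o.name} exists with referrers: {[n.name for n in gc.get_referrers(o) if type(n) is MyNode]}")
    if obj_count == 0:
        print("No MyNode objects found")


def test1():
    # Two independent nodes, no links between them
    n1 = MyNode("n1")
    n2 = MyNode("n2")
    print(f"n1 refcount: {sys.getrefcount(n1)}")
    print(f"n2 refcount: {sys.getrefcount(n2)}")


if __name__ == '__main__':
    test1()
    print_node_objects()
```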
Let’s break it down:
The MyNode class implements a linked list node with a next field.
print_node_objects is a utility function. It finds all the MyNode objects that are currently alive and then prints their referrers, i.e., who is holding a reference to them.
It uses gc.get_objects() to get the list of all objects currently tracked by the garbage collector and filters it down by checking their type and selecting only MyNode objects.
It finds the referrers of an object by using the gc.get_referrers() function, which returns a list of referrer objects. We filter this list by type because other objects, such as the list returned by gc.get_objects(), also hold references to the node during the call, and we want to filter those out.
In the main function we call the test1() function, which creates two MyNode objects, prints their reference counts and returns. After returning from test1, we call print_node_objects() to see if there are any MyNode objects that are still alive.
If you run this program, you should see an output like the following:
➜ uv run --python 3.13 -- cycles.py
n1 refcount: 2
n2 refcount: 2
n1 is being deleted
n2 is being deleted
No MyNode objects found
This is pretty much the expected output, but let’s spend a moment to ensure we don’t miss anything.
We see that the reference count for both n1 and n2 is 2. You might expect it to be 1, but it is 2 because passing the object as an argument to sys.getrefcount temporarily adds one more reference to it.
We see that the __del__ method of both objects gets called and prints a message. This happens because n1 and n2 are local variables inside test1(), and when it returns, its stack frame gets destroyed, which results in the reference counts of all of its local objects (parameters and locally created variables) being decremented. In this case, because n1 and n2 reached reference count 0, they were deallocated and their __del__ method was called.
Finally, in main(), when print_node_objects() is called, we see that it does not find any MyNode objects on the heap that are still alive.
Next, we can do another test that creates a cycle between n1 and n2 and see that the objects stay alive after the return from the test function. The updated code below adds a new function test2() and calls it from main.
#!/usr/bin/env python
import gc
import sys


class MyNode:
    def __init__(self, name: str):
        self.name: str = name
        self.next = None

    def __del__(self):
        print(f"{self.name} is being deleted")


def print_node_objects():
    obj_count = 0
    for o in gc.get_objects():
        if type(o) is MyNode:
            obj_count += 1
            print(f"{o.name} exists with referrers: {[n.name for n in gc.get_referrers(o) if type(n) is MyNode]}")
    if obj_count == 0:
        print("No MyNode objects found")


def test2():
    n1 = MyNode("n1")
    n2 = MyNode("n2")
    n1.next = n2
    n2.next = n1
    print(f"n1 refcount: {sys.getrefcount(n1)}")
    print(f"n2 refcount: {sys.getrefcount(n2)}")


def test1():
    n1 = MyNode("n1")
    n2 = MyNode("n2")
    print(f"n1 refcount: {sys.getrefcount(n1)}")
    print(f"n2 refcount: {sys.getrefcount(n2)}")


if __name__ == '__main__':
    test1()
    print_node_objects()
    print("---------------------")
    test2()
    print_node_objects()
If we run this program, we should see the following output:
➜ uv run --python 3.13 -- cycles.py
n1 refcount: 2
n2 refcount: 2
n1 is being deleted
n2 is being deleted
No MyNode objects found
---------------------
n1 refcount: 3
n2 refcount: 3
n1 exists with referrers: ['n2']
n2 exists with referrers: ['n1']
n1 is being deleted
n2 is being deleted
Let’s focus on the output after the call to test2().
We see that in test2(), the reference count for n1 and n2 is 3, one higher than what it was in test1(). This is due to n1.next creating a reference to n2 and n2.next creating a reference to n1.
We also see that when test2() returns, the __del__ method of n1 and n2 is not called, which means that those objects are not deallocated and are still alive. This happens because during the return, the interpreter decrements their reference counts, but this time the counts do not reach 0.
After returning from test2(), when we call print_node_objects(), it tells us that the MyNode objects we created for n1 and n2 are still alive. We can also see that they are alive because they are holding cyclic references to each other.
n1 and n2 finally get destroyed as the program ends because the CPython interpreter runs the GC before shutting down.
To keep such cyclic references from leaking memory, CPython includes a garbage collector that runs periodically, detects cycles that are no longer referenced from anywhere else, and breaks them so that the objects that are part of the cycle can get deallocated. You can verify it yourself by inserting a gc.collect() call after the call to test2() in the above program.
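A minimal sketch of that check, under a few assumptions: instead of printing, __del__ records names in a list called deleted, the cycle is built in a helper named make_cycle, and automatic collection is switched off for the duration of the demo so that only the explicit gc.collect() call can free the cycle. These names are illustrative and not from the original program.

```python
import gc

deleted = []  # names recorded by __del__, used here in place of printing


class MyNode:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        deleted.append(self.name)


def make_cycle():
    # Same shape as test2(): two nodes that point at each other
    n1 = MyNode("n1")
    n2 = MyNode("n2")
    n1.next = n2
    n2.next = n1


def demo():
    was_enabled = gc.isenabled()
    gc.disable()                     # keep an automatic collection from running mid-demo
    try:
        make_cycle()
        before = sorted(deleted)     # nothing deleted yet: the cycle keeps both counts above 0
        gc.collect()                 # explicit collection finds and frees the unreachable cycle
        after = sorted(deleted)
    finally:
        if was_enabled:
            gc.enable()
    return before, after


before, after = demo()
print("after the function returned:", before)   # []
print("after gc.collect():         ", after)    # ['n1', 'n2']
```

The names are sorted before printing because the order in which the collector finalizes the objects of a cycle is not something to rely on.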
If you want to understand how the CPython garbage collector detects and breaks cycles, read my article on its internals:
CPython Garbage Collection: The Internal Mechanics and Algorithms
However, there are other ways to avoid such pitfalls of reference counting, and weak references are one of them. Let’s understand what they are and how they work.
Understanding Weak References
Weak references sit at the opposite end of the spectrum from strong references. A weak reference does not increase the reference count of the underlying object, so it lets you use an object without prolonging its lifetime.
When the object’s reference count goes to 0, it can get deallocated even if there are weak references to it that are still being used. Naturally, this requires that when using a weak reference to an object, we always need to check if the underlying object is still alive.
In Python, to create a weak reference, we use the weakref.ref() function from the weakref module and pass it the object for which we want to create a weak reference. For example:
n1_weakref = weakref.ref(n1)
weakref.ref() creates a weak reference to the given object and returns a callable. To access the underlying object we need to invoke this callable every time. If the object is still alive, the call returns the object; otherwise it returns None. For example:
node = n1_weakref()  # resolve once; the result is either the object or None
if node is not None:
    print(f"name: {node.name}")
else:
    print("n1 no longer exists")
The following figure shows a full example of creating a weak reference and accessing it in our running linked list example.
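A minimal sketch of such a program, written to be consistent with the output shown next rather than copied from the author’s file. The report helper and the test_weakref name are assumptions made for illustration; MyNode and print_node_objects follow the earlier listing.

```python
import gc
import sys
import weakref


class MyNode:
    def __init__(self, name: str):
        self.name: str = name
        self.next = None

    def __del__(self):
        print(f"{self.name} is being deleted")


def print_node_objects():
    obj_count = 0
    for o in gc.get_objects():
        if type(o) is MyNode:
            obj_count += 1
            print(f"{o.name} exists with referrers: {[n.name for n in gc.get_referrers(o) if type(n) is MyNode]}")
    if obj_count == 0:
        print("No MyNode objects found")


def report(ref):
    # Resolve the weak reference once; the temporary strong reference
    # held in `node` is released when this helper returns.
    node = ref()
    if node is not None:
        print(f"n1's name: {node.name}")
    else:
        print("n1 no longer exists")


def test_weakref():
    n1 = MyNode("n1")
    print(f"n1 refcount: {sys.getrefcount(n1)}")
    n1_weakref = weakref.ref(n1)                  # does not add a strong reference
    print(f"n1 refcount: {sys.getrefcount(n1)}")
    report(n1_weakref)                            # object is alive
    del n1                                        # last strong reference goes away
    report(n1_weakref)                            # weak reference is now dead


if __name__ == "__main__":
    test_weakref()
    print("---------------------")
    print_node_objects()
```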
Output:
➜ uv run --python 3.13 -- weakref_cycles.py
n1 refcount: 2
n1 refcount: 2
n1's name: n1
n1 is being deleted
n1 no longer exists
---------------------
No MyNode objects found
From the output we can confirm a few things:
Creating a weak reference does not increase the object’s reference count
A weak reference does not prevent the object from being deallocated if its reference count goes to 0. In the example, we deleted n1 and after that we were not able to access it through the weak reference.
I leave the problem of fixing the cyclic reference that we created in test2() as an exercise for you.
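For readers who want something to compare their answer against, here is one possible approach, offered as a sketch and not as the author’s solution. It assumes the forward link stays strong while the link back from n2 to n1 is stored as a weak reference. The test2_fixed name and the deleted list are illustrative, and automatic garbage collection is disabled during the run to show that reference counting alone now frees both nodes.

```python
import gc
import weakref

deleted = []  # names recorded by __del__, used here in place of printing


class MyNode:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        deleted.append(self.name)


def test2_fixed():
    n1 = MyNode("n1")
    n2 = MyNode("n2")
    n1.next = n2                   # strong forward link
    n2.next = weakref.ref(n1)      # weak link back: does not keep n1 alive
    return n2.next() is n1         # calling the weak reference yields n1 while it is alive


was_enabled = gc.isenabled()
gc.disable()                       # rule out the cycle collector for this demo
try:
    link_ok = test2_fixed()
    freed = sorted(deleted)        # both nodes are already gone, without gc.collect()
finally:
    if was_enabled:
        gc.enable()

print("weak link resolved to n1:", link_ok)   # True
print("freed on return:", freed)              # ['n1', 'n2']
```

The cost of this design is that code following the weak link has to call it and handle a None result.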
Other Use Cases of Weak References
So far we’ve seen weak references as a tool for avoiding cycles, but their utility goes well beyond that. The weakref module also provides ready-made containers built on top of weak references. These containers, WeakValueDictionary and WeakSet, help you manage auxiliary data structures that should not extend the lifetimes of their contents. They solve practical problems such as caching, registries, and subscriber lists, where automatic cleanup is not just convenient but essential for avoiding leaks.
WeakValueDictionary
The weakref module provides WeakValueDictionary, which looks and behaves like a normal dictionary but with an important twist: its values are held only through weak references. If a value is no longer strongly referenced anywhere else, the dictionary entry disappears automatically.
This makes WeakValueDictionary a natural fit for caching and memoization. Imagine you compute expensive results or load large data structures and want to reuse them if they are still in memory. At the same time, you don’t want the cache itself to keep them alive forever. A WeakValueDictionary strikes that balance: it holds onto results only as long as the rest of the program does.
Another classic application is object interning or registries. For example, you may want to ensure there is only one canonical object representing a resource (like a symbol table entry, database connection, or parsed schema). By using a WeakValueDictionary, you avoid artificially extending the lifetimes of those resources.
Here’s a simple illustration:
import weakref

class Data:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Data({self.name})"

cache = weakref.WeakValueDictionary()
obj = Data("expensive_result")
cache["key"] = obj
print("Before deletion:", dict(cache))

# Drop the strong reference
obj = None
print("After deletion:", dict(cache))
Output:
Before deletion: {'key': Data(expensive_result)}
After deletion: {}
Notice how the cache entry vanishes automatically once the last strong reference goes away. There is no need for manual cleanup. Under the hood, this is implemented with weakref callbacks—the same mechanism we’ll see in the callback section.
WeakSet
Another container provided by the weakref module is WeakSet. This is similar to a regular set, except that it holds weak references to its elements. If an object is garbage collected, it will automatically vanish from the set.
One scenario where this is very handy is when you want to keep track of subscribers, observers, or listeners. These are objects that register interest in events produced by another object (often called the publisher). For instance:
GUI frameworks: widgets listen to events such as theme changes or window resizes.
Event buses: services subscribe to log events, metrics, or domain events.
Plugin systems: plugins register callbacks at load time to respond to hooks.
Background services: transient sessions (e.g., WebSocket connections) listen for updates from a long‑lived manager.
In all these cases, subscribers are often short‑lived, while the publisher lives much longer. Using a regular set to hold them risks memory leaks, because a strong reference in the set will keep the subscriber alive even when the rest of the program has forgotten it. With a WeakSet, subscribers that are no longer strongly referenced anywhere else drop out of the set automatically once they are collected, so you don’t need explicit unsubscribe logic in every shutdown path.
Here’s a simple example:
import gc
import weakref

class Listener:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Listener({self.name})"

listeners = weakref.WeakSet()
l1 = Listener("A")
l2 = Listener("B")
listeners.add(l1)
listeners.add(l2)
print("Before deletion:", list(listeners))

# Remove one listener
l1 = None
gc.collect()
print("After deletion:", list(listeners))
Output:
Before deletion: [Listener(A), Listener(B)]
After deletion: [Listener(B)]
This pattern is often extended into a publisher–subscriber model:
import gc
import weakref

class Publisher:
    def __init__(self):
        self._subs = weakref.WeakSet()

    def subscribe(self, sub):
        self._subs.add(sub)

    def notify(self, payload):
        # Iterate over a snapshot so the set can shrink safely during delivery
        for s in list(self._subs):
            s.handle(payload)

class Subscriber:
    def __init__(self, name):
        self.name = name

    def handle(self, payload):
        print(self.name, "got:", payload)

pub = Publisher()
sub = Subscriber("one")
pub.subscribe(sub)
pub.notify({"event": 1})  # delivered
sub = None  # drop last strong ref
gc.collect()
pub.notify({"event": 2})  # nothing printed; WeakSet cleaned itself
Using WeakSet here avoids leaks and simplifies lifecycle management. A caveat is that only weak‑referenceable objects, such as instances of user‑defined classes, can be added; instances of built‑ins like int or tuple won’t work. If your class uses __slots__, include '__weakref__' in __slots__ to allow weak references.
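The caveat can be checked directly. In this sketch, can_weakref is a hypothetical helper that reports whether weakref.ref accepts an object, and the two slotted classes are made up for the demonstration.

```python
import weakref


class SlotsOnly:
    __slots__ = ("value",)                    # no __weakref__ slot


class SlotsWithWeakref:
    __slots__ = ("value", "__weakref__")      # opts back in to weak references


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


results = {
    "int": can_weakref(42),
    "tuple": can_weakref((1, 2)),
    "slots without __weakref__": can_weakref(SlotsOnly()),
    "slots with __weakref__": can_weakref(SlotsWithWeakref()),
}
for label, ok in results.items():
    print(f"{label}: {ok}")
# int: False
# tuple: False
# slots without __weakref__: False
# slots with __weakref__: True
```

Adding a non-weak-referenceable object to a WeakSet fails with the same TypeError, since the set has to create a weak reference to each element.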
Callbacks on Weak References
Another useful feature of weakref.ref is the ability to attach a callback. A callback is a function that gets invoked automatically when the referent object is about to be finalized. This can be handy if you want to clean up auxiliary data structures or release resources when an object goes away.
import gc
import weakref

class Resource:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Resource({self.name})"

def on_finalize(wr):
    print("Resource has been garbage collected:", wr)

obj = Resource("temp")
wr = weakref.ref(obj, on_finalize)
print("Created weak reference:", wr)

# Drop strong reference
obj = None

# Force GC for demo purposes
gc.collect()
Output:
Created weak reference: <weakref at 0x75f6773870b0; to 'Resource' at 0x75f677c4ee40>
Resource has been garbage collected: <weakref at 0x75f6773870b0; dead>
Here, the on_finalize callback is called once the Resource instance is about to be collected. The weak reference itself becomes dead afterwards. This pattern is useful when you want to implement custom cleanup logic tied to an object’s lifecycle.
It’s also worth noting that containers like WeakValueDictionary and WeakSet use this same mechanism internally: they attach callbacks to their weak references so that entries are automatically removed when the referent objects are finalized.
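To illustrate the idea, here is a deliberately small registry that removes its own entries through a weakref callback. It is a sketch of the technique only, not the standard library’s implementation; TinyWeakRegistry, register, and lookup are hypothetical names.

```python
import gc
import weakref


class TinyWeakRegistry:
    """Maps keys to weakly held objects and forgets entries when the objects die."""

    def __init__(self):
        self._refs = {}  # key -> weakref.ref

    def register(self, key, obj):
        registry_ref = weakref.ref(self)  # so the callback does not keep the registry alive

        def remove(dead_ref):
            registry = registry_ref()
            # Only drop the entry if it still belongs to the reference that died
            if registry is not None and registry._refs.get(key) is dead_ref:
                del registry._refs[key]

        self._refs[key] = weakref.ref(obj, remove)

    def lookup(self, key):
        ref = self._refs.get(key)
        return ref() if ref is not None else None

    def __len__(self):
        return len(self._refs)


class Resource:
    def __init__(self, name):
        self.name = name


registry = TinyWeakRegistry()
res = Resource("temp")
registry.register("temp", res)
print(len(registry), registry.lookup("temp") is res)   # 1 True

res = None        # drop the last strong reference
gc.collect()      # not needed on CPython, harmless elsewhere
print(len(registry), registry.lookup("temp"))          # 0 None
```

The callback receives the dead weak reference as its only argument, which is why the entry is compared by identity before it is deleted.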
Conclusion
Weak references are not a tool you’ll reach for every day, but when you need them they solve very real problems. At the lowest level, weakref.ref lets you point to an object without affecting its lifetime, and you can even attach a callback to run cleanup code at the moment it is collected. Building on that primitive, Python’s WeakValueDictionary and WeakSet give you higher-level containers for caches, registries, and subscriber lists that automatically clean themselves up when their contents go away.
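To summarize the differences:
weakref.ref: a single weak reference to one object, with an optional callback that runs when the object is collected.
WeakValueDictionary: a dictionary whose values are held weakly, suited to caches and registries.
WeakSet: a set whose elements are held weakly, suited to subscriber and listener lists.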
Together, these features make it possible to build memory‑friendly systems that avoid leaks, reduce bookkeeping, and respect the natural lifetimes of your objects. Understanding weak references and knowing when to apply them will help you write code that is both safer and more efficient.
Further Reading
Python documentation on weakref
“Fluent Python” by Luciano Ramalho, which includes in-depth coverage of weak references and how to use them