Python Performance: Why 'if not list' is 2x Faster Than Using len()

Discover why 'if not mylist' is twice as fast as 'len(mylist) == 0' by examining CPython's VM instructions and object memory access patterns.

Mar 12, 2025

In Python, you can check a list for emptiness in two ways:

if not mylist
if len(mylist) == 0

While the second approach is not wrong, the first is considered more Pythonic. Many people don’t agree, but I’ve already put forward my points in a previous article on that debate.

Apart from being more Pythonic, the first version is also about 2x faster than the other. See for yourself:

Microbenchmark comparing performance of the two ways of doing emptiness checks in Python

That's an almost 2x performance difference for something as fundamental as checking if a collection is empty. This operation can run millions of times in a non-trivial Python application.
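If you want to reproduce the measurement yourself, here is a minimal sketch using the standard timeit module. The helper name best_time and the iteration counts are my own choices for illustration, and the absolute numbers (and the exact ratio) will vary with your CPython version and hardware.

```python
import timeit

def best_time(stmt, number=200_000, repeat=5):
    # The setup code runs inside timeit's generated function, so mylist
    # is a local variable there, just like in the examples in this article.
    runs = timeit.repeat(stmt, setup="mylist = []", number=number, repeat=repeat)
    # The minimum of several runs is the least noisy estimate.
    return min(runs)

truth_time = best_time("if not mylist: pass")
len_time = best_time("if len(mylist) == 0: pass")

print(f"if not mylist:        {truth_time:.4f} s")
print(f"if len(mylist) == 0:  {len_time:.4f} s")
print(f"ratio (len / truth):  {len_time / truth_time:.2f}")
```

Run it a few times before drawing conclusions, since microbenchmarks like this are sensitive to whatever else the machine is doing.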

The disparity becomes even more intriguing when you consider that both approaches must ultimately determine the same thing: does the collection contain any elements? What causes this surprisingly large gap? The answer lies in CPython's implementation details:

How objects are organized in memory
How the virtual machine processes bytecode instructions
The hidden costs of seemingly simple operations, such as calling the built-in len().

In this article, we'll follow both execution paths instruction by instruction, uncovering why a small syntactic difference translates to double the execution time.

TL;DR

If you’re short on time, here are the key insights covered in the article:

In Python, checking if a collection is empty can be done with either:
- if not mylist
- or if len(mylist) == 0
The first approach is consistently 2x faster.
This performance difference exists because “if not mylist” requires only 2 VM instructions (LOAD_FAST, TO_BOOL), while “if len(mylist) == 0” needs 5 instructions (LOAD_GLOBAL, LOAD_FAST, CALL, LOAD_CONST, COMPARE_OP).
Both the TO_BOOL instruction and the len() built-in function check the size of the list through five layers of pointer indirection.
However, the VM optimizes TO_BOOL to TO_BOOL_LIST, a specialized instruction for evaluating the truth value of list objects, which can do the same thing in a single memory access.
Apart from five levels of pointer dereferences, using len() also has the overhead of function calls, which adds up.
For hot loops or performance-critical code, the idiomatic approach (if not mylist) is more Pythonic and measurably more efficient.

Summary of the key factors which result in the stark performance difference in doing emptiness check using truth value testing vs len()

Article Structure

To understand why this performance difference exists, we will need to cover the following internal details of the CPython implementation.

CPython’s internal object model
Bytecode execution in the CPython VM
Internal optimizations of the VM such as instruction specialization

We will start by understanding the definition and layout of the list object in CPython. This information is vital for the rest of the article because ultimately both approaches need to check the size of the list, and that information is stored in the object.

After that we will dive into the internals of both methods of doing emptiness checks and understand which parts are expensive.

Finally we will summarize by doing a side-by-side comparison of the two approaches.

Understanding the Python List Object Layout

First of all, let’s understand the definition and in-memory layout of the list object in CPython. It will give us insight about how the CPython runtime tracks the size of sequence type objects.

Different object types in CPython need different fields depending on their requirements. For example, a float object needs a double type field to store its value, while a list type needs to have a dynamic array and its size to store the list data.

But, there are certain common fields that all object types need to have for the CPython runtime to manage them. These include the object reference count, the type information and the size of the object (only needed for variable-sized types, such as lists and tuples).

To make it easier for the CPython runtime to access these common fields in objects of different types, CPython implements an object hierarchy (AKA inheritance model). This ensures that even without knowing the concrete type of the object, the runtime can access these common fields whenever it needs.

The two common fields that every CPython type needs to have are its reference count (ob_refcnt) and the type (ob_type) related information. These are kept in a struct called PyObject which is embedded as the first field in every type definition, making it the object header. As a result of this, any CPython object can be treated as an instance of type PyObject and its ob_refcnt and ob_type fields can be accessed.

Definition of the PyObject struct in CPython from the file Include/object.h

Apart from these fields, if the CPython type is a sequence type, it also needs to provide its current size information to the runtime. Because this size information is not needed for other types, the sequence type objects have a special object header definition called PyVarObject. It extends PyObject and adds the size field.

Definition of the PyVarObject struct which forms the object header for sequence type objects in CPython

A sequence type object has an instance of PyVarObject as its first field so that it inherits those header fields and then the type can define its own fields after that. For example, the definition of the list type in CPython is shown below.

Definition of the list object in CPython
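The nesting of these structs is visible from Python itself through the __basicsize__ attribute of each type. The sketch below is only a sanity check under a few assumptions of mine: that object.__basicsize__ equals the size of the PyObject header, that a float adds one C double, and that a list adds ob_size plus a pointer to its item array and an allocated-capacity counter (my reading of CPython's list struct, which is not spelled out in the text above).

```python
import ctypes

POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)
SSIZE_SIZE = ctypes.sizeof(ctypes.c_ssize_t)
DOUBLE_SIZE = ctypes.sizeof(ctypes.c_double)

# Size of the common header (PyObject) on this interpreter build.
pyobject_size = object.__basicsize__
# PyVarObject is the PyObject header followed by the ob_size field.
pyvarobject_size = pyobject_size + SSIZE_SIZE

# Expected sizes if the layout is what the assumptions above describe.
expected_float_size = pyobject_size + DOUBLE_SIZE
expected_list_size = pyvarobject_size + POINTER_SIZE + SSIZE_SIZE

print("PyObject header:", pyobject_size, "bytes")
print("float:", float.__basicsize__, "expected", expected_float_size)
print("list: ", list.__basicsize__, "expected", expected_list_size)
```

If the printed pairs match on your build, the list really is a PyVarObject header followed by its own fields.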

The following diagram summarizes this whole object model:

The object hierarchy that the CPython type system uses to implement the types

As you can see, the list object contains its size information in the ob_size field (inherited from PyVarObject). It means that the runtime can always simply look up this field to know if the list is empty.
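To make the header fields tangible, here is a small sketch that reads ob_type and ob_size straight out of a list's memory with ctypes. It relies on CPython-specific assumptions: id() returns the object's address, ob_type is the last field of the PyObject header, and ob_size follows immediately after it. The helper names read_ob_type and read_ob_size are mine. Treat it as an exploration tool only, never as something to use in real code.

```python
import ctypes

# sizeof(PyObject) on this build; ob_size sits right after it.
HEADER_SIZE = object.__basicsize__
POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)

def read_ob_type(obj):
    # ob_type is assumed to be the last pointer-sized field of the header.
    address = id(obj) + HEADER_SIZE - POINTER_SIZE
    return ctypes.c_void_p.from_address(address).value

def read_ob_size(obj):
    # ob_size is assumed to be the first field after the PyObject header.
    return ctypes.c_ssize_t.from_address(id(obj) + HEADER_SIZE).value

mylist = [10, 20, 30]
print("ob_size:", read_ob_size(mylist), "len():", len(mylist))
print("ob_type points at list:", read_ob_type(mylist) == id(list))

mylist.clear()
print("ob_size after clear():", read_ob_size(mylist))
```

The emptiness check the runtime ultimately needs is just "is this one integer field zero?", which is what makes the rest of the story interesting.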

With this picture of how the sequence type objects are defined in CPython and how the runtime tracks their size, we are ready to discuss why there is such a stark difference in the performance of the two methods for doing emptiness checks.

Check out my article on PyObject or videos on CPython type system internals for more details on this topic.

The Fast Path: How Python Evaluates '`not mylist`'

We will start right from the bytecode instructions to see exactly how the CPython virtual machine (VM) evaluates this code. The following figure shows the bytecode instructions when we use the “not” operator on an object.

The bytecode instructions generated for doing emptiness check using truth value testing

The CPython VM is a stack-based virtual machine. It uses the stack to store data and operands for its instructions. Each instruction pops values from the stack to get its arguments and pushes its result back onto the stack. If you want a detailed walkthrough of how the CPython VM works, read my article on the internals of the CPython VM.

Let’s break down the above bytecode to understand what is going on:

LOAD_FAST: First, the list object on which the not operator needs to be applied has to be pushed onto the stack. This is done via the LOAD_FAST instruction. It finds the mylist object in the local symbol table. This instruction is pretty fast because it looks up an array of local symbols using the index given to it as argument (0 in this case, as you can see in the 3rd column).
TO_BOOL: The Python VM needs to evaluate the truth value of an object when it is used inside an if or while condition. For this the CPython compiler emits the TO_BOOL instruction which evaluates the truth value of the object at the top of the stack (mylist here). For sequence type objects, such as lists, their truth value is False if they are empty.
UNARY_NOT: The UNARY_NOT instruction takes the top value on the stack, applies the Boolean not operator on it and stores the result back as the top value on the stack. In this code, this becomes the result we are looking for - whether the list is empty or not.
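You can inspect these instructions on your own interpreter with the dis module. This is a minimal sketch; the function name is_empty is mine, and the exact opcode names depend on the CPython version (as far as I know, TO_BOOL only shows up from 3.13 onwards, and newer releases may print a variant of LOAD_FAST).

```python
import dis

def is_empty(mylist):
    # mylist is a function parameter, so it is loaded as a local.
    return not mylist

# Collect just the instruction names, in execution order.
opnames = [ins.opname for ins in dis.get_instructions(is_empty)]
print(opnames)

# The full listing, including the argument column mentioned above.
dis.dis(is_empty)
```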

In all of this, the TO_BOOL instruction is where the actual size of the list object is checked to decide its truth value. But it is not as simple as that. The CPython VM does not know that the given object is a list type, so it has to evaluate the truth value in a type agnostic way. Let’s see exactly what goes on when executing the TO_BOOL instruction.

The `TO_BOOL` Instruction Implementation

The following figure shows the implementation of the TO_BOOL instruction in the CPython VM:

Let’s understand what is happening in this code at a broader level. I have highlighted the four important parts:

This part is an optimization that the CPython interpreter implements called instruction specialization. It optimizes the generic TO_BOOL instruction into TO_BOOL_LIST (specialized for list type object) after the first execution. I explain this optimization in a later section down below.
If instruction specialization is not enabled, then the rest of the code executes. Here, the VM is calling the function PyObject_IsTrue() to evaluate the truth value of the object currently at the top of the stack.
As we will see shortly, PyObject_IsTrue() returns a value > 0 if the truth value of the object passed to it was true, else it returns 0. CPython translates it into its internal representations of True and False values.
The resulting truth value is placed at the top of the stack.

As we can see, the actual truth value of the object is evaluated in the PyObject_IsTrue() function, so let’s look inside it.

Inside the `PyObject_IsTrue` Function

The following figure shows the implementation of the PyObject_IsTrue function which evaluates the truth value of a given object passed to it as parameter.

The definition of the PyObject_IsTrue() function which is called by the TO_BOOL instruction to evaluate the truth value of the given object

As I said, the TO_BOOL instruction doesn’t know about the exact type of the object it is working with, therefore this function has to check the type of the object, and based on the type it has to decide how to evaluate the truth value.

The Python type system has a few broad type classes, such as numbers, sequences, mappings and each of them implements a different kind of behavior. This type information is provided to the runtime by the object via the ob_type field in its header. ob_type is a pointer to an object of type PyTypeObject and the following figure shows its definition:

Definition of the PyTypeObject struct along with the definition of some of the structs embedded in it via pointers, such as PyNumberMethods and PySequenceMethods

PyTypeObject contains fields indicating the object's type category. Number types set tp_as_number to point to a PyNumberMethods struct with numeric operator functions, while sequence types set tp_as_sequence to point to a PySequenceMethods struct with functions for operations like indexing, slicing, and length checking.

So, if we get back to the definition of the PyObject_IsTrue() function, it should make more sense now. You can see that the function checks the tp_as_number, tp_as_mapping and tp_as_sequence fields of the object’s type (reached through the ob_type pointer in its header) to decide how to evaluate the truth value.
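The decision order described here can be modeled in a few lines of Python. This is a simplified sketch of my own, not the C code: it assumes that the number slot corresponds to the __bool__ method and that the mapping and sequence length slots both surface as __len__ at the Python level. The function name is_true is hypothetical.

```python
def is_true(obj):
    # The singletons are handled before any type inspection.
    if obj is True:
        return True
    if obj is False or obj is None:
        return False

    obj_type = type(obj)

    # Number-like types: modeled here as a __bool__ method on the type.
    bool_method = getattr(obj_type, "__bool__", None)
    if bool_method is not None:
        return bool_method(obj)

    # Mappings and sequences: modeled here as a __len__ method on the type.
    len_method = getattr(obj_type, "__len__", None)
    if len_method is not None:
        return len_method(obj) > 0

    # Anything else is considered true.
    return True

for value in ([], [0], 0, 1.5, "", {}, None, object()):
    print(repr(value), is_true(value), bool(value))
```

Note that every lookup goes through type(obj) rather than the object itself, which mirrors how the C function goes through ob_type.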

Sequence type objects implement a function to return their length and the sq_length function pointer in their tp_as_sequence struct instance points to this function. This is exactly how PyObject_IsTrue() is checking the length of the list. The following figure shows list object’s implementation of the length function called list_length:

The list object implements the list_length function to return its length. This function is called by the TO_BOOL instruction when evaluating the truth value of the list. But it is also called by the len() function to compute the list’s length

list_length returns the length of the list by looking up the ob_size field in the list object’s header.

Remember the object hierarchy we discussed in the first section of the article? Every list object inherits the fields of PyVarObject, and PyVarObject contains the ob_size field which holds the size of the list.

This is how the TO_BOOL instruction finds out the truth value of a list object. Even though it is doing a very simple thing, it is not particularly cheap because it has to go through multiple layers of pointer indirections.

However, on recent CPython releases (the TO_BOOL instruction and its specialized variants arrived in 3.13), this isn’t a concern anymore because the instruction gets specialized to a type-optimized version, such as TO_BOOL_LIST for lists. Let’s understand how this optimization happens and how the TO_BOOL_LIST instruction works.

Specialized Instruction: `TO_BOOL` to `TO_BOOL_LIST`

The TO_BOOL instruction has to go through a twisted maze of pointers to do a very simple thing. The size info is stored right in the object header in its ob_size field, but TO_BOOL goes through five layers of pointer indirections to get to it. The following figure shows this maze of pointer chases. Interestingly, you will notice that the pointer chase starts from the list object and ends right back at its ob_size field.

The number of indirections that the TO_BOOL instruction has to go through to find the length of the list in order to decide the list’s truth value. In total, there are five pointer dereferences involved. Interestingly, you can see that eventually the pointer dereferencing comes back to the same place from where it started.

Pointer chases are expensive because each pointer dereference is a memory access and the deeper you go, the more likely you are to encounter a cache miss, making everything slower. A single L1 cache access costs about 4 cycles but a cache miss resulting in main memory access can cost 100-200 cycles.

But CPython has an optimization called instruction specialization, introduced in the 3.11 release with the specializing adaptive interpreter (PEP 659), which converts these slower versions of frequently executed instructions into faster specialized versions based on the types of the arguments.

For example, when the CPython VM executes the code “if not mylist” for the first time, it doesn’t know if mylist is a list type. But after the first invocation, it knows the type of the object and it replaces the TO_BOOL instruction with TO_BOOL_LIST, which is optimized for list types.

The following figure shows the implementation of TO_BOOL_LIST:

Implementation of the TO_BOOL_LIST instruction which is optimized to find the truth value of list objects. Instead of doing 5 levels of pointer dereferences, it directly looks up the ob_size field of the list object.

The important thing to note here is 3️⃣. Unlike the TO_BOOL instruction which went through several levels of pointer indirections, this instruction directly looks at the ob_size field in the list object’s header to figure out the length.

It reduces five levels of indirection to a single level of indirection with no extra function calls (PyList_GET_SIZE expands to an inline function). This implies that if you have an emptiness check in a hot path of your code, it will reduce down to a very optimized version requiring a single memory access.
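Specialization can be observed from Python by asking dis for the adaptive view of a function's bytecode after the function has been called a number of times. The sketch below assumes an interpreter where dis.get_instructions accepts the adaptive parameter (3.11 and later, as far as I know) and falls back to the plain view otherwise. The helper name adaptive_opnames and the warm-up count of 1000 calls are my own choices. On a build with TO_BOOL specialization I would expect TO_BOOL_LIST to appear in the second listing, but other versions will print different names.

```python
import dis

def is_empty(mylist):
    return not mylist

def adaptive_opnames(func):
    try:
        # adaptive=True shows specialized instructions where they exist.
        instructions = dis.get_instructions(func, adaptive=True)
    except TypeError:
        # Interpreters without the adaptive parameter: plain bytecode only.
        instructions = dis.get_instructions(func)
    return [ins.opname for ins in instructions]

before = adaptive_opnames(is_empty)

# Warm the function up with list arguments so the VM can specialize it.
for _ in range(1000):
    is_empty([])

after = adaptive_opnames(is_empty)

print("before warm-up:", before)
print("after warm-up: ", after)
```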

This concludes our exploration of the code path involving the truth value testing in CPython. Now, let’s look at what happens when you use the len() built-in to do the same emptiness check.

The Slow Path: How Python Processes `len()` Checks

Let’s start by looking at the bytecode instructions for using the len() builtin for doing emptiness check on a list:

The bytecode instructions for checking the emptiness of a list using len()

Here’s what happens:

LOAD_GLOBAL (len): The VM uses this instruction to find the len() built-in from the list of globals and built-ins and pushes it onto the stack. In older Python versions this used to be an expensive thing to do if the same function was called repeatedly in a hot loop. But thanks to instruction specialization (Python 3.11 release onwards), LOAD_GLOBAL gets optimized to a specialized instruction such as LOAD_GLOBAL_BUILTIN which is optimized for built-ins.
LOAD_FAST (mylist): Next, the list object itself has to be on the stack so that the VM can pass it as argument to len(). For this the VM uses the LOAD_FAST instruction which is very fast - it looks up the locals array using the provided index.
CALL: Now that the len() built-in and its argument are on the stack, the CALL instruction is used to call it. The return value of len() is pushed onto the stack.
LOAD_CONST and COMPARE_OP: Finally these two instructions are used to compare the result of len() with 0 to decide if the list is empty.
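The same dis-based inspection works for the len() variant. This is a minimal sketch with a function name of my own (is_empty_len). Opcode names differ between CPython versions, for example the instruction that loads the constant 0 is not always called LOAD_CONST, so compare the listing from your interpreter against the description above rather than expecting an exact match.

```python
import dis

def is_empty_len(mylist):
    # len is resolved as a global/built-in, mylist as a local.
    return len(mylist) == 0

opnames = [ins.opname for ins in dis.get_instructions(is_empty_len)]
print(opnames)

# Full listing with arguments, for a side-by-side look with "not mylist".
dis.dis(is_empty_len)
```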

When you compare these instructions with the instructions from the previous section (truth value testing), a couple of things stand out as expensive operations.

The CALL instruction results in a C function call to the implementation of len() which requires stack frame setup and a jump to the function. These become an overhead when done repeatedly in performance critical code, which is why compiled languages try to inline function calls as an optimization.
Additionally, the result of len() needs to be compared with 0 which is another couple of extra VM instructions that need to be executed every time an emptiness check is done. The truth value testing approach saves the VM these two instructions and a few CPU cycles which could be spent elsewhere.

Apart from these, the len() function itself has overhead in how it checks the length of a list object, which we have not looked at yet. Let’s look at the implementation of the len() built-in.

Implementation of The len() Built-in

The following figure shows the code for it from CPython from the file bltinmodule.c.

Definition of the len() built-in from CPython.

Again, as you can see, the len() function also needs to check the type of the object to decide how to find out its length. This is a theme that recurs in all dynamically typed languages and becomes problematic for their performance.

If you recall from our discussion on PyTypeObject: sequence types set the tp_as_sequence field which contains a function pointer called sq_length that returns the length of the object.

And, we have already seen the implementation of sq_length in lists during our discussion of the TO_BOOL instruction: it returns the size of the list object by looking up the ob_size field in its header.
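The fact that len() goes through the object's type, rather than the object itself, can be demonstrated from pure Python. In this small sketch (the class name Box is hypothetical), a __len__ attribute set on the instance is ignored by len(), because the length function is looked up on the type.

```python
class Box:
    def __len__(self):
        # Defined on the type, so this is what len() finds.
        return 3

box = Box()
# An instance attribute with the same name; len() does not consult it.
box.__len__ = lambda: 99

print(len(box))                 # 3, resolved through type(box)
print(type(box).__len__(box))   # 3, the explicit equivalent
print(box.__len__())            # 99, a plain attribute lookup on the instance
print(bool(box))                # True, truth testing also goes through the type
```

This is the Python-visible side of the type check the C code performs before it can call the length function.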

The Additional Cost: Comparing With Zero

Once the len() call returns to the VM, the VM compares the result with 0 to decide whether the list is empty or not. This is another couple of extra instructions that the VM needs to execute every time you do an emptiness check using len(), which adds to the overall cost of this approach.

Having examined both approaches in detail, we can now clearly see why the truth value method is approximately twice as fast. Let's summarize the key differences.

Performance Breakdown: The Complete Picture

When Doing “`if not mylist`”

When you use truth value testing to check a list for emptiness, e.g. “if not mylist”, the CPython compiler generates the TO_BOOL instruction to compute the truth value of the list.
While the TO_BOOL instruction itself requires going through five levels of pointer dereferencing, after one invocation, the VM optimizes it to TO_BOOL_LIST which can directly look at the size of the list in the object header in a single pointer dereference.
This means, if you are repeatedly checking if a list is empty or not, the CPython VM just needs to do a single memory access for the check.

When Doing “`if len(mylist) == 0`”

When using len(), the VM needs to find and load the built-in onto the stack. This requires searching the globals and built-ins hash tables, although this also gets optimized to a faster version where the VM directly searches the built-ins table.
After this the VM needs to actually execute the len() function. In hot paths, function calls can have overheads due to stack frame setup, register spills, and control flow jumps.
The len() function itself has to check the type of the object. For sequence types, it has to dereference the sq_length function pointer in the tp_as_sequence field of the object’s type. All of this is again five levels of pointer dereferencing to get to the function which will provide the list’s length.
Finally, the VM needs to compare the length of the list with 0 to decide if it is empty.

This stark difference in execution paths explains why the truth value approach is almost twice as fast:

A side-by-side comparison of the main performance overhead involved in using the two approaches for doing emptiness checks in Python
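To get a rough feel for where the time goes, you can time the pieces separately: the bare truth test, the len() call alone, and the len() call plus the comparison. This is only a sketch with my own naming (STATEMENTS, measure) and arbitrary iteration counts. The split is approximate, since timeit adds its own loop overhead to every statement.

```python
import timeit

# Each statement isolates one more step of the len()-based check.
STATEMENTS = [
    "not mylist",
    "len(mylist)",
    "len(mylist) == 0",
]

def measure(number=200_000, repeat=5):
    results = {}
    for stmt in STATEMENTS:
        # mylist is created in setup, so it is a local inside timeit's loop.
        runs = timeit.repeat(stmt, setup="mylist = []", number=number, repeat=repeat)
        results[stmt] = min(runs)
    return results

results = measure()
for stmt, seconds in results.items():
    print(f"{stmt:<18} {seconds:.4f} s")
```

The gap between the second and third rows approximates the cost of the comparison with 0, and the gap between the first and second rows approximates the cost of the global lookup plus the call.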

This analysis demonstrates why idiomatic Python often (but not always) performs better - what seems like a stylistic preference for using if not mylist is actually rooted in significant performance advantages at the VM implementation level.
