Python Performance: Why 'if not list' is 2x Faster Than Using len()
Discover why 'if not mylist' is twice as fast as 'len(mylist) == 0' by examining CPython's VM instructions and object memory access patterns.
In Python, you can check a list for emptiness in two ways:
if not mylist
if len(mylist) == 0
While the second approach is not wrong, the first is considered more Pythonic. Many people don’t agree, but I’ve already put forward my points in a previous article on that debate.
Apart from being more Pythonic, the first version is also ~2x faster than the second. See for yourself:
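A minimal way to rerun the comparison is the standard-library timeit module. The function name benchmark_emptiness_checks and the iteration count are illustrative choices, not part of the original measurement, and absolute numbers will vary with the machine and the CPython version.

```python
import timeit


def benchmark_emptiness_checks(number=1_000_000):
    """Time both emptiness checks on an empty list; returns seconds for each."""
    # timeit compiles the setup and the statement into one function,
    # so mylist is a local variable there (loaded with LOAD_FAST).
    setup = "mylist = []"
    truthy = timeit.timeit("not mylist", setup=setup, number=number)
    length = timeit.timeit("len(mylist) == 0", setup=setup, number=number)
    return truthy, length


if __name__ == "__main__":
    truthy, length = benchmark_emptiness_checks()
    print(f"not mylist       : {truthy:.4f} s")
    print(f"len(mylist) == 0 : {length:.4f} s")
    print(f"ratio            : {length / truthy:.2f}x")
```

Running it a few times and comparing the ratio, rather than the raw timings, gives the most stable picture.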
That's almost a 2x performance difference for something as fundamental as checking if a collection is empty. This check can run millions of times in a non-trivial Python application.
The disparity becomes even more intriguing when you consider that both approaches must ultimately determine the same thing: does the collection contain any elements? What causes this surprisingly large gap? The answer lies in CPython's implementation details:
How objects are organized in memory
How the virtual machine processes bytecode instructions
The hidden costs of seemingly simple operations, such as calling the built-in len().
In this article, we'll follow both execution paths instruction by instruction, uncovering why a small syntactic difference translates to double the execution time.
TL;DR
If you’re short on time, here are the key insights covered in the article:
In Python, checking if a collection is empty can be done with either if not mylist or if len(mylist) == 0. The first approach is consistently about 2x faster.
This performance difference exists because “if not mylist” requires only 2 VM instructions (LOAD_FAST, TO_BOOL), while “if len(mylist) == 0” needs 5 instructions (LOAD_GLOBAL, LOAD_FAST, CALL, LOAD_CONST, COMPARE_OP).
Both the TO_BOOL instruction and the len() built-in function check the size of the list through five layers of pointer indirection.
However, the VM optimizes TO_BOOL to TO_BOOL_LIST, a specialized instruction that evaluates the truth value of list objects and can do the same thing in a single memory access.
Apart from the five levels of pointer dereferences, using len() also has the overhead of a function call, which adds up.
For hot loops or performance-critical code, the idiomatic approach (if not mylist) is more Pythonic and measurably more efficient.

Article Structure
To understand why this performance difference exists, we need to cover the following internal details of the CPython implementation:
CPython’s internal object model
Bytecode execution in the CPython VM
Internal optimizations of the VM such as instruction specialization
We will start with the definition and layout of the list object in CPython. This information is vital for the rest of the article because both approaches ultimately need to check the size of the list, and that information is stored in the object.
After that, we will dive into the internals of both methods of checking for emptiness and identify the expensive parts.
Finally, we will summarize with a side-by-side comparison of the two approaches.
Understanding the Python List Object Layout
First of all, let’s understand the definition and in-memory layout of the list object in CPython. It will give us insight into how the CPython runtime tracks the size of sequence type objects.
Different object types in CPython need different fields depending on their requirements. For example, a float object needs a field of type double to store its value, while a list needs a dynamic array and its size to store the list data.
But there are certain common fields that all object types need so that the CPython runtime can manage them. These include the object's reference count, its type information and the size of the object (only needed for variable-sized types, such as lists and tuples).
To make it easier for the CPython runtime to access these common fields in objects of different types, CPython implements an object hierarchy (AKA inheritance model). This ensures that even without knowing the concrete type of the object, the runtime can access these common fields whenever it needs.
The two common fields that every CPython type needs are its reference count (ob_refcnt) and its type information (ob_type). These are kept in a struct called PyObject, which is embedded as the first field in every type definition, making it the object header. As a result, any CPython object can be treated as an instance of PyObject, and its ob_refcnt and ob_type fields can be accessed.
Apart from these fields, a sequence type also needs to provide its current size to the runtime. Because this size information is not needed for other types, sequence type objects have a special object header definition called PyVarObject. It extends PyObject and adds the size field.

A sequence type object has an instance of PyVarObject as its first field so that it inherits those header fields; the type can then define its own fields after it. For example, the definition of the list type in CPython is shown below.
The following diagram summarizes this whole object model:
As you can see, the list object contains its size information in the ob_size field (inherited from PyVarObject). It means that the runtime can always simply look up this field to know if the list is empty.
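To make the header layout concrete, here is a rough sketch that mirrors PyObject and PyVarObject with ctypes and reads ob_size directly from a live list. It assumes a default (non-free-threaded) CPython build, where the header is just a reference count followed by a type pointer, and it relies on the CPython detail that id() returns the object's address. The helper name peek_ob_size is made up for this illustration.

```python
import ctypes


class PyObject(ctypes.Structure):
    # Common object header: reference count, then a pointer to the type object.
    # Assumption: default CPython build (the free-threaded build differs).
    _fields_ = [
        ("ob_refcnt", ctypes.c_ssize_t),
        ("ob_type", ctypes.c_void_p),
    ]


class PyVarObject(ctypes.Structure):
    # Header for variable-sized objects: PyObject plus the item count.
    _fields_ = [
        ("ob_base", PyObject),
        ("ob_size", ctypes.c_ssize_t),
    ]


def peek_ob_size(obj):
    """Read ob_size straight from the object's header (CPython only)."""
    # In CPython, id() returns the memory address of the object.
    return PyVarObject.from_address(id(obj)).ob_size


if __name__ == "__main__":
    print(peek_ob_size([]))
    print(peek_ob_size([10, 20, 30]))
```

This is only for exploration: reading object memory this way is not something to do in real code, but it shows that the size sits at a fixed offset right after the common header.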
With this picture of how the sequence type objects are defined in CPython and how the runtime tracks their size, we are ready to discuss why there is such a stark difference in the performance of the two methods for doing emptiness checks.
Check out my article on PyObject or videos on CPython type system internals for more details on this topic.
The Fast Path: How Python Evaluates 'not mylist'
We will start right from the bytecode instructions to see exactly how the CPython virtual machine (VM) evaluates this code. The following figure shows the bytecode instructions when we use the “not” operator on an object.
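You can also print this bytecode yourself with the standard dis module. The sketch below uses an illustrative function name, is_empty_truthy, and a small helper, opnames, that are not from the original listing. Exact opcode names depend on the CPython version: TO_BOOL only appears from 3.13 onwards, and older versions fold the truth test into UNARY_NOT.

```python
import dis


def is_empty_truthy(mylist):
    # mylist is a function argument, so it is loaded as a local variable.
    return not mylist


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    # Human-readable listing with offsets and arguments.
    dis.dis(is_empty_truthy)
    # Just the instruction names.
    print(opnames(is_empty_truthy))
```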
The CPython VM is a stack-based virtual machine. It uses the stack to store data and operands for its instructions. Each instruction pops values from the stack to get its arguments and pushes its result back onto the stack. If you want a detailed walkthrough of how the CPython VM works, read my article on the internals of the CPython VM.
Let’s break down the above bytecode to understand what is going on:
LOAD_FAST: First, the list object on which the not operator needs to be applied has to be pushed onto the stack. This is done via the LOAD_FAST instruction. It finds the mylist object in the local symbol table. This instruction is pretty fast because it looks up an array of local symbols using the index given to it as argument (0 in this case, as you can see in the 3rd column).
TO_BOOL: The Python VM needs to evaluate the truth value of an object when it is used inside an if or while condition. For this, the CPython compiler emits the TO_BOOL instruction, which evaluates the truth value of the object at the top of the stack (mylist here). For sequence type objects, such as lists, the truth value is False if they are empty.
UNARY_NOT: The UNARY_NOT instruction takes the top value on the stack, applies the Boolean not operator to it and stores the result back as the top value on the stack. In this code, this becomes the result we are looking for: whether the list is empty or not.
In all of this, the TO_BOOL instruction is where the actual size of the list object is checked to decide its truth value. But it is not as simple as that. The CPython VM does not know that the given object is a list, so it has to evaluate the truth value in a type-agnostic way. Let’s see exactly what goes on when executing the TO_BOOL instruction.
The TO_BOOL Instruction Implementation
The following figure shows the implementation of the TO_BOOL instruction in the CPython VM:
Let’s understand what is happening in this code at a broader level. I have highlighted the four important parts:
This part is an optimization that the CPython interpreter implements called instruction specialization. It optimizes the generic TO_BOOL instruction into TO_BOOL_LIST (specialized for list objects) after the first execution. I explain this optimization in a later section.
If instruction specialization is not enabled, then the rest of the code executes. Here, the VM calls the function PyObject_IsTrue() to evaluate the truth value of the object currently at the top of the stack.
As we will see shortly, PyObject_IsTrue() returns a value > 0 if the truth value of the object passed to it is true, else it returns 0. CPython translates it into its internal representations of the True and False values.
The resulting truth value is placed at the top of the stack.
As we can see, the actual truth value of the object is evaluated in the PyObject_IsTrue() function, so let’s look inside it.
Inside the PyObject_IsTrue Function
The following figure shows the implementation of the PyObject_IsTrue function, which evaluates the truth value of the object passed to it as a parameter.

As I said, the TO_BOOL instruction doesn’t know the exact type of the object it is working with. Therefore this function has to check the type of the object and, based on the type, decide how to evaluate the truth value.
The Python type system has a few broad type classes, such as numbers, sequences, and mappings, and each of them implements a different kind of behavior. This type information is provided to the runtime by the object via the ob_type field in its header. ob_type is a pointer to an object of type PyTypeObject, and the following figure shows its definition:

PyTypeObject contains fields indicating the object's type category. Number types set tp_as_number to point to a PyNumberMethods struct with numeric operator functions, while sequence types set tp_as_sequence to point to a PySequenceMethods struct with functions for operations like indexing, slicing, and length checking.
So, if we get back to the definition of the PyObject_IsTrue() function, it should make more sense now. You can see that the function checks the tp_as_number, tp_as_mapping, and tp_as_sequence fields of the object’s type (reached through ob_type in its header) to decide how to evaluate it.
Sequence type objects implement a function to return their length, and the sq_length function pointer in their tp_as_sequence struct instance points to this function. This is exactly how PyObject_IsTrue() checks the length of the list. The following figure shows the list object’s implementation of the length function, called list_length:

list_length returns the length of the list by looking up the ob_size field in the list object’s header.
Remember the object hierarchy we discussed in the first section of the article? Every list object inherits the fields of PyVarObject, and PyVarObject contains the ob_size field which holds the size of the list.
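The decision order described above can be modeled in plain Python. This is only an approximation of the C logic, not CPython's actual code: the name is_true is made up, __bool__ stands in for the numeric slot, and __len__ stands in for both the mapping and sequence length slots.

```python
def is_true(obj):
    """Rough Python model of how the truth value of an object is decided."""
    # Fast exits for the singletons.
    if obj is True:
        return True
    if obj is False or obj is None:
        return False

    obj_type = type(obj)

    # Number-like types: ask the type's __bool__ (models the numeric slot).
    bool_method = getattr(obj_type, "__bool__", None)
    if bool_method is not None:
        return bool_method(obj)

    # Mappings and sequences: non-zero length means true (models the length slots).
    len_method = getattr(obj_type, "__len__", None)
    if len_method is not None:
        return len_method(obj) > 0

    # Anything else is considered true.
    return True


if __name__ == "__main__":
    for value in ([], [1], 0, 42, "", {}, object(), None):
        print(repr(value), is_true(value), bool(value))
```

For a list there is no __bool__, so the model falls through to the length check, which is the path this section follows.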
This is how the TO_BOOL instruction finds out the truth value of a list object. Even though it is doing a very simple thing, it is not particularly cheap because it has to go through multiple layers of pointer indirection.
However, if you are using CPython 3.13 or later, this isn’t a concern because the instruction gets specialized to a type-optimized version, such as TO_BOOL_LIST for lists. Let’s understand how this optimization happens and how the TO_BOOL_LIST instruction works.
Specialized Instruction: TO_BOOL to TO_BOOL_LIST
The TO_BOOL instruction has to go through a twisted maze of pointers to do a very simple thing. The size info is stored right in the object header in its ob_size field, but TO_BOOL goes through five layers of pointer indirection to get to it. The following figure shows this maze of pointer chases. Interestingly, you will notice that the pointer chase starts from the list object and ends right back at its ob_size field.

Pointer chases are expensive because each pointer dereference is a memory access and the deeper you go, the more likely you are to encounter a cache miss, making everything slower. A single L1 cache access costs about 4 cycles but a cache miss resulting in main memory access can cost 100-200 cycles.
But since the 3.11 release, CPython has an optimization called instruction specialization, which converts slower generic versions of frequently executed instructions into faster specialized versions based on the types of their arguments.
For example, when the CPython VM executes the code “if not mylist” for the first time, it doesn’t know if mylist is a list. But after the first invocation, it knows the type of the object and replaces the TO_BOOL instruction with TO_BOOL_LIST, which is optimized for lists.
The following figure shows the implementation of TO_BOOL_LIST:

The important thing to note here is 3️⃣. Unlike the TO_BOOL instruction, which went through several levels of pointer indirection, this instruction directly looks at the ob_size field in the list object’s header to figure out the length.
It reduces five levels of indirection to a single level with no extra function calls (PyList_GET_SIZE expands to an inline function). This implies that if you have an emptiness check in a hot path of your code, it will reduce to a very optimized version requiring a single memory access.
This concludes our exploration of the code path involving truth value testing in CPython. Now, let’s look at what happens when you use the len() built-in to do the same emptiness check.
The Slow Path: How Python Processes len() Checks
Let’s start by looking at the bytecode instructions for using the len() built-in to check a list for emptiness:
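As with the truth-value version, the dis module can print this listing for you. The function name is_empty_len and the opnames helper are illustrative, and opcode names vary between CPython versions (for example, the call instruction has been renamed over time).

```python
import dis


def is_empty_len(mylist):
    # len is looked up as a global/built-in; mylist is a local variable.
    return len(mylist) == 0


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    dis.dis(is_empty_len)
    print(opnames(is_empty_len))
```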
Here’s what happens:
LOAD_GLOBAL (len): The VM uses this instruction to find the len() built-in among the globals and built-ins and push it onto the stack. In older Python versions this used to be expensive if the same function was called repeatedly in a hot loop. But thanks to instruction specialization (Python 3.11 onwards), LOAD_GLOBAL gets optimized to a specialized instruction such as LOAD_GLOBAL_BUILTIN, which is optimized for built-ins.
LOAD_FAST (mylist): Next, the list object itself has to be on the stack so that the VM can pass it as an argument to len(). For this the VM uses the LOAD_FAST instruction, which is very fast: it looks up the locals array using the provided index.
CALL: Now that the len() built-in and its argument are on the stack, the CALL instruction is used to call it. The return value of len() is pushed onto the stack.
LOAD_CONST and COMPARE_OP: Finally, these two instructions are used to compare the result of len() with 0 to decide if the list is empty.
When you compare these instructions with the ones from the previous section (truth value testing), a couple of things stand out as expensive operations.
The CALL instruction results in a C function call to the implementation of len(), which requires stack frame setup and a jump to the function. These become an overhead when done repeatedly in performance-critical code, which is why compiled languages try to inline function calls as an optimization.
Additionally, the result of len() needs to be compared with 0, which means another couple of VM instructions that need to be executed every time an emptiness check is done. The truth value testing approach saves the VM these two instructions and a few CPU cycles which could be spent elsewhere.
Apart from these, the len() function itself has the overhead of how it checks the length of a list object, which we have not looked at yet. Let’s look at the implementation of the len() built-in.
Implementation of The len() Built-in
The following figure shows its code from the CPython file bltinmodule.c.
Again, as you can see, the len() function also needs to check the type of the object to decide how to find out its length. This is a recurring theme in dynamically typed languages and becomes problematic for their performance.
If you recall from our discussion on PyTypeObject: sequence types set the tp_as_sequence field, which contains a function pointer called sq_length that returns the length of the object.
And we have already seen the implementation of sq_length for lists during our discussion of the TO_BOOL instruction: it returns the size of the list object by looking up the ob_size field in its header.
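The fact that both paths end up at the same length function can be observed from Python with a small probe class. LenProbe and count_len_calls are hypothetical names for this illustration. The class defines __len__ but no __bool__, so truth testing has to fall back to the length, just as it does for a list in the generic path.

```python
class LenProbe:
    """Container with no __bool__, so truth testing falls back to __len__."""

    def __init__(self, items):
        self.items = list(items)
        self.len_calls = 0

    def __len__(self):
        # Count how often the length function is consulted.
        self.len_calls += 1
        return len(self.items)


def count_len_calls():
    probe = LenProbe([])
    via_truth = not probe            # truth-value test
    calls_after_truth = probe.len_calls
    via_len = len(probe) == 0        # explicit length check
    calls_after_len = probe.len_calls
    return via_truth, via_len, calls_after_truth, calls_after_len


if __name__ == "__main__":
    print(count_len_calls())
```

Each check consults __len__ exactly once, so the two approaches differ in the work done around the length lookup rather than in the lookup itself.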
The Additional Cost: Comparing With Zero
Once the len() call returns to the VM, the VM compares the result with 0 to decide whether the list is empty or not. This is another couple of operations that the VM needs to do every time you do an emptiness check using len(), which adds to the overall cost of this approach.
Having examined both approaches in detail, we can now clearly see why the truth value method is approximately twice as fast. Let's summarize the key differences.
Performance Breakdown: The Complete Picture
When Doing “if not mylist”
When you use truth value testing to check a list for emptiness, e.g. “if not mylist”, the CPython compiler generates the TO_BOOL instruction to compute the truth value of the list.
While the TO_BOOL instruction itself requires going through five levels of pointer dereferencing, after one invocation the VM optimizes it to TO_BOOL_LIST, which can directly look at the size of the list in the object header in a single pointer dereference.
This means that if you are repeatedly checking whether a list is empty, the CPython VM just needs to do a single memory access for the check.
When Doing “if len(mylist) == 0”
When using len(), the VM needs to find and load the built-in onto the stack. This requires searching the globals and built-ins hash tables, although this also gets optimized to a faster version where the VM directly searches the built-ins table.
After this, the VM needs to actually execute the len() function. In hot paths, function calls can have overheads due to stack frame setup, register spills, and control flow jumps.
The len() function itself has to check the type of the object. For sequence types, it has to dereference the sq_length function pointer in the tp_as_sequence field of the object’s type. All of this is again five levels of pointer dereferencing to get to the function which will provide the list’s length.
Finally, the VM needs to compare the length of the list with 0 to decide if it is empty.
This stark difference in execution paths explains why the truth value approach is almost twice as fast:

This analysis demonstrates why idiomatic Python often (but not always) performs better: what seems like a stylistic preference for using if not mylist is actually rooted in significant performance advantages at the VM implementation level.