Live Session: CPython and ELF Essentials for Building a Basic Remote Profiler

Learn some CPython internals, ELF file format and loading, and how remote profilers work

May 11, 2024

In the previous article we covered everything about the internals of the CPython runtime — its key data structures and how they are initialized.

Although the runtime state's main role is to support the execution of Python code on the interpreter (which we will cover in the next article), it also enables tools such as remote debuggers and profilers.

A remote profiler or a remote debugger attaches to a running Python process and lets you collect information about the Python code that is currently executing in that process. You can do profiling, e.g., collect stack traces over a period of time to find out where the interpreter is spending its time, or you can just dump the current stack trace to see the state of the interpreter. You can also do more interesting things based on the information that the runtime exposes to you.

Now, what do you need to know in order to build something like this? It is an extremely diverse list of topics that comes together to allow you to build one of these tools. Here’s the list:

The CPython runtime internals and the key data structures — you need to know your way around the runtime objects.
ELF file format: ELF is the file format for executables on most Unix-like systems such as Linux and the BSDs. It contains the compiled code and data, and describes how they will be laid out in memory when we start the Python executable. We need to know how to parse it so we can find the address where the runtime state would be loaded in a running Python process.
How ELF files are loaded — We also need to know how the ELF file is loaded into memory by the loader when we start the Python process.
How to efficiently read another process’s memory: The remote debugger/profiler needs to attach to a running Python process and read its memory. How do we do that? What is the overhead?
Connecting all the pieces together and building a very simple remote profiler
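
To give a flavour of the ELF parsing step in the list above, here is a minimal sketch in Python. It is not the code from the session. It only decodes a handful of fields from the fixed 64-byte header of a 64-bit ELF file, and the names `ElfHeader` and `parse_elf64_header` are made up for this illustration. Locating the runtime state would additionally require walking the section headers that this header points to, which the sketch leaves out.

```python
import struct
from typing import NamedTuple

ELF_MAGIC = b"\x7fELF"
ELF64_HEADER_SIZE = 64  # size of the fixed ELF64 file header in bytes


class ElfHeader(NamedTuple):
    """Hypothetical container for a few ELF64 header fields."""
    e_type: int     # object file type (e.g. 2 = executable, 3 = shared object/PIE)
    e_machine: int  # target architecture (e.g. 62 = x86-64)
    e_entry: int    # virtual address of the entry point
    e_phoff: int    # file offset of the program header table
    e_shoff: int    # file offset of the section header table
    e_phnum: int    # number of program headers
    e_shnum: int    # number of section headers


def parse_elf64_header(blob: bytes) -> ElfHeader:
    """Decode selected fields from the first 64 bytes of an ELF64 file."""
    if len(blob) < ELF64_HEADER_SIZE:
        raise ValueError("need at least 64 bytes for an ELF64 header")
    if blob[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file (bad magic)")

    ei_class, ei_data = blob[4], blob[5]
    if ei_class != 2:
        raise ValueError("this sketch only handles 64-bit ELF files")
    if ei_data not in (1, 2):
        raise ValueError("unknown byte order in e_ident")
    byte_order = "<" if ei_data == 1 else ">"  # 1 = little endian, 2 = big endian

    # Fields that follow the 16-byte e_ident array, in file order.
    (e_type, e_machine, _e_version, e_entry, e_phoff, e_shoff,
     _e_flags, _e_ehsize, _e_phentsize, e_phnum,
     _e_shentsize, e_shnum, _e_shstrndx) = struct.unpack_from(
        byte_order + "HHIQQQIHHHHHH", blob, 16
    )
    return ElfHeader(e_type, e_machine, e_entry, e_phoff, e_shoff,
                     e_phnum, e_shnum)
```

On a Linux machine, assuming `sys.executable` points at a 64-bit ELF binary, you could try it with `parse_elf64_header(open(sys.executable, "rb").read(64))`.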

If you are interested in learning all of these things, then you should sign up for this live session.

Prerequisites

It would be great if you have read the article on the CPython runtime internals. But I will cover the important bits at a high level.
C — familiarity with structs and pointers in C. You don’t have to be an expert. And you can stop me if a particular piece of code is not clear during the discussion.

Date & Time

2nd June, 16:00 UTC to 18:30 UTC (2.5 hours)

(This is a lot of surface area to cover, so things might overflow a bit.)

How to Sign Up

As usual, it is free for paid supporters. You can RSVP at the link in the RSVP section below. If you are not a paid supporter, consider upgrading to a paid subscription.

If you have trouble paying on Substack, you can become a member at buymeacoffee and I will upgrade you to a complimentary paid subscription here.

Seats are limited to 100, so do hurry up. This is a very interesting topic and a great way to learn about ELF files and their loading, as well as CPython internals and how tools like remote debuggers work.

Reminder for Upcoming Live Session Tomorrow

Also, I’d like to remind you that there is a live session on May 11th (tomorrow) on CPython’s memory management internals. More details are in the post below:

Live Session: CPython Memory Management Internals

Abhinav Upadhyay

April 30, 2024

Last week we concluded the live session on the internals of CPython’s main bytecode interpreter (the VM), and the response from the attendees has been very encouraging. Next, I want to talk about how CPython implements memory management in its runtime. Most programming languages with managed runtimes use a conventional tracing garbage collector (GC). CPython, however, uses a combination of reference counting and a conventional GC. We will cover how it does that and the following details:

RSVP Link

~~Please register at the following link to book your seat for this event:~~

The session is over now. You can access the recording at the link below:

Recording: CPython and ELF Essentials for Building a Basic Remote Profiler