Live Session: CPython and ELF Essentials for Building a Basic Remote Profiler
Learn some CPython internals, the ELF file format and how ELF files are loaded, and how remote profilers work
In the previous article we covered the internals of the CPython runtime: its key data structures and how they are initialized.
Although the main role of the runtime state is to support Python code execution in the interpreter (which we will cover in the next article), the runtime also enables tools such as remote debuggers and profilers.
A remote profiler or a remote debugger attaches to a running Python process and lets you collect information about the Python code that is currently executing in that process. You can do profiling, e.g., collect stack traces over a period of time to find out where the interpreter is spending its time, or you can just dump the current stack trace to see the state of the interpreter. The information that the runtime exposes also lets you do more interesting things.
Now, what do you need to know in order to build something like this? It is an extremely diverse list of topics that comes together to allow you to build one of these tools. Here’s the list:
The CPython runtime internals and the key data structures: you need to know your way around the runtime objects.
The ELF file format: ELF is the executable file format on most Unix-like systems, such as Linux and the BSDs. It contains the compiled code and data, and describes how they will be laid out in memory when we start the Python executable. We need to know how to parse it so we can find the address at which the runtime state is loaded in a running Python process.
How ELF files are loaded: we also need to know how the loader loads the ELF file into memory when we start the Python process.
How to efficiently read another process’ memory: the remote debugger/profiler needs to attach to a running Python process and read its memory. How do we do that? What is the overhead?
Connecting all the pieces together and building a very simple remote profiler.
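To give a flavour of the ELF parsing item in the list above, here is a minimal Python sketch (not the code from the session) that decodes the fixed-size header at the start of a 64-bit ELF file. The field layout follows the Elf64_Ehdr structure from the ELF specification; the names ElfHeader and parse_elf64_header are made up for this illustration, and 32-bit files are deliberately not handled.

```python
import struct
from typing import NamedTuple

ELF_MAGIC = b"\x7fELF"
# Field layout of Elf64_Ehdr: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_HEADER = "16sHHIQQQIHHHHHH"


class ElfHeader(NamedTuple):
    """Hypothetical container for the header fields we care about."""
    elf_type: int   # e_type: 2 = executable, 3 = shared object / PIE
    machine: int    # e_machine: target architecture (62 = x86-64)
    entry: int      # e_entry: virtual address of the entry point
    phoff: int      # e_phoff: file offset of the program header table
    shoff: int      # e_shoff: file offset of the section header table
    phnum: int      # e_phnum: number of program headers
    shnum: int      # e_shnum: number of section headers
    shstrndx: int   # e_shstrndx: index of the section name string table


def parse_elf64_header(data: bytes) -> ElfHeader:
    """Decode the ELF header from the first 64 bytes of a file."""
    if len(data) < struct.calcsize("<" + ELF64_HEADER):
        raise ValueError("too short to contain an ELF64 header")
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    if data[4] != 2:  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
        raise ValueError("only 64-bit ELF is handled in this sketch")
    # EI_DATA: 1 = little-endian, 2 = big-endian
    endian = "<" if data[5] == 1 else ">"
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from(endian + ELF64_HEADER, data)
    return ElfHeader(e_type, e_machine, e_entry, e_phoff, e_shoff,
                     e_phnum, e_shnum, e_shstrndx)
```

On Linux you could try it on the interpreter itself by reading the first 64 bytes of the file at sys.executable. The header is only the first step: locating the runtime state would additionally require walking the section headers (starting at shoff) and accounting for where the loader placed the binary in memory.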
If you are interested in learning all of these things, do sign up for this live session.
Prerequisites
It would be great if you have read the article on the CPython runtime internals. But I will cover the important bits at a high level.
C — familiarity with structs and pointers in C. You don’t have to be an expert. And you can stop me if a particular piece of code is not clear during the discussion.
Date & Time
2nd June, 16:00 UTC to 18:30 UTC (2.5 hours)
(This is a lot of surface area to cover, so things might overflow a bit.)
How to Sign Up
As usual, it is free for paid supporters. You can RSVP at the link in the next section. If you are not a paid supporter, consider upgrading to a paid subscription.
If you have trouble paying on Substack, you can become a member at buymeacoffee and I will upgrade you to a complimentary paid subscription here.
If you don’t want a subscription, you can attend by buying a ticket to the session at the link below.
Seats are limited to 100, so do hurry up. This is a very interesting topic and a great way to learn about ELF files and their loading, apart from CPython internals and how tools like remote debuggers work.
Reminder for Upcoming Live Session Tomorrow
Also, I’d like to remind you that there is a live session on May 11th (tomorrow) on CPython’s memory management internals. More details are in the post below:
RSVP Link
Please register at the following link to book your seat for this event:
The session is over now. You can access the recording at the link below: