Yesterday, we did the live session on the internals of remote sampling profilers and covered what it takes to build one. Building such a tool is probably one of the most interesting systems programming projects you can take on: you learn not only the internals of a programming language, but also the ELF file format.
Specifically, a remote sampling profiler for a language like Python can attach to a running Python process. It then reads the memory of that process to figure out what Python code the interpreter is executing and extracts its stack trace.
To be able to do this, three pieces are required:
The ability to read the memory of an external process.
Knowledge of the Python runtime data structures that you need to traverse to extract the stack trace, once you can read the memory of the Python process.
The memory addresses of these data structures in the running Python process. This is the trickiest and most interesting part. It requires parsing the Python executable to read the address that the linker assigned to the runtime data structure, and it also requires understanding how the program loader maps the executable into memory.
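To make the first and third pieces a little more concrete, here is a minimal sketch in Python, assuming a Linux target. It is not the code from the session. The helper names, the sample `/proc/<pid>/maps` text, and the symbol offset are all hypothetical. It also assumes that the symbol's value has already been read from the ELF symbol table (in CPython this could be a symbol such as `_PyRuntime`), and that for a position-independent executable the first loadable segment has virtual address 0, so the runtime address is simply the load base plus the symbol value.

```python
from typing import Optional, Tuple

# Hypothetical excerpt of /proc/<pid>/maps for a Python process.
SAMPLE_MAPS = """\
55d0c0a00000-55d0c0a6b000 r--p 00000000 08:01 1234   /usr/bin/python3.11
55d0c0a6b000-55d0c0d00000 r-xp 0006b000 08:01 1234   /usr/bin/python3.11
7ffd1c5e0000-7ffd1c601000 rw-p 00000000 00:00 0      [stack]
"""


def parse_maps_line(line: str) -> Tuple[int, int, str, int, str]:
    """Split one maps line into (start, end, perms, file_offset, path)."""
    fields = line.split(maxsplit=5)
    start_s, end_s = fields[0].split("-")
    # Anonymous mappings have no path field at all.
    path = fields[5].strip() if len(fields) > 5 else ""
    return int(start_s, 16), int(end_s, 16), fields[1], int(fields[2], 16), path


def find_load_base(maps_text: str, binary_name: str) -> Optional[int]:
    """Return the start address of the mapping that covers file offset 0
    of the given binary, or None if the binary is not mapped."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        start, _end, _perms, offset, path = parse_maps_line(line)
        if offset == 0 and path.endswith(binary_name):
            return start
    return None


def runtime_address(load_base: int, symbol_value: int, is_pie: bool) -> int:
    """Translate a symbol value from the ELF file into a runtime address.

    Assumption: for a PIE binary the symbol value is relative to the load
    base; for a non-PIE binary it is already an absolute address.
    """
    return load_base + symbol_value if is_pie else symbol_value


def read_remote(pid: int, address: int, size: int) -> bytes:
    """Read raw bytes from another process through /proc/<pid>/mem.

    Linux only. The caller needs ptrace-level permission on the target,
    which in practice often means running as root.
    """
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(size)
```

With the sample above, `find_load_base(SAMPLE_MAPS, "python3.11")` gives the load base, `runtime_address` turns a symbol value from the binary into an address in the live process, and `read_remote` would fetch the bytes stored there. Interpreting those bytes is where the second piece, knowledge of the runtime data structures, comes in.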
We covered all of this yesterday. It was quite a bit of detail to go through but worth it!
You can also access the slides at the link below: