I work on Cinder, a just-in-time (JIT) compiler built on top of CPython. If you aren’t familiar with Cinder and want to learn more, a previous post about the inliner gives a decent overview of the JIT. This post will talk about our function symbolizer, why we added it, and how it works.
To follow along, check out symbolizer.cpp. If you notice something amiss, please let me know! Either send me an email, post on ~max/blog-comments, or comment on one of the various angry internet sites this will eventually get posted to.
The JIT transforms Python bytecode to machine code. Along the way, we support printing the intermediate representations (IRs) for debugging. We also support disassembling the resulting machine code for the same reason.
The various IRs and machine code contain references to C and C++ functions by address. While a running process only needs the address to go about its job, software engineers like me need a little more than 0x3A28213A to debug things. This leaves us wanting a function that can go from address to function name: a symbolizer.
You might wonder why we don’t instead keep all of the names inside the instructions. After all, we probably add the function pointers by name (like env.emit<CallCFunction>(PyNumber_Add, ...)). Why not also add the string "PyNumber_Add" alongside it?
I quite honestly do not have a good answer. I think it would take work to thread all of that additional information through the system so that we can guarantee it, but:
In the end I decided to do what other projects like HHVM seem to do and wrote the darn symbolizer.
I wanted names in our debug output. Seeing stuff like CallCFunction<0x6339392C> v1 v7 was driving me batty.[1] How am I supposed to know what that represents? Sure, I can kind of make an inference from the context, but it’s not pleasant.
And it’s even worse in the machine code, where there are no names. Take a look at an example dump of assembly code from the JIT:
Epilogue
mov 0x118(%rdi),%rsi
btq $0x0,0x8(%rsi)
mov (%rsi),%rsi
mov %rsi,0x118(%rdi)
jae 0x7f19fdf1d2f3
mov %rax,-0x8(%rbp)
callq *0x69(%rip) # 0x7f19fdf1d358
mov -0x8(%rbp),%rax
We can see that the disassembler has helpfully annotated the RIP-relative call with the address it found later in the instruction stream. But that number is still meaningless to me. I would much rather have the following:
Epilogue
mov 0x118(%rdi),%rsi
btq $0x0,0x8(%rsi)
mov (%rsi),%rsi
mov %rsi,0x118(%rdi)
jae 0x7f19fdf1d2f3
mov %rax,-0x8(%rbp)
callq *0x69(%rip) # 0x7f19fdf1d358 (JITRT_UnlinkFrame(_ts*))
mov -0x8(%rbp),%rax
Beautiful. A crisp, clear function name. So how do we get there?
For some cases in the project we already used dladdr as a limited symbolizer. Unfortunately, dladdr only works if the function is in some .so that your application loaded. If you are trying to symbolize a function from your own executable, you’re out of luck.
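To make that limitation concrete, here is a minimal sketch of a dladdr-only symbolizer. The helper name nameFromDladdr is made up for this post and is not part of Cinder. It assumes Linux with glibc, and older glibc versions may also need -ldl at link time.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc only declares dladdr when this is defined
#endif
#include <dlfcn.h>

#include <cstdio>
#include <optional>
#include <string>

// Hypothetical helper: ask the dynamic loader which symbol lives at addr.
// Returns nullopt when no loaded object covers the address, or when the
// object is known but no matching dynamic symbol was found.
std::optional<std::string> nameFromDladdr(const void* addr) {
  Dl_info info;
  if (::dladdr(addr, &info) != 0 && info.dli_sname != nullptr) {
    return std::string(info.dli_sname);
  }
  return std::nullopt;
}
```

Calling this with the address of a libc function such as puts should produce some name, possibly an alias of the one you expected. Calling it with the address of a static function in your own executable typically produces nothing, which is the gap the rest of this post tries to fill.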
I learned somewhere that at least for ELF binaries (and probably other executable formats), there are names stored in the header. I had no idea how to read my own ELF header. I tried to read from the start of the executable and found an ELF header! It was great! And then I tried to read a section header and got a segfault.
I learned (from Employed Russian, as apparently everybody who works on low-level things does) that section headers are not loaded into memory at process start. Bummer. So how do we read the header?
Well, we loaded the executable from the disk on process boot. Why not read it again? I went off to mmap the file /proc/self/exe so that I could read from that instead.
I had some crashes, so I went to see if Valgrind could track down anything weird for me. It turns out, though, that Valgrind had a bug where it wouldn’t intercept the open of /proc/self/exe for the mmap, so I was actually reading Valgrind’s executable instead of my own when trying to track down my memory error. Talk about multiple levels of confusion. At the time I wrote the symbolizer, the bug had been fixed, but I did not have the latest version on hand.
I finally got the constructor working:
Symbolizer::Symbolizer(const char* exe_path) {
  int exe_fd = ::open(exe_path, O_RDONLY);
  if (exe_fd == -1) {
    JIT_LOG("Could not open %s: %s", exe_path, ::strerror(errno));
    return;
  }
  // Close the file descriptor. We don't need to keep it around for the mapping
  // to be valid and if we leave it lying around then some CPython tests fail
  // because they rely on specific file descriptor numbers.
  SCOPE_EXIT(::close(exe_fd));
  struct stat statbuf;
  int stat_result = ::fstat(exe_fd, &statbuf);
  if (stat_result == -1) {
    JIT_LOG("Could not stat %s: %s", exe_path, ::strerror(errno));
    return;
  }
  off_t exe_size_signed = statbuf.st_size;
  JIT_CHECK(exe_size_signed >= 0, "exe size should not be negative");
  exe_size_ = static_cast<size_t>(exe_size_signed);
  exe_ = reinterpret_cast<char*>(
      ::mmap(nullptr, exe_size_, PROT_READ, MAP_PRIVATE, exe_fd, 0));
  if (exe_ == reinterpret_cast<char*>(MAP_FAILED)) {
    JIT_LOG("could not mmap");
    exe_ = nullptr;
    return;
  }
  auto elf = reinterpret_cast<ElfW(Ehdr)*>(exe_);
  auto shdr = reinterpret_cast<ElfW(Shdr)*>(exe_ + elf->e_shoff);
  const char* str = exe_ + shdr[elf->e_shstrndx].sh_offset;
  for (int i = 0; i < elf->e_shnum; i++) {
    if (shdr[i].sh_size) {
      if (std::strcmp(&str[shdr[i].sh_name], ".symtab") == 0) {
        symtab_ = reinterpret_cast<ElfW(Shdr)*>(&shdr[i]);
      } else if (std::strcmp(&str[shdr[i].sh_name], ".strtab") == 0) {
        strtab_ = reinterpret_cast<ElfW(Shdr)*>(&shdr[i]);
      }
    }
  }
  // ...
}
In this blob of constructor code, we open the file to get a file descriptor, fstat the file to get its size, mmap the file so we can read from its contents, and walk the section headers to find .symtab and .strtab.
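If you want to poke at these steps outside of Cinder, here is a standalone sketch that maps an ELF file and lists its section names. The helpers sectionNames and hasSection are names I made up for illustration. Unlike the constructor above, this version copies the names out and unmaps the file right away, and it adds a few sanity checks that I am assuming are reasonable rather than exhaustive.

```cpp
#include <elf.h>
#include <fcntl.h>
#include <link.h>  // ElfW
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: return the names of all sections in the ELF file at
// path, or an empty vector if the file cannot be read or does not look sane.
std::vector<std::string> sectionNames(const char* path) {
  std::vector<std::string> names;
  int fd = ::open(path, O_RDONLY);
  if (fd == -1) return names;
  struct stat st;
  if (::fstat(fd, &st) == -1 ||
      st.st_size < static_cast<off_t>(sizeof(ElfW(Ehdr)))) {
    ::close(fd);
    return names;
  }
  size_t size = static_cast<size_t>(st.st_size);
  void* mem = ::mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
  ::close(fd);  // the mapping stays valid after the descriptor is closed
  if (mem == MAP_FAILED) return names;

  const char* base = static_cast<const char*>(mem);
  auto* ehdr = reinterpret_cast<const ElfW(Ehdr)*>(base);
  // Check the magic number and that the section header table fits in the file.
  bool ok = std::memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0 &&
      ehdr->e_shoff != 0 && ehdr->e_shstrndx < ehdr->e_shnum &&
      ehdr->e_shoff + ehdr->e_shnum * sizeof(ElfW(Shdr)) <= size;
  if (ok) {
    auto* shdr = reinterpret_cast<const ElfW(Shdr)*>(base + ehdr->e_shoff);
    // Section names live in the section header string table.
    const char* strs = base + shdr[ehdr->e_shstrndx].sh_offset;
    for (int i = 0; i < ehdr->e_shnum; i++) {
      names.emplace_back(strs + shdr[i].sh_name);
    }
  }
  ::munmap(mem, size);
  return names;
}

// Hypothetical helper: true if wanted appears in names.
bool hasSection(const std::vector<std::string>& names,
                const std::string& wanted) {
  for (const std::string& name : names) {
    if (name == wanted) return true;
  }
  return false;
}
```

Running hasSection(sectionNames("/proc/self/exe"), ".symtab") is a quick way to find out whether your binary was stripped. If it was, there is no full symbol table to walk and this whole approach has nothing to read.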
Through a bunch of trial and error and reading too much half-working code on the internet and too many manual pages, I got the symbolizer working! I managed to make it symbolize function names from our executable and fall back to dladdr for symbols in shared objects.
std::optional<std::string_view> Symbolizer::symbolize(const void* func) {
  // Try the cache first. We might have looked it up before.
  auto cached = cache_.find(func);
  if (cached != cache_.end()) {
    return cached->second;
  }
  // Then try dladdr. It might be able to find the symbol.
  Dl_info info;
  if (::dladdr(func, &info) != 0 && info.dli_sname != nullptr) {
    return cache(func, info.dli_sname);
  }
  if (!isInitialized()) {
    return std::nullopt;
  }
  // Fall back to reading our own ELF header.
  auto sym = reinterpret_cast<ElfW(Sym)*>(exe_ + symtab_->sh_offset);
  const char* str = exe_ + strtab_->sh_offset;
  for (size_t i = 0; i < symtab_->sh_size / sizeof(ElfW(Sym)); i++) {
    if (reinterpret_cast<void*>(sym[i].st_value) == func) {
      return cache(func, str + sym[i].st_name);
    }
  }
  // ...
}
In this snippet, we check the cache (an unordered_map), then try dladdr, and finally walk the symbol table we found in the executable.
Problem solved, right? Nope:
If we reference a private symbol from the Cinder .so, our fancy symbol table walker won’t be able to find it because it only reads from the executable. dladdr won’t be able to resolve it either.
I borrowed some of our tech lead Matt Page’s code for reading .sos and that solved those problems. I’m not super sure why this code is different from the code for reading the executable ELF header. It looks very similar. Maybe they can be combined. If you’re going to borrow some of our code, his looks more correct. It handles more edge cases.
Then, finally, since we’re using C++, we get fun mangled names. I used abi::__cxa_demangle to get a nice readable name.
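For reference, a demangling wrapper can be quite small. This is a sketch rather than the exact code in Cinder. The function name demangle is my own, and it assumes a toolchain that ships <cxxabi.h> (GCC or Clang).

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper: return the demangled form of an Itanium C++ ABI symbol
// name, or the input unchanged if it cannot be demangled (for example, a
// plain C function name).
std::string demangle(const char* mangled) {
  int status = 0;
  // With a null output buffer, __cxa_demangle mallocs the result and the
  // caller has to free it.
  std::unique_ptr<char, decltype(&std::free)> out(
      abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
  if (status == 0 && out != nullptr) {
    return std::string(out.get());
  }
  return std::string(mangled);
}
```

With this, demangle("_Z17JITRT_UnlinkFrameP3_ts") gives back JITRT_UnlinkFrame(_ts*), which is the name in the annotated assembly dump earlier in the post. C symbols like PyNumber_Add pass through untouched.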
This symbolizer only supports Linux/ELF. It won’t work on macOS, which uses Mach-O. I have no idea about BSDs and friends. It should be 32-bit compatible out of the box, though, due to use of ElfW instead of its explicitly-sized variants.
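If the ElfW macro is new to you, this tiny sketch shows what it does on glibc: it pastes the native word size into the type name, so ElfW(Ehdr) becomes Elf64_Ehdr or Elf32_Ehdr. The constant kNativeElfIs64 is a name I made up for the example.

```cpp
#include <elf.h>
#include <link.h>  // defines ElfW on glibc

#include <type_traits>

// ElfW(Ehdr) names the ELF header type for the platform being compiled for.
// Hypothetical constant: true when that native type is the 64-bit variant.
constexpr bool kNativeElfIs64 = std::is_same_v<ElfW(Ehdr), Elf64_Ehdr>;

// The same holds for the other structures the symbolizer reads.
static_assert(std::is_same_v<ElfW(Sym), Elf64_Sym> == kNativeElfIs64);
static_assert(std::is_same_v<ElfW(Shdr), Elf64_Shdr> == kNativeElfIs64);
```

Because the symbolizer only ever spells the types through ElfW, the same source should pick up the 32-bit structures when compiled for a 32-bit target.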
A symbolizer that has to support very few platforms can be written in a couple hundred lines and understood. Hopefully it’s reusable. Let me know what weird bugs you run into if you use it.
This isn’t meant to be fast. I have no idea how very slow it is. Please don’t tell me. I added a cache so I wouldn’t have to think about it too hard. The thought is that once a symbol is looked up, it will likely be looked up again in dumps of a future compiler pass.
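The caching idea can be sketched independently of ELF. CachingSymbolizer and SlowLookup are hypothetical names, and the slow lookup is passed in as a callback so that the example stands alone. Cinder's real Symbolizer is structured differently.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper that memoizes an expensive address-to-name lookup.
class CachingSymbolizer {
 public:
  using SlowLookup = std::function<std::optional<std::string>(const void*)>;

  explicit CachingSymbolizer(SlowLookup slow) : slow_(std::move(slow)) {}

  std::optional<std::string_view> symbolize(const void* addr) {
    // Fast path: we have resolved this address before.
    auto it = cache_.find(addr);
    if (it != cache_.end()) {
      return std::string_view(it->second);
    }
    // Slow path: do the real lookup and remember a successful result.
    std::optional<std::string> name = slow_(addr);
    if (!name.has_value()) {
      return std::nullopt;  // misses are not cached in this sketch
    }
    auto inserted = cache_.emplace(addr, std::move(*name));
    return std::string_view(inserted.first->second);
  }

 private:
  SlowLookup slow_;
  // The map owns the strings. unordered_map keeps references to its elements
  // valid across rehashing, so the returned views stay valid as long as the
  // entry is not erased and the symbolizer is alive.
  std::unordered_map<const void*, std::string> cache_;
};
```

The second lookup of the same address never reaches the slow path, which is all I wanted from the cache: dumps from later compiler passes tend to mention the same handful of functions again.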
It’s going to be a minute before I go spelunking through ELF again.
Google’s Abseil library includes a symbolizer. Same with the folly library: symbolizer and elf utils.
Also, I did a lot of reading through ClickHouse’s symbol indexer and it helped clear up some misconceptions and outright errors from various StackOverflow answers.
HHVM has a symbolizer that relies on Folly but apparently they also support using LibBFD.
Last, I heard that you can use libbacktrace as a sort of symbolizer, but you need to link with -rdynamic and it won’t find static/private symbols.
ELF from scratch by Conrad Kleinespel helped me understand how the headers are laid out. Diagrams are nice and all but it’s super helpful to see sample code iterating over section headers and stuff.
[1] Actually, it’s worse than that. We didn’t even print the addresses originally.