Hot reloading: polymorphism

Hot reloading has a suite of problems to solve, but perhaps the most fundamental is correcting invalidated pointers. When a module is unloaded and loaded again, we cannot guarantee that it will appear at the same memory location. This means that any polymorphic objects defined in that module are now ticking time bombs, and all compiler-generated type information in that module is now invalid. The next time our program uses virtual functions, dynamic_cast, or typeid on such an object, it will crash¹: the built-in mechanisms that support those features now point to bad memory.

Almost all modern compilers implement virtual dispatch with a vtable, and typically reach the type's run-time type information through it as well. Short for “virtual function table”, it contains pointers to the correct version of each virtual function for the corresponding type. By convention, compilers quietly insert a pointer to that type’s vtable (a vptr) at the start of the object layout. After a reload this pointer is stale, and following it is what causes the access violation exception.
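To make the hidden vptr visible, here is a minimal sketch that peeks at the leading pointer-sized bytes of an object. The types (Plain, Animal, Dog) and the helper peekVptr are hypothetical names used only for illustration, and the code assumes a mainstream ABI (Itanium or MSVC) that places the vptr first; the C++ standard does not guarantee that layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Plain {             // no virtual functions, so no vptr
    int value = 0;
};

struct Animal {            // polymorphic: the compiler adds a hidden vptr
    virtual ~Animal() = default;
    virtual int legs() const { return 0; }
    int value = 0;
};

struct Dog : Animal {
    int legs() const override { return 4; }
};

// Copies out the leading pointer-sized bytes of an object.
// ASSUMPTION: the vptr sits at offset 0, which holds for singly-inherited
// polymorphic types on common ABIs but is not required by the standard.
template <typename T>
std::uintptr_t peekVptr(const T& obj) {
    static_assert(sizeof(T) >= sizeof(std::uintptr_t),
                  "object too small to hold a vptr");
    std::uintptr_t bits = 0;
    std::memcpy(&bits, static_cast<const void*>(&obj), sizeof(bits));
    return bits;
}

// Usage example: the expectations below describe typical compilers only.
void vptrSanityCheck() {
    Dog a, b;
    Animal c;
    assert(peekVptr(a) == peekVptr(b));      // same type, same vtable
    assert(peekVptr(a) != peekVptr(c));      // different type, different vtable
    assert(sizeof(Animal) > sizeof(Plain));  // the hidden vptr costs space
}
```

Every Dog carries the same leading pointer value, which is why a single fresh object is enough to repair all stale objects of that type.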

Sanable addresses the reload problem using a technique called vptr jamming or pointer hydration, which copies the vptrs from a freshly constructed object into every living object of the same type. A naive implementation (like Sanable v1) might rewrite just the first 4 or 8 bytes, that is, one pointer:

MyObject dummy; // Never used directly, just stealing a fresh vptr

std::memcpy(static_cast<void*>(badObj), &dummy, sizeof(void*)); // Overwrite only the leading pointer; object pointers share one size on mainstream platforms, regardless of the pointed-to type

badObj->update(); // No longer crashes!

While this is easy to implement and test, it isn’t robust. With multiple inheritance or virtual inheritance, an object has several vptrs to overwrite, so patching only the first one locks us out of using interface classes (IUpdatable). Additionally, compilers are allowed to store metadata however they see fit: MSVC sometimes adds an additional type info field right after the vptr, and some compilers skip the vtable to put function pointers in the object itself.
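The multiple-inheritance problem can be seen by measuring where each base subobject starts. The sketch below is illustrative only: IDrawable, Sprite, and subobjectOffset are hypothetical names, and the asserted layout is what common ABIs produce rather than something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

struct IUpdatable {
    virtual ~IUpdatable() = default;
    virtual int update() = 0;
};

struct IDrawable {
    virtual ~IDrawable() = default;
    virtual int draw() = 0;
};

// Two polymorphic bases: each base subobject carries its own vptr.
struct Sprite : IUpdatable, IDrawable {
    int frame = 0;
    int update() override { return ++frame; }
    int draw() override { return frame; }
};

// Byte offset of the Base subobject inside a Derived object.
template <typename Base, typename Derived>
std::ptrdiff_t subobjectOffset(const Derived& d) {
    const Base* base = &d;  // implicit upcast; the compiler adjusts the address
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&d);
}

// Usage example. ASSUMPTION: typical Itanium/MSVC layout.
void checkSpriteLayout() {
    Sprite s;
    // The first base usually shares the object's address (and its vptr).
    assert(subobjectOffset<IUpdatable>(s) == 0);
    // The second base starts later, with a second vptr of its own.
    assert(subobjectOffset<IDrawable>(s) >=
           static_cast<std::ptrdiff_t>(sizeof(void*)));
    // Room for two vptrs plus the int field.
    assert(sizeof(Sprite) >= 2 * sizeof(void*) + sizeof(int));
}
```

A jam that rewrites only the bytes at offset 0 would leave the IDrawable vptr stale, so a call through an IDrawable pointer would still crash after a reload.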

We can refine our approach if we know the exact layout of an object in memory, which normally would mean writing a compiler plugin. However, that would only yield information for that specific compiler. Thanks to the offsetof macro, we can identify which bytes correspond to explicitly defined fields and, by process of elimination, which bytes were generated by the compiler. (Strictly speaking, offsetof is only conditionally supported for types that are not standard-layout, which includes polymorphic types, but mainstream compilers accept it.) The remaining gaps must be either compiler-generated constants (metadata we should capture) or padding inserted to respect alignment requirements (safe to overwrite or ignore).
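As a rough sketch of the process of elimination described above: mark every byte covered by a declared field, treat whatever remains as a gap, and copy only the gap bytes from a fresh object into a stale one. FieldSpan, findGaps, jamGaps, and Enemy are hypothetical names for illustration, not Sanable's actual API, and the field list is written by hand here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__GNUC__)
// offsetof on a polymorphic type is only conditionally supported; GCC and
// Clang accept it but warn, so the warning is silenced for this sketch.
#pragma GCC diagnostic ignored "-Winvalid-offsetof"
#endif

struct FieldSpan {
    std::size_t offset;  // where the declared field starts
    std::size_t size;    // how many bytes it covers
};

// Returns a per-byte mask: true means "not covered by any declared field",
// i.e. either compiler-generated metadata or padding.
std::vector<bool> findGaps(std::size_t objectSize,
                           const std::vector<FieldSpan>& fields) {
    std::vector<bool> isGap(objectSize, true);
    for (const FieldSpan& f : fields) {
        assert(f.offset + f.size <= objectSize);
        for (std::size_t i = 0; i < f.size; ++i) isGap[f.offset + i] = false;
    }
    return isGap;
}

// Copies only the gap bytes from a freshly constructed object into a stale
// one, leaving the stale object's declared fields untouched.
void jamGaps(void* stale, const void* fresh, const std::vector<bool>& isGap) {
    auto* dst = static_cast<unsigned char*>(stale);
    const auto* src = static_cast<const unsigned char*>(fresh);
    for (std::size_t i = 0; i < isGap.size(); ++i)
        if (isGap[i]) dst[i] = src[i];
}

struct Enemy {
    virtual ~Enemy() = default;
    virtual int update() { return ++health; }
    char tag = 'e';
    int health = 100;
};

// Hand-written field list for Enemy.
std::vector<bool> enemyGaps() {
    return findGaps(sizeof(Enemy), {
        {offsetof(Enemy, tag), sizeof(Enemy::tag)},
        {offsetof(Enemy, health), sizeof(Enemy::health)},
    });
}
```

Calling jamGaps(&stale, &fresh, enemyGaps()) refreshes the vptr (and harmlessly overwrites padding) while preserving the stale object's tag and health.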

Note: most compilers will append Derived’s new virtual functions to the vtable it shares with its first base (the “Derived cast to BaseA” vtable), so Derived itself does not get an additional vptr. (Source: Effective C++)

Writing out the members of every class by hand is fragile and hard to check reliably for errors, so I wrote a dedicated tool for Sanable v2. Although the tool uses Clang’s Abstract Syntax Tree, it runs as a pre-build step and generates source code that any compiler can use.

We can take this approach one step further and distinguish padding from implicit fields. We create multiple dummy objects, first filling the memory each will occupy with a known value unique to that dummy. Constructors shouldn’t² touch padding bytes (see this post), so bytes that still hold the known value in every dummy must be padding. Bytes that are identical across all dummies are the compiler-generated constants we want to capture. Bytes that differ between dummies but don’t match the known value are marked as unknown/anomalous and ignored.
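A minimal sketch of this fill-and-compare idea, using two dummies and two fill values. ByteKind, classifyBytes, and Probe are hypothetical names. The volatile accesses are an assumption of mine to keep an optimizer from discarding the pre-fill as a dead store, and fields the constructor leaves uninitialized will look like padding here, so the result is meant to be combined with the known field offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

enum class ByteKind { Padding, Constant, Anomalous };

// Volatile writes so the optimizer cannot drop the pre-fill as a dead store.
inline void fillBytes(volatile unsigned char* p, std::size_t n, unsigned char v) {
    for (std::size_t i = 0; i < n; ++i) p[i] = v;
}

template <typename T>
std::vector<ByteKind> classifyBytes() {
    constexpr unsigned char fillA = 0xA5;
    constexpr unsigned char fillB = 0x5A;
    alignas(T) unsigned char bufA[sizeof(T)];
    alignas(T) unsigned char bufB[sizeof(T)];
    fillBytes(bufA, sizeof(T), fillA);
    fillBytes(bufB, sizeof(T), fillB);

    // Default-initialization (no parentheses) on purpose: `new (buf) T()`
    // may zero the whole object first, padding included.
    T* a = new (bufA) T;
    T* b = new (bufB) T;

    const volatile unsigned char* rawA = bufA;
    const volatile unsigned char* rawB = bufB;
    std::vector<ByteKind> kinds(sizeof(T));
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        const unsigned char x = rawA[i];
        const unsigned char y = rawB[i];
        if (x == fillA && y == fillB)
            kinds[i] = ByteKind::Padding;    // constructor never wrote here
        else if (x == y)
            kinds[i] = ByteKind::Constant;   // same value in both dummies
        else
            kinds[i] = ByteKind::Anomalous;  // written, but differs per dummy
    }
    a->~T();
    b->~T();
    return kinds;
}

std::size_t countKind(const std::vector<ByteKind>& kinds, ByteKind k) {
    std::size_t n = 0;
    for (ByteKind x : kinds) n += (x == k) ? 1 : 0;
    return n;
}

struct Probe {
    virtual ~Probe() = default;
    char tag = 'p';
    int health = 100;
};

// Usage example: a char followed by an int leaves alignment padding between them.
void checkProbe() {
    const std::vector<ByteKind> kinds = classifyBytes<Probe>();
    assert(kinds.size() == sizeof(Probe));
    assert(countKind(kinds, ByteKind::Padding) >= 1);
}
```

Using more than two dummies, each with its own fill value, would lower the odds that a genuine constant byte happens to collide with a fill pattern.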

Further steps

A significant drawback of this approach is that constructors must be called. Default constructors are the easy option, followed by constructors whose arguments are trivial types that can be cast from 0 or are themselves default-constructible. Nested nontrivial constructors are almost impossible to resolve programmatically. The approach also assumes that the constructors have no side effects.

One potential workaround would be to analyze the constructors’ machine code at runtime. Although this would require per-platform instruction definitions, it would work for any constructor and would not trigger side effects, since the constructor is inspected rather than executed. The main instructions to look for would be mov/lea for setting vptrs, and call/ret to follow the full constructor hierarchy.

  1. An access violation (Windows) or segmentation fault (*nix). There’s no well-defined way to catch or recover from a crash like this, and even platform-specific methods may not work consistently. atexit hooks are only guaranteed to run on normal termination, so they cannot be relied on here. ↩︎
  2. GCC is known to pre-fill objects’ entire memory, padding included. ↩︎