Hot reloading: dllimport

This is part of an ongoing series on hot reloading, a core feature of Sanable Engine. It is closely related to Unreal’s live coding and Visual Studio’s edit-and-continue: those systems patch only the code that changed, whereas Sanable’s hot reload feature unloads and reloads an entire DLL while also fixing object layouts.

How does dllimport work, anyway?

Let’s take a look at the behavior of the MSVC compiler and linker, though it’s worth noting that most platforms function very similarly. Dllexport adds symbols to the owning module’s export table, which we can view with tools like dumpbin, dllexp, or Dependency Walker. Importing works differently, as the closest thing to an imports table is very informal:

  1. When building the dependency DLL, the linker places a helper .lib next to it. For every symbol that can be imported, the .lib holds a pointer version and the logic to load it. Variables are prefixed __imp_, and functions are given both a pointer and a __imp_ thunk that calls the pointer. Most toolchains place the symbol-loading logic to run before main, but some (like .NET) set up symbols to load the first time they are referenced.
  2. When compiling the dependent DLL or executable, all references are replaced with their __imp_ version, and all unreferenced imports are discarded.
  3. When a DLL or executable is loaded, the helper .lib makes sure all referenced DLLs are loaded before looking up each symbol and placing its address into the imports table.

User sees:

// External header

__declspec(dllimport) bool myImportedFunc(int a, char b, size_t c);
__declspec(dllimport) extern bool result;

Compiler generates:

//Symbols from helper .lib
bool (*__imp_myImportedFunc_loc)(int a, char b, size_t c);
bool __imp_myImportedFunc(int a, char b, size_t c)
{
    return __imp_myImportedFunc_loc(a, b, c);
}

bool* __imp_result;

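void __onLoad()
{
    // GetModuleHandle returns NULL (not INVALID_HANDLE_VALUE) if the module isn't loaded yet
    HMODULE module = GetModuleHandle("path/to/my.dll");
    if (module == NULL) module = LoadLibrary("path/to/my.dll");
    // GetProcAddress returns a generic FARPROC, so each result is cast to its real type
    __imp_myImportedFunc_loc = (bool (*)(int, char, size_t))GetProcAddress(module, "myImportedFunc");
    __imp_result = (bool*)GetProcAddress(module, "result");
}

//User code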

void myDependentFunc()
{
    result = myImportedFunc(1, 'b', 1024);
}

//User code, modified slightly

void myDependentFunc()
{
    *__imp_result = __imp_myImportedFunc(1, 'b', 1024);
}

Each imported function’s thunk then compiles down to a single instruction, an unconditional JMP through an indirect location. Depending on your compiler, it might also skip the thunk entirely and replace every CALL to the __imp_ function with a CALL through that same indirect location.

bool __imp_myImportedFunc(int,char,unsigned long):
  jmp QWORD ptr [__imp_myImportedFunc_loc]  ; Compiles to an instruction-relative address such as [rip-0x1080]

void myDependentFunc(void):
  push rbp
  mov rbp,rsp
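  call __imp_myImportedFunc              ; Call the thunk for external symbol "myImportedFunc" (argument setup omitted); the bool result comes back in AL
  mov rcx,QWORD ptr [__imp_result]       ; Load the address of external symbol "result"
  mov BYTE ptr [rcx],al                  ; Write the return value through that pointer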
  pop rbp
  ret
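
To make the pointer-plus-thunk arrangement concrete, here is a small portable model of it in plain C. It is only a sketch: the names (imported_func_slot, imported_func_thunk, rebind_import, impl_v1, impl_v2) are made up for illustration, and an ordinary function pointer stands in for whatever the real toolchain emits. What it shows is the property hot reloading cares about: dependent code binds to the thunk, so swapping the pointer redirects every caller without touching them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Two stand-ins for successive builds of the same exported function. */
bool impl_v1(int a, char b, size_t c) { (void)a; (void)b; (void)c; return false; }
bool impl_v2(int a, char b, size_t c) { (void)b; return (size_t)a < c; }

/* The import slot: one pointer per imported function (hypothetical name). */
bool (*imported_func_slot)(int, char, size_t) = impl_v1;

/* The thunk: callers bind to this address and never look at the slot. */
bool imported_func_thunk(int a, char b, size_t c)
{
    return imported_func_slot(a, b, c);
}

/* Stand-in for the loader (or a hot reloader) re-resolving the symbol. */
void rebind_import(bool (*new_impl)(int, char, size_t))
{
    assert(new_impl != NULL);
    imported_func_slot = new_impl;
}

/* Dependent code: compiled once, always calls through the thunk. */
bool my_dependent_func(void)
{
    return imported_func_thunk(1, 'b', 1024);
}
```

Calling my_dependent_func() before and after rebind_import(impl_v2) gives different results even though my_dependent_func itself never changes, which is the whole trick in miniature.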

Implications

If we were to compare the apparent addresses of myImportedFunc across different DLLs (or main EXE vs DLL), we’d find they don’t match, because each module sees the address of its own thunk rather than the implementation. However, if we did want to extract the implementation addresses for something like debugging or vtable rewriting, we can easily retrieve them thanks to the consistent format of the __imp_ thunk.
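
As a rough sketch of what "retrieving them" can look like on x86-64, the function below recognizes the six-byte encoding of jmp QWORD ptr [rip+disp32] (bytes FF 25 followed by a little-endian 32-bit displacement) and computes the address of the pointer the thunk jumps through. The function name and the idea of passing the thunk as a byte buffer plus its virtual address are assumptions made to keep the example portable; a real tool would read the bytes from the loaded module and then read the 8-byte pointer stored at the returned address to get the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Length of `jmp qword ptr [rip+disp32]`: opcode FF, ModRM 25, 4-byte displacement. */
enum { JMP_RIP_INDIRECT_LEN = 6 };

/*
 * If `code` (which lives at virtual address `code_addr`) starts with
 * FF 25 disp32, store the address of the pointer slot it jumps through
 * in *slot_addr and return true. Otherwise return false.
 */
bool thunk_slot_address(const uint8_t *code, size_t code_len,
                        uint64_t code_addr, uint64_t *slot_addr)
{
    if (code_len < JMP_RIP_INDIRECT_LEN || code[0] != 0xFF || code[1] != 0x25)
        return false;

    /* Assemble the displacement byte by byte so host endianness doesn't matter. */
    uint32_t raw = (uint32_t)code[2]
                 | ((uint32_t)code[3] << 8)
                 | ((uint32_t)code[4] << 16)
                 | ((uint32_t)code[5] << 24);
    int32_t disp = (int32_t)raw;

    /* RIP-relative operands are measured from the end of the instruction. */
    *slot_addr = code_addr + JMP_RIP_INDIRECT_LEN + (uint64_t)(int64_t)disp;
    return true;
}
```

For a thunk at a hypothetical address 0x140002000 with a displacement of -0x1080, this reports the slot at 0x140000F86. Anything that doesn't start with FF 25 is rejected rather than guessed at, which matters if the toolchain reaches the thunk through some other stub first.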

We can also analyze these imported function thunks to get the locations of the informal imports table, or we can generate our own thunks for variables or fully inlined __imp_ functions. This would allow us to rewrite the table manually, and we can even keep most of our architecture-dependent logic contained in the disassembler.
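
Generating our own thunk is mostly the reverse of decoding one. The sketch below, again for x86-64 only, fills a six-byte buffer with jmp QWORD ptr [rip+disp32] aimed at a pointer slot of our choosing. The function name and parameters are hypothetical, and it only produces the bytes: placing them in executable memory (for example with VirtualAlloc and VirtualProtect on Windows) is a separate step that isn't shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Encode `jmp qword ptr [rip+disp32]` for a thunk that will live at
 * `thunk_addr` and jump through the pointer stored at `slot_addr`.
 * Returns false if the slot is too far away for a 32-bit displacement.
 */
bool write_indirect_jmp(uint8_t out[6], uint64_t thunk_addr, uint64_t slot_addr)
{
    /* The displacement is relative to the end of the 6-byte instruction. */
    int64_t disp = (int64_t)(slot_addr - (thunk_addr + 6));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return false;

    uint32_t raw = (uint32_t)(int32_t)disp;
    out[0] = 0xFF;                  /* opcode: JMP r/m64 */
    out[1] = 0x25;                  /* ModRM: RIP-relative memory operand */
    out[2] = (uint8_t)(raw);        /* disp32, little-endian */
    out[3] = (uint8_t)(raw >> 8);
    out[4] = (uint8_t)(raw >> 16);
    out[5] = (uint8_t)(raw >> 24);
    return true;
}
```

The range check is the part that tends to bite in practice: the slot has to sit within roughly 2 GiB of the thunk, so the two are usually allocated near each other.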