Saturday, September 19, 2009

The Taxonomy of Linking

There are generally at least two steps to building a program, regardless of language. Compiling takes a source code file (operating on one file at a time), parses it, and produces machine code (or some other form, such as Java bytecode) in what's called an object file. One object file is produced from each source file.

Linking then occurs on the set of object files for a project, producing a single binary. 'Linking' refers to connecting the code between object files, so that functions in one file may access functions and data in other object files, which the compiler couldn't do because it works on one file at a time. The linker also adds the operating system-specific wrapper material needed to allow the OS to load and execute the binary.

That's the general idea, anyway. As with most things, reality is more complicated.

What was just described is known as static linking (or build-time linking). Static linking is when called code is included directly in the binary. It's loaded and unloaded with all the other code in that binary, and as such is always available. The code may either come from object files, as above, or from what are called static libraries - essentially archives containing multiple object files combined into one.

However, there are disadvantages to static linking. Because the code is tightly integrated into the binary, it's effectively impossible to update any functions without relinking the entire binary. Furthermore, because the code is included in each binary, functions used by many binaries must be duplicated, wasting space.

The alternative to static linking is dynamic linking. In dynamic linking, common functions are built into stand-alone binaries called dynamic link libraries (DLLs). Functions in these DLLs, unlike in static libraries, are not integrated into other binaries by the linker, but instead are called directly from the other binaries. This requires the functions be exported - a process in which the linker writes information into the DLL about what functions the DLL contains and where those functions exist in the DLL; this data is then used by the operating system to allow other binaries to locate and call these exported functions (importing). As DLLs exist independently of binaries that call exported functions, only a single copy of exported functions need exist on disk or in memory, and DLLs may be updated independently of other binaries.
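To make exporting a little more concrete, here is a minimal sketch of how a function is commonly marked for export with Microsoft-style compilers. The names `MATHLIB_API`, `MATHLIB_BUILD` and `add_numbers` are hypothetical, made up for this illustration. The non-Windows branch is only there so the listing builds elsewhere.

```cpp
// Normally defined on the compiler command line only when building the DLL
// itself; defined here so this single listing builds on its own.
#define MATHLIB_BUILD 1

#if defined(_WIN32)
  #ifdef MATHLIB_BUILD
    // Building the DLL: ask the linker to list the function in the export table.
    #define MATHLIB_API __declspec(dllexport)
  #else
    // Building a caller: tell the compiler the function lives in a DLL.
    #define MATHLIB_API __declspec(dllimport)
  #endif
#else
  // Assumption: on other platforms the macro simply expands to nothing.
  #define MATHLIB_API
#endif

// What would go in the shared header. extern "C" keeps the exported name
// free of C++ name mangling.
extern "C" MATHLIB_API int add_numbers(int a, int b);

// What would go in the DLL's source file.
extern "C" MATHLIB_API int add_numbers(int a, int b) {
    return a + b;
}
```

In a real project the declaration and the definition would sit in separate files, and only the DLL's own build would define `MATHLIB_BUILD`.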

Dynamic linking may further be classified as load-time linking or run-time linking. In load-time linking the linker, when creating a binary that calls functions in DLLs, creates an import table, which lists all DLLs and functions therein that the binary requires. The operating system then, when loading the binary, automatically loads all of these DLLs and finds the addresses of all imported functions, writing the addresses to fixed places in the binary; the binary may then trivially call the functions through these pointers that are at known locations. This is entirely automatic; the DLLs are automatically loaded when the calling binary is, and unloaded when the calling binary is unloaded (assuming no other binaries still need them), ensuring they're always available when needed.
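The loader's job here can be modeled in a few lines. This is a toy model only: real export and import tables are binary structures inside the files, and the operating system does the work, not your code. All names (`ExportEntry`, `ImportEntry`, `resolve_imports` and so on) are invented for the sketch.

```cpp
#include <cstring>

using Func = int (*)(int);

// Functions living inside the pretend DLL.
static int twice(int x)  { return 2 * x; }
static int square(int x) { return x * x; }

// "DLL" side: the export table maps names to function addresses.
struct ExportEntry { const char* name; Func address; };
static const ExportEntry kExports[] = {
    {"twice",  twice},
    {"square", square},
};

// "EXE" side: the import table lists the names the binary needs, with one
// slot per import at a fixed, known position.
struct ImportEntry { const char* name; Func slot; };
static ImportEntry g_imports[] = {
    {"square", nullptr},   // slot 0
    {"twice",  nullptr},   // slot 1
};

// What the loader does at load time: look up every imported name in the
// export table and write the address into the slot.
bool resolve_imports() {
    for (ImportEntry& imp : g_imports) {
        imp.slot = nullptr;
        for (const ExportEntry& e : kExports) {
            if (std::strcmp(imp.name, e.name) == 0) {
                imp.slot = e.address;
                break;
            }
        }
        if (imp.slot == nullptr) {
            return false;  // a missing import means the binary fails to load
        }
    }
    return true;
}

// Calling code never searches by name; it just calls through its slot.
int call_square(int x) { return g_imports[0].slot(x); }
int call_twice(int x)  { return g_imports[1].slot(x); }
```

Once `resolve_imports()` has run, `call_square(7)` is a single indirect call through slot 0, which is why calls into a load-time linked DLL are cheap.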

To accomplish this, the linker needs what are known as import libraries. These are generated by the linker when it builds the DLL, and contain a list of all functions exported by the DLL. The reason this is necessary is that DLLs themselves contain only a function name (or ordinal number) for each export - only enough information to allow the operating system to locate the desired function; they may not include information such as what parameters a function takes or the calling convention to use to call the function, which are necessary to seamlessly link to the function. A consequence of this is that both the DLL name and all functions needed must be known when the calling binary is linked.

The other type of dynamic linking is run-time linking. Unlike load-time linking, in which the entire importing process is automatic, the run-time linking process is entirely manual. You must manually load the DLL when you need it, call OS functions to get pointers to the desired functions (which you must manually specify by name/ordinal), manually call the functions through the obtained pointers, making sure you use the right parameters and calling convention, and finally manually unload the DLL when you no longer need it.
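Those manual steps look roughly like the sketch below. On Windows it uses `LoadLibraryA`, `GetProcAddress` and `FreeLibrary`. The `dlopen` branch is only there so the listing also builds on POSIX systems. The helper `call_from_library` and the `UnaryFn` signature are assumptions made up for the example. Nothing in the DLL verifies that the signature you cast to is the right one.

```cpp
#ifdef _WIN32
  #include <windows.h>
#else
  #include <dlfcn.h>
#endif

// The signature we *assume* the target function has (hypothetical).
using UnaryFn = double (*)(double);

// Loads `library`, looks up `symbol`, calls it with `arg`, then unloads.
// Returns false if the library or the symbol cannot be found.
bool call_from_library(const char* library, const char* symbol,
                       double arg, double* result) {
#ifdef _WIN32
    HMODULE module = LoadLibraryA(library);            // 1. load the DLL
    if (module == nullptr) return false;
    FARPROC raw = GetProcAddress(module, symbol);      // 2. find the function
    if (raw == nullptr) { FreeLibrary(module); return false; }
    UnaryFn fn = reinterpret_cast<UnaryFn>(raw);       // 3. trust the signature
    *result = fn(arg);                                 // 4. call it
    FreeLibrary(module);                               // 5. unload the DLL
    return true;
#else
    void* module = dlopen(library, RTLD_NOW);          // 1. load the library
    if (module == nullptr) return false;
    void* raw = dlsym(module, symbol);                 // 2. find the function
    if (raw == nullptr) { dlclose(module); return false; }
    UnaryFn fn = reinterpret_cast<UnaryFn>(raw);       // 3. trust the signature
    *result = fn(arg);                                 // 4. call it
    dlclose(module);                                   // 5. unload the library
    return true;
#endif
}
```

A call might look like `call_from_library("plugin.dll", "compute", 1.5, &value)`, where both the DLL and the function name are placeholders. Note how much of this load-time linking would have done for you.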

As this is much, much less convenient, you generally use load-time linking whenever possible, reserving run-time linking only for those times when you can't use load-time linking for one reason or another. Generally this is because you don't know the name of the DLL (or the function to import) at link-time. There may be multiple DLLs with different versions of a function (e.g. in Diablo II there are DLLs for the graphics system, with separate DLLs for DirectDraw, Direct3D, and Glide); the DLL may be loaded based on user action (e.g. loading plugins); the DLL may not exist until run-time (e.g. the calling binary must download the DLL first); etc.

(for more general information on DLLs, see The Old New Thing)

Sometimes, however, you end up in the unenviable position of not being able to use either. This is the situation I found myself in when linking ThunderGraft to the FMOD Ex DLL. Because the DLL will be packed into a Self-Executing MPQ and extracted as a temporary file with a random name, I couldn't use load-time linking (which requires a fixed DLL file name). However, the fact that the DLL was exporting full classes meant that I couldn't use run-time linking either, as the names were too complex to manually specify (with such memorable names as "?createSound@System@FMOD@@QAG?AW4FMOD_RESULT @@PBDIPAUFMOD_CREATESOUNDEXINFO@@PAPAVSound@2@@Z" [spaces inserted so the blog formatting doesn't get all screwed up by the huge string]). Static linking was also out, as the authors do not make a static library of FMOD available (which isn't too surprising as static libraries tend to be compiler- and version-specific).

Fortunately, there's one more option: delay-loading. Delay-loading is essentially a neat linker trick. It uses an import library for a DLL, but instead of generating import tables it generates code that loads the DLL and gets the imported function addresses through the run-time linking functions; the DLL is loaded and the function addresses obtained when an imported function is first called (or when you tell the linker-generated code to do so at a time of your choosing). This allows the convenience of load-time linking with the flexibility of run-time linking, and allows you to get out of sticky situations like this one.
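With Microsoft's linker this is switched on with the `/DELAYLOAD` option (plus linking `delayimp.lib`), so there's no code to write. Still, the idea behind the generated code is easy to simulate. The sketch below is a portable stand-in, not the actual helper: `resolve_triple` plays the role of the load-and-look-up step, and every name in it is hypothetical.

```cpp
using Func = int (*)(int);

static int g_resolve_count = 0;   // how many times the "DLL" was looked up

// Stands in for the function that really lives in the DLL.
static int real_triple(int x) { return 3 * x; }

// Stands in for loading the DLL and looking up the function's address.
static Func resolve_triple() {
    ++g_resolve_count;
    return real_triple;
}

static int triple_stub(int x);    // forward declaration

// The import slot starts out pointing at the stub, not the real function.
static Func g_triple_slot = triple_stub;

// The first call lands here: resolve, patch the slot, forward the call.
static int triple_stub(int x) {
    g_triple_slot = resolve_triple();
    return g_triple_slot(x);
}

// Calling code always goes through the slot, just as with load-time linking.
int triple(int x) { return g_triple_slot(x); }

// Exposed only so the behavior can be observed from outside.
int resolve_count() { return g_resolve_count; }
```

The first `triple(2)` triggers the lookup, and every later call goes straight to the real function. The lookup is the step that can be deferred until the DLL is actually available.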

Of course, in this case it was all due to poor design in FMOD to begin with. It's a well-known principle that you should never export full classes; a much better way is to export pure virtual classes. This avoids the problem because functions in virtual classes are not exported individually, but rather accessed through the v-table; no exports to link to, no names to mangle, etc. It's to avoid this very problem that COM chose to use pure virtual classes as the basis of the COM object system, after all, and there really isn't any better alternative.
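For illustration, a pure virtual interface in that style might look like the sketch below. This is not FMOD's or COM's actual API: `ISound`, `SoundImpl` and `create_sound` are names I made up. The point is that the only thing the DLL has to export is one plain factory function with an unmangled name.

```cpp
// Public header: only pure virtual functions, no data members, nothing
// whose mangled name a caller would have to link against.
class ISound {
public:
    virtual int play(int volume) = 0;
    virtual void release() = 0;
protected:
    ~ISound() = default;   // callers destroy objects through release() only
};

// DLL-private implementation; callers never see this class.
namespace {
class SoundImpl final : public ISound {
public:
    int play(int volume) override {
        last_volume_ = volume;
        return last_volume_;   // pretend status: echoes the volume used
    }
    void release() override { delete this; }
private:
    int last_volume_ = 0;
};
}  // namespace

// The one function that would be exported. extern "C" keeps its name
// unmangled, so it is easy to look up by name at run time.
extern "C" ISound* create_sound() {
    return new SoundImpl();
}
```

A caller gets the factory's address, calls it once, and then every `sound->play(...)` goes through the v-table with no further imports needed.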
