Here's a problem nearly every C++ programmer has encountered. In your code, you've made a call to a function in some DLL and the linker complains that it can't find the symbol. It usually doesn't take too long to figure out that you need to add another library (.LIB) file to the linker's command line. The only problem is, which .LIB file?

From day one, certain formats have remained relatively constant in the Microsoft® Win32® development tools, and many tools have sprung up around them. For example, the Microsoft DUMPBIN utility can be used to display the contents of both Portable Executable files and COFF (Common Object File Format) .OBJ files. (For users of Visual Basic® 5.0, the command line LINK -dump is functionally the same as DUMPBIN.) However, when it comes to libraries, there seems to be a real dearth of tools that can intelligently tell you the contents of a COFF format .LIB file. All 32-bit Microsoft tools use COFF.

Perhaps you need to know if a function is imported by name versus ordinal value. DUMPBIN isn't much help here. Sure, DUMPBIN has a few obscure command options for .LIB files (/ARCHIVEMEMBERS and /LINKERMEMBER, for example). But they just provide raw output of portions of the .LIB file. A few gurus can cast the runes of DUMPBIN's output to figure out what they're after. However, to really see what's in a .LIB file, you need either a good understanding of .LIB file structures or a tool that displays the .LIB contents in a meaningful manner. In this column, I'll provide some relief on both counts.

While mucking about inside .LIB files might appear forbidding, they're really not complicated. Essentially, a .LIB file is just a collection of COFF format .OBJ files strung sequentially together. A table of contents at the beginning tells the linker where things are. Actually, there are two tables of contents, but this detail isn't important for the ensuing discussion.

In my July 1997 column, I described the basic principles of how a linker works. The important factoid for this column is that a linker is responsible for resolving symbols between compilation units. For example, if MyFile1.CPP calls function FooBar in another source file, the linker has to locate the .OBJ file containing FooBar's binary code and include it in the finished executable.

From the linker's perspective, a .LIB file is just a collection of .OBJ files. The table of contents in a .LIB file is a list of all the symbols from all the .OBJs contained in the library. For each symbol, the table of contents also indicates which .OBJ file the symbol came from.
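To make the table-of-contents idea concrete, here is a minimal sketch of the lookup a linker performs conceptually. The LibraryToc type and the FindMemberForSymbol function are hypothetical names invented for illustration; they are not part of any Microsoft tool, and a real linker's data structures are certainly more involved.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical in-memory model of a .LIB table of contents: each symbol
// name maps to the file offset of the archive member (.OBJ) that defines it.
using LibraryToc = std::map<std::string, std::uint32_t>;

// Returns the offset of the .OBJ member to pull in for an unresolved symbol,
// or nothing if this library does not define the symbol.
std::optional<std::uint32_t> FindMemberForSymbol(const LibraryToc& toc,
                                                 const std::string& symbol)
{
    const auto it = toc.find(symbol);
    if (it == toc.end())
        return std::nullopt;   // the linker would try the next library
    return it->second;         // only this one .OBJ needs to be loaded
}
```

When the lookup fails in every library on the command line, the result is the familiar unresolved external error described at the start of this column.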
This mapping of a symbol name to an .OBJ file allows the linker to quickly bring in just the .OBJ from the .LIB file that it needs, while ignoring the rest of the library.

You might be thinking, "What about import libraries? Aren't they special?" Under the Win32 COFF format, the answer is no. The linker resolves calls to DLL functions the same way as it does for internal (static) functions. The only real difference is that when you call a DLL function, the .OBJ file in the import library provides data for the executable's import table rather than code for the actual function.

The data that an import library provides for an imported API is kept in several sections whose names all begin with .idata (for instance, .idata$4, .idata$5, and .idata$6). The .idata$5 section contains a single DWORD that, once the executable is loaded, holds the address of the imported function. The .idata$6 section (if present) contains the name of the imported function. When loading the executable into memory, the Win32 loader uses this string to do the equivalent of calling GetProcAddress on the imported function.

As I described in the July 1997 column, the linker lumps together sections that have the same name up to, but not including, the $. The portion after the $ is used to order the sections. Thus, all the .idata$4 sections are put in the executable contiguously, followed by all the .idata$5 sections, and finishing with all the .idata$6 sections. The linker's combining and sorting of sections is what builds the import address table (IAT) and other parts of the imports table in a finished executable. Not surprisingly, an executable's imports table is usually in a section that is named .idata.

If you've used OLE, COM, or ActiveX®, you probably remember that there are also .LIB files that are used for predefined class IDs (CLSIDs) and interface IDs (IIDs). Both CLSIDs and IIDs are forms of GUIDs, which are 16-byte unique values.
If you poke around in one of these libraries (for instance, UUID.LIB), you'll see that the GUID values are stored in a section called .rdata. The linker takes all the referenced .rdata sections in the .LIB file and creates the .rdata section in the executable. Put differently, every GUID that you reference in your program reserves 16 bytes in the final executable.

The COFF .LIB File Structure

Before I explain how a tool can provide an intelligent display of a .LIB file's contents, it's helpful to have a basic understanding of how COFF .LIBs are constructed. The first thing you'll need to tuck away in your memory banks is that in COFF the words "archive" and "library" are used interchangeably. The second tidbit to remember is that components of a .LIB file are referred to as members. Thus, a .LIB file is really just a series of contiguous archive members. With two exceptions that I'll get to momentarily, each archive member corresponds to an .OBJ file.

All COFF .LIB files begin with an 8-byte header, which reads "!<arch>\n" when viewed as ASCII text. You can see this in WINNT.H as the #define for IMAGE_ARCHIVE_START. Following this header is the first of potentially many archive members. Each archive member begins with a structure called an IMAGE_ARCHIVE_MEMBER_HEADER, which is also defined in WINNT.H. This structure contains information such as the member's name and size. Interestingly, one of the strings in an archive member header is in the octal number format. Yes, these throwbacks to computing's infancy continue to rattle around in today's supercharged barn-burners.

The first two archive members in a COFF .LIB file are special. Instead of .OBJ files, they act as a table of contents to the other archive members (that is, to the .OBJs). These are called linker members (see the IMAGE_ARCHIVE_LINKER_MEMBER #define in WINNT.H).
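As a rough illustration of the archive signature and member header just described, the following sketch checks the 8-byte signature and decodes the text fields of a member header. The ArchiveMemberHeader struct is a local stand-in that mirrors the field layout of IMAGE_ARCHIVE_MEMBER_HEADER so the code compiles without WINNT.H; the helper names (IsCoffArchive, ParseField, MemberSize, MemberMode, NextMemberOffset) are hypothetical. The rule that members start on even offsets comes from the PE/COFF specification rather than from this column.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Local stand-in mirroring IMAGE_ARCHIVE_MEMBER_HEADER (60 bytes of ASCII text).
struct ArchiveMemberHeader {
    char Name[16];      // member name, space-padded
    char Date[12];      // timestamp, decimal
    char UserID[6];     // decimal
    char GroupID[6];    // decimal
    char Mode[8];       // file mode, octal
    char Size[10];      // size of the member data (header excluded), decimal
    char EndHeader[2];  // end-of-header marker
};
static_assert(sizeof(ArchiveMemberHeader) == 60, "unexpected header layout");

constexpr std::size_t kArchiveStartSize = 8;  // length of IMAGE_ARCHIVE_START

// True if the buffer begins with the "!<arch>" + newline signature.
bool IsCoffArchive(const unsigned char* data, std::size_t size)
{
    return size >= kArchiveStartSize &&
           std::memcmp(data, "!<arch>\n", kArchiveStartSize) == 0;
}

// Converts a fixed-width, space-padded ASCII field to a number in the given base.
unsigned long ParseField(const char* field, std::size_t width, int base)
{
    const std::string text(field, width);  // copy so the text is null-terminated
    return std::strtoul(text.c_str(), nullptr, base);
}

unsigned long MemberSize(const ArchiveMemberHeader& h)
{
    return ParseField(h.Size, sizeof h.Size, 10);
}

unsigned long MemberMode(const ArchiveMemberHeader& h)
{
    return ParseField(h.Mode, sizeof h.Mode, 8);  // the octal field
}

// Offset of the member that follows the one whose header sits at headerOffset.
// Members begin on even offsets, so an odd result is rounded up by one.
std::size_t NextMemberOffset(std::size_t headerOffset, const ArchiveMemberHeader& h)
{
    const std::size_t next = headerOffset + sizeof(ArchiveMemberHeader) + MemberSize(h);
    return next + (next & 1);
}
```

Walking a library then amounts to starting at offset 8, reading a header, and calling NextMemberOffset until the end of the file is reached.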
The linker members map a symbol name (for instance, _CreateProcessA@40) to the offset of the archive member containing the code or data associated with that symbol. The two special linker members both contain the same information. The only difference is in how the symbol names are sorted. Figure 1 shows the format of a names linker member. Following the IMAGE_ARCHIVE_MEMBER_HEADER is a DWORD with the number of symbols in the library. Next is an array of DWORD offsets to other archive members in the library. Following the DWORD array is a series of null-terminated symbol name strings. Each successive entry in the DWORD array corresponds to the next string in the string table.
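The layout just described (a count, an array of offsets, then a string table) can be walked with very little code. The sketch below is a minimal, hedged example that targets only the first linker member. It assumes the count and offsets are stored big-endian, which is how the PE/COFF specification documents the first linker member; that detail is not stated in this column. The function names ReadBigEndian32 and ParseFirstLinkerMember are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Reads a 32-bit value stored most-significant byte first.
std::uint32_t ReadBigEndian32(const unsigned char* p)
{
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// body points just past the member's archive member header; size is the
// member's data size. Returns (symbol name, archive member offset) pairs,
// or an empty vector if the data is truncated or malformed.
std::vector<std::pair<std::string, std::uint32_t>>
ParseFirstLinkerMember(const unsigned char* body, std::size_t size)
{
    std::vector<std::pair<std::string, std::uint32_t>> symbols;
    if (size < 4)
        return symbols;

    const std::uint32_t count = ReadBigEndian32(body);
    if (count > (size - 4) / 4)          // offsets array must fit in the member
        return symbols;

    std::size_t namePos = 4 + std::size_t(count) * 4;  // start of string table
    for (std::uint32_t i = 0; i < count; ++i) {
        std::size_t end = namePos;
        while (end < size && body[end] != '\0')
            ++end;
        if (end >= size) {               // name is not null-terminated
            symbols.clear();
            return symbols;
        }
        // The i-th offset pairs with the i-th string in the string table.
        symbols.emplace_back(
            std::string(reinterpret_cast<const char*>(body + namePos), end - namePos),
            ReadBigEndian32(body + 4 + std::size_t(i) * 4));
        namePos = end + 1;
    }
    return symbols;
}
```

Each returned offset points at the archive member header of the .OBJ that defines the symbol, so it can be used directly to seek to that member in the file.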
[Figure 1: Names Linker Member]

[Figure 2: Archive Member]

The format of the other non-names archive members is even simpler. It's just an archive member header, followed by an .OBJ file. If you're not familiar with the layout of an .OBJ file, it consists of an IMAGE_FILE_HEADER followed by one or more IMAGE_SECTION_HEADER structures, one for each code or data section. Next comes the raw code and data for the sections. Bringing up the rear is the symbol table, which correlates symbol names to specific locations in the .OBJ's code and data. All of these data structures are the same as those used in executable files, and are described in WINNT.H. Figure 2 shows the layout of one of these .OBJ-based archive members.

Inside LibDump

If you really understand everything I just described, you could use DUMPBIN with the /ALL option to figure out anything you might want to know about a .LIB file. For example, if you needed to know what the import ordinal for the CreateUpDownControl API is, you'd run DUMPBIN /ALL on COMCTL32.LIB. In the beginning of DUMPBIN's output, you'd find the string "CreateUpDownControl". On the same line would be the offset of the matching .OBJ file. You'd then search the dump output for the archive member at that file offset. Somewhere within the information for that .OBJ, you'd locate the raw data for .idata$5, which reads: |