The following is an excerpt from the book Expert C Programming: Deep C Secrets by Peter van der Linden. (Englewood Cliffs: Prentice-Hall/Sun Microsystems Press, 1994, ISBN 0-13-177429-8, pages 118-123).

I agree with the author when he says, "Expert C Programming should be every programmer's second book on C."

The section reproduced here contains information that will be appreciated by software developers who are making the transition to UNIX after having learned to write C programs in some other environment.

...RSS


Five Special Secrets of Linking with Libraries

There are five essential, non-obvious conventions to master when using libraries. These aren't explained very clearly in most C books or manuals, probably because the language documenters consider linking to be part of the surrounding operating system, while the operating system people view linking as part of the language. As a result, no one makes much more than a passing reference to it unless someone from the linker team gets involved!

Here are the essential UNIX linking facts of life:


  1. Dynamic libraries are called libsomething.so, and static libraries are called libsomething.a

    By convention, all dynamic libraries have a filename of the form libname.so (version numbers may be appended to the name). Thus the library of thread routines is called libthread.so. Static libraries (archives) follow the same pattern with a .a extension: libname.a.


  2. You tell the compiler to link with, for example, libthread.so by giving the option -lthread

    The command line argument to the C compiler doesn't mention the entire pathname to the library file. It doesn't even mention the full name of the file in the library directory! Instead, the compiler is told to link against a library with the command line option -lname where the library is called libname.so--in other words, the "lib" part and the file extension are dropped, and -l is jammed on the beginning instead.
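
    The naming rule can be sketched in C. The helper below is hypothetical (library_to_option is not part of any compiler or linker); it simply strips the "lib" prefix and the .so or .a suffix, including any appended version number, and jams -l on the front.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Turns "libNAME.so", "libNAME.so.1" or "libNAME.a" into "-lNAME".
 * Returns 0 on success, or -1 if the filename does not follow the
 * convention or the output buffer is too small. */
int library_to_option(const char *filename, char *out, size_t outlen)
{
    const char *name, *end;
    size_t len;

    if (strncmp(filename, "lib", 3) != 0)
        return -1;                      /* no "lib" prefix */
    name = filename + 3;

    /* The name ends at ".so" (optionally followed by ".version")... */
    end = strstr(name, ".so");
    if (end != NULL && end[3] != '\0' && end[3] != '.')
        end = NULL;                     /* e.g. ".sox" is not a match */

    /* ...or at a trailing ".a". */
    if (end == NULL) {
        size_t n = strlen(name);
        if (n < 2 || strcmp(name + n - 2, ".a") != 0)
            return -1;
        end = name + n - 2;
    }

    len = (size_t)(end - name);
    if (len == 0 || len + 3 > outlen)   /* "-l" + name + '\0' */
        return -1;
    snprintf(out, outlen, "-l%.*s", (int)len, name);
    return 0;
}
```

    Given "libthread.so" the helper writes "-lthread" into the buffer; a filename that does not begin with "lib" makes it return -1.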


  3. The compiler expects to find the libraries in certain directories.

    At this point, you may be wondering how the compiler knows in which directory to look for the libraries. Just as there are special rules for where to find header files, so the compiler looks in a few special places such as /usr/lib/ for libraries. For instance, the threads library is in /usr/lib/libthread.so.

    The compiler option -Lpathname is used to tell the linker a list of other directories in which to search for libraries that have been specified with the -l option. There are a couple of environment variables, LD_LIBRARY_PATH and LD_RUN_PATH, that can also be used to provide this information. Using these environment variables is now officially frowned on, for reasons of security, performance, and build/execute independence. Use the -Lpathname -Rpathname options at linktime instead.
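
    The directory search can be pictured as a simple loop. The sketch below is an illustration, not a description of how any particular linker is implemented: find_library is a hypothetical helper, and trying libname.so before libname.a within each directory is an assumption about typical linker behavior. The caller supplies the directory list, for example the -L directories followed by the defaults.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a real linker API.
 * Looks for libNAME.so, then libNAME.a (assumed order), in each directory
 * in turn. On success, writes the full path into out and returns 0.
 * Returns -1, with out set to the empty string, if nothing is found. */
int find_library(const char *name, const char *const dirs[], size_t ndirs,
                 char *out, size_t outlen)
{
    static const char *const suffixes[] = { ".so", ".a" };
    size_t d, s;

    for (d = 0; d < ndirs; d++) {
        for (s = 0; s < 2; s++) {
            FILE *fp;
            int n = snprintf(out, outlen, "%s/lib%s%s",
                             dirs[d], name, suffixes[s]);
            if (n < 0 || (size_t)n >= outlen)
                continue;               /* path would not fit; skip it */
            fp = fopen(out, "rb");      /* existence check using only ISO C */
            if (fp != NULL) {
                fclose(fp);
                return 0;
            }
        }
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

    Calling it with the name "X11" and the directories "/usr/openwin/lib" and "/usr/lib" mirrors what -L/usr/openwin/lib -lX11 asks of the linker; the first directory that holds a match wins.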


  4. Identify your libraries by looking at the header files you have used.

    Another key question that may have occurred to you is, "How do I know which libraries I have to link with?" The answer, as (roughly speaking) enunciated by Obi-Wan Kenobi in Star Wars, is, "Use the source, Luke!" If you look at the source of your program, you'll notice routines that you call, but which you didn't implement. For example, if your program does trigonometry, you've probably called routines with names like sin() or cos(), and these are found in the math library. The manpages show the exact argument types each routine expects, and should mention the library it's in.

    A good hint is to study the #includes that your program uses. Each header file that you include potentially represents a library against which you must link. This tip carries over into C++, too. A big problem of name inconsistency shows up here. Header files usually do not have a name that looks anything like the name of the corresponding library. Sorry! This is one of the things you "just have to know" to be a C wizard. Table 5-1 shows examples of some common ones.

    #include Filename             Library Pathname            Compiler Option to Use
    <math.h>                      /usr/lib/libm.so            -lm
    <math.h>                      /usr/lib/libm.a             -dn -lm
    <stdio.h>                     /usr/lib/libc.so            linked in automatically
    "/usr/openwin/include/X11.h"  /usr/openwin/lib/libX11.so  -L/usr/openwin/lib -lX11
    <thread.h>                    /usr/lib/libthread.so       -lthread
    <curses.h>                    /usr/lib/libcurses.a        -lcurses
    <sys/socket.h>                /usr/lib/libsocket.so       -lsocket

    Another inconsistency is that a single library may contain routines that satisfy the prototypes declared in multiple header files. For example, the functions declared in the header files string.h, stdio.h, and time.h are all usually supplied in the single library libc.so. If you're in doubt, use the nm utility to list the routines that a library contains. More about this in the next heuristic!

    Handy Heuristic: How to Match a Symbol with its Library

    Suppose you're trying to link a program and get this kind of error:

        ld: Undefined symbol
            _xdr_reference
        *** Error code 2
        make: Fatal error: Command failed for target 'prog'
    

    Here's how you can locate the libraries with which you need to link. The basic plan is to use nm to look through every library in /usr/lib, grepping for the symbols you're missing. The linker looks in /usr/ccs/lib and /usr/lib by default, and so should you. If this doesn't get results, extend your search to all other library directories (such as /usr/openwin/lib), too.

        $ /usr/bin/csh
        % cd /usr/lib
        % foreach i (lib?*)
        ? echo $i
        ? nm $i | grep xdr_reference | grep -v UNDEF
        ? end
        . . .
        libc.so
        libc.so.1
        libnsl.so
        [2491]  |    217028|     196|FUNC |GLOB |0    |8     |xdr_reference
        libposix4.so
        . . .
        % exit
        $ 
    

    This runs "nm" on each library in the directory, to list the symbols known in the library. Pipe it through grep to limit it to the symbol you are searching for, and filter out symbols marked as "UNDEF" (referenced, but not defined in this library). The result shows you that xdr_reference is in libnsl. You need to add -lnsl on the end of the compiler command line.



  5. Symbols from static libraries are extracted in a more restricted way than symbols from dynamic libraries

    Finally, there's an additional and big difference in link semantics between dynamic linking and static linking that often confuses the unwary. Archives (static libraries) are acted upon differently than are shared objects (dynamic libraries). With dynamic libraries, all the library symbols go into the virtual address space of the output file, and all the symbols are available to all the other files in the link. In contrast, static linking looks only through the archive for the undefined symbols presently known to the loader at the time the archive is processed.

    A simpler way of putting this is to say that the order of the statically linked libraries on the compiler command line is significant. The linker is fussy about where libraries are mentioned, and in what order, since symbols are resolved looking from left to right. This makes a difference if the same symbol is defined differently in two different libraries. If you're doing this deliberately, you probably know enough not to need to be reminded of the perils.

    Another problem occurs if you mention the static libraries before your own code. There won't be any undefined symbols yet, so nothing will be extracted. Then, when your object file is processed by the linker, all its library references will be unfulfilled! Although the convention has been the same since UNIX started, many people find it unexpected; very few commands demand their arguments in a particular order, and those that do usually complain about it directly if you get it wrong. All novices have trouble with this aspect of linking until the concept is explained. Then they just have trouble with the concept itself.
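
    The left-to-right rule can be modeled in a few lines of C. This is a toy sketch under stated assumptions, not real linker code: simulate_link and struct link_input are hypothetical names, each archive is treated as a single member (a real linker extracts individual members), and dynamic libraries are not modeled. An object file is always included; an archive is pulled in only if it defines a symbol that is undefined at the moment it is reached.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 32
#define MAX_PER_INPUT 4

/* Hypothetical description of one command-line input. */
struct link_input {
    const char *name;                    /* e.g. "main.o" or "libm.a" */
    int is_archive;                      /* nonzero for a static library */
    const char *defines[MAX_PER_INPUT];  /* NULL-terminated */
    const char *needs[MAX_PER_INPUT];    /* NULL-terminated */
};

struct symtab { const char *names[MAX_SYMS]; size_t count; };

static int symtab_has(const struct symtab *t, const char *sym)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], sym) == 0)
            return 1;
    return 0;
}

static void symtab_add(struct symtab *t, const char *sym)
{
    if (!symtab_has(t, sym) && t->count < MAX_SYMS)
        t->names[t->count++] = sym;
}

static void symtab_remove(struct symtab *t, const char *sym)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], sym) == 0) {
            t->names[i] = t->names[--t->count];
            return;
        }
}

/* One left-to-right pass; returns how many symbols remain undefined. */
size_t simulate_link(const struct link_input *inputs, size_t n)
{
    struct symtab defined = { { NULL }, 0 }, undefined = { { NULL }, 0 };
    size_t i, k;

    for (i = 0; i < n; i++) {
        const struct link_input *in = &inputs[i];

        if (in->is_archive) {
            /* Extract only if it satisfies a currently undefined symbol. */
            int wanted = 0;
            for (k = 0; k < MAX_PER_INPUT && in->defines[k]; k++)
                if (symtab_has(&undefined, in->defines[k]))
                    wanted = 1;
            if (!wanted)
                continue;                /* nothing extracted */
        }
        for (k = 0; k < MAX_PER_INPUT && in->defines[k]; k++) {
            symtab_add(&defined, in->defines[k]);
            symtab_remove(&undefined, in->defines[k]);
        }
        for (k = 0; k < MAX_PER_INPUT && in->needs[k]; k++)
            if (!symtab_has(&defined, in->needs[k]))
                symtab_add(&undefined, in->needs[k]);
    }
    return undefined.count;
}
```

    In this model, an archive that defines sin placed before an object file that needs sin leaves one symbol undefined, while the reverse order leaves none.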

    The problem most frequently shows up when someone links with the math library. The math library is heavily used in many benchmarks and applications, so we want to squeeze the last nanosecond of runtime performance out of it. As a result, libm has often been a statically linked archive. So, if you have a program that uses some math routines such as the sin() function, and you link statically like this:

        cc -lm main.c
    

    you will get an error message like this:

        Undefined                     first referenced
         symbol                           in file
         sin                              main.o
        ld: fatal: Symbol referencing errors.  No output written to a.out
    

    In order for the symbols to get extracted from the math library, you need to put the file containing the unresolved references first, like so:

        cc main.c -lm
    

    This causes no end of angst for the unwary. Everyone is used to the general command form of <command> <options> <files>, so to have the linker adopt the different convention of <command> <files> <options> is very confusing. It's exacerbated by the fact that it will silently accept the first version and do the wrong thing. At one point, Sun's compiler group amended the compiler drivers so that they coped with the situation. We changed the SunOS 4.x unbundled compiler drivers from SC0.0 through SC2.0.1 so they would "do the right thing" if a user omitted -lm. Although it was the right thing, it was different from what AT&T did, and broke our compliance with the System V Interface Definition; so the former behavior had to be reinstated. In any case, from SunOS 5.2 onwards, a dynamically linked version of the math library, /usr/lib/libm.so, is provided.

    Handy Heuristic: Where to Put Library Options

    Always put the -l library options at the rightmost end of your compilation command line.


    Similar problems have been seen on PCs, where the Borland compiler drivers tried to guess whether floating-point libraries needed to be linked in. Unfortunately, they sometimes guessed wrongly, leading to the error:

        scanf: floating point formats not linked
        Abnormal program termination
    

    They seem to guess wrongly when the program uses floating-point formats in scanf() or printf() but doesn't call any other floating-point routines. The workaround is to give the linker more of a clue, by declaring a function like this in a module that will be included in the link:

        static void forcefloat( float *p )
        { float f = *p; forcefloat( &f ); }
    

    You don't actually have to call the function, merely ensure that it is linked in. This provides a solid enough clue to the Borland PC linker that the floating-point library really is needed.

    NB: a similar message, saying "floating point not loaded" is printed by the Microsoft C runtime system when the software needs a numeric coprocessor but your computer doesn't have one installed. You fix it by relinking the program, using the floating-point emulation library.


The "-D" option was introduced with SunOS 5.3 (Solaris 2.3) to provide better link-editor debugging. The option (fully documented in the Linker and Libraries Manual) allows the user to display the link-editing process and input file inclusion. It's especially useful for monitoring the extraction of objects from archives. It can also be used to display runtime bindings.


Chapter 3 of this book, "Unscrambling Declarations", is also available online.

This same excerpt, "Five Special Secrets of Linking with Libraries", is also available online from O'Reilly & Associates.


Additional linking-related resources: