By Fabien Sanglard
May 3rd, 2023
The goal of the linker is to merge all relocatable sections together and create something the OS loader can load for execution. Since we are going to talk about it a lot on this page, let's clarify what relocation means, by quoting elf(5)
.
Relocation is the process of connecting symbolic references with symbolic definitions. Relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process's program image. Relocation entries are these data. - elf(5)
The linker starts by picking sections in the relocatable(s) generated by the compiler and merges them together. Along the way, it patches in missing symbols from static libraries and emits relocation information for symbols imported from dynamic libraries.
A static library (.a) is nothing more than a collection of relocatable .o files. It is built using the ar (for archiver) command.
$ clang -c x.c y.c
$ ar -rv foolib.a x.o y.o
In the good old days you needed to run ranlib on the archive to build an index which speeds up the linking process. Nowadays ar builds this index by default.
Again, this article is only a high-level overview. If you want to deepen your knowledge of linkers, an excellent book on the topic is Linkers and Loaders by John R. Levine.
On Linux the output format is an ELF file (the same as the input). However, using readelf we can see that whereas compiler outputs only feature sections, linker outputs also feature segments. Segments point to sections and group them together. These two views are called the Linking View (sections) and the Execution View (segments).
Let's compile hello.c
and peek inside a.out
.
// hello.c #include <stdio.h> int main() { printf("Hello, World!"); return 0; }
The -l flag of readelf requests the segments (a.k.a. "program headers") instead of the sections.
$ clang -v hello.c
clang -cc1 -o /tmp/hello-9c2163.o hello.c
/usr/bin/ld -o a.out /tmp/hello-9c2163.o /lib/crti.o -L/lib -lc -lgcc
$ file a.out
a.out: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), interpreter /lib/ld-linux-aarch64.so.1
$ readelf -l -W a.out
Elf file type is DYN (Position-Independent Executable file)
Entry point 0x8c0
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
  INTERP         0x000238 0x0000000000000238 0x0000000000000238 0x00001b 0x00001b R   0x1
      [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0008ac 0x0008ac R E 0x10000
  LOAD           0x000dc8 0x0000000000010dc8 0x0000000000010dc8 0x000270 0x000278 RW  0x10000
  DYNAMIC        0x000dd8 0x0000000000010dd8 0x0000000000010dd8 0x0001e0 0x0001e0 RW  0x8
  NOTE           0x000254 0x0000000000000254 0x0000000000000254 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x0007b0 0x00000000000007b0 0x00000000000007b0 0x00003c 0x00003c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x000dc8 0x0000000000010dc8 0x0000000000010dc8 0x000238 0x000238 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .gnu.hash .dynsym .dynstr .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame
   03     .init_array .fini_array .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.gnu.build-id .note.ABI-tag
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .dynamic .got
The program headers tell where each group of sections is located in the ELF file (Offset) and where it should be mapped in virtual memory (VirtAddr) by the loader.
As the verbose trace above shows, the clang driver invoked itself to compile the source file and then called /usr/bin/ld to link an executable.
There are many linkers available on Linux. The first one available on the platform was GNU's, commonly called ld
. Later came gold
which was built to improve speed. LLVM also released their own linker called lld
.
The path /usr/bin/ld
is not enough to tell which one it is. But we can dig a little bit.
$ ll /usr/bin/ld
lrwxrwxrwx 1 root root 20 Nov 2 13:58 /usr/bin/ld -> aarch64-linux-gnu-ld*
$ /usr/bin/ld --version
GNU ld (GNU Binutils for Ubuntu) 2.38
The linking stage is a bottleneck in the compilation pipeline. Contrary to the compiler which can be run in parallel on each translation unit and whose outputs can be cached between runs, the linker must wait until all object files are ready to start linking.
As a result, significant optimization efforts have targeted the linker.
The most important optimization is called "Incremental Linking". It consists of reusing work done during the previous linking operation. Few linkers can do it: GNU's ld, LLVM's lld, and Apple's ld64 cannot.
gold can do it, but only if you pass a special linker flag, which typical build systems don't. Microsoft's LINK.EXE can also do it when given the special flag /INCREMENTAL.
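The linker is told where to search for libraries and which libraries to link against via the flags -L and -l, passed by the driver.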
$ clang -v hello.c
clang -cc1 -o /tmp/hello-9a2af8.o hello.c
ld -o a.out \
  -L/usr/lib/gcc/aarch64-linux-gnu/11 \
  -L/lib/aarch64-linux-gnu \
  -L/usr/lib/aarch64-linux-gnu \
  -L/usr/lib/llvm-14/lib \
  -L/lib \
  -L/usr/lib \
  \
  -lgcc \
  -lgcc_s \
  -lc \
  \
  /usr/lib/gcc/aarch64-linux-gnu/11/crtendS.o
  /lib/aarch64-linux-gnu/crtn.o
  /tmp/hello-9a2af8.o
In the trace above, the linker is provided with six search folders (-L) and three dynamic libraries (-l), and must link together the objects passed as extra parameters.
Library names given with -l are prefixed with lib and suffixed with the dynamic library extension (.so on Linux) when looked up on the filesystem. Therefore you won't find a file at /lib/aarch64-linux-gnu/c but you will find /lib/aarch64-linux-gnu/libc.so. If you inspect libc.so, you will find out that it is not an ELF file. It is an ASCII text file.
$ file /lib/aarch64-linux-gnu/libc.so
/lib/aarch64-linux-gnu/libc.so: ASCII text
This text file is a linker script which points to /lib/aarch64-linux-gnu/libc.so.6
.
There are two types of library linking, named static and dynamic. As we saw earlier, a static library is nothing but a collection of object files packaged in a .a
archive. These objects are included in the final binary.
Linking against a dynamic library is different. The linker looks up the dynamic library's symbols but does not pull them into the final binary. Instead it emits a special section, .dynsym, which lists the names of the symbols to be found at runtime, along with a list of the dynamic libraries where they may be found, in section .dynamic. We can see the dynamic libraries an executable needs with either readelf or ldd.
$ clang -o hello hello.c
$ readelf -d hello| grep NEEDED
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
$ ldd hello
linux-vdso.so.1 (0x0000ffff85df9000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff85bd0000)
/lib/ld-linux-aarch64.so.1 (0x0000ffff85dc0000)
Notice that the output of ldd shows where the libraries are on the system. It also includes the interpreter path; we will get to this in the next chapter.
Using readelf, we can see how the imported symbols are suffixed with the name of the dynamic library. The matching library also features the same suffix in its exported symbols. If the dynamic library has a version, this is also where it is featured (e.g. GLIBC_2.17 here).
$ readelf -s hello

Symbol table '.dynsym' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000000005b8     0 SECTION LOCAL  DEFAULT   11 .init
     2: 0000000000011028     0 SECTION LOCAL  DEFAULT   23 .data
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _[...]@GLIBC_2.34 (2)
     4: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterT[...]
     5: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND _[...]@GLIBC_2.17 (3)
     6: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     7: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND abort@GLIBC_2.17 (3)
     8: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMC[...]
     9: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.17 (3)
$ readelf -s /lib/aarch64-linux-gnu/libc.so.6 | grep printf

Symbol table '.symtab' contains 90 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    60: 000000000006cfe0   168 FUNC    GLOBAL DEFAULT   12 swprintf@@GLIBC_2.17
   259: 000000000006d090    56 FUNC    GLOBAL DEFAULT   12 vwprintf@@GLIBC_2.17
   437: 0000000000072184    40 FUNC    WEAK   DEFAULT   12 vasprintf@@GLIBC_2.17
   578: 0000000000050cb0   168 FUNC    GLOBAL DEFAULT   12 dprintf@@GLIBC_2.17
   761: 0000000000050920   168 FUNC    GLOBAL DEFAULT   12 fprintf@@GLIBC_2.17
  1137: 0000000000050d60    40 FUNC    WEAK   DEFAULT   12 vfwprintf@@GLIBC_2.17
  1188: 0000000000050c00   168 FUNC    WEAK   DEFAULT   12 asprintf@@GLIBC_2.17
  1302: 0000000000072530    40 FUNC    WEAK   DEFAULT   12 vsnprintf@@GLIBC_2.17
  1401: 0000000000072350    40 FUNC    WEAK   DEFAULT   12 vdprintf@@GLIBC_2.17
  1561: 000000000004be40    40 FUNC    GLOBAL DEFAULT   12 vfprintf@@GLIBC_2.17
  1911: 0000000000050b40   180 FUNC    GLOBAL DEFAULT   12 sprintf@@GLIBC_2.17
  1930: 000000000006cf30   168 FUNC    WEAK   DEFAULT   12 fwprintf@@GLIBC_2.17
  2123: 0000000000050a90   168 FUNC    WEAK   DEFAULT   12 snprintf@@GLIBC_2.17
  2146: 000000000006d4c0    40 FUNC    WEAK   DEFAULT   12 vswprintf@@GLIBC_2.17
  2229: 000000000004be70    56 FUNC    GLOBAL DEFAULT   12 vprintf@@GLIBC_2.17
  2315: 000000000006d0d0   188 FUNC    GLOBAL DEFAULT   12 wprintf@@GLIBC_2.17
  2837: 000000000006ba70   204 FUNC    WEAK   DEFAULT   12 vsprintf@@GLIBC_2.17
  2841: 00000000000509d0   188 FUNC    GLOBAL DEFAULT   12 printf@@GLIBC_2.17
Notice the WEAK
binding of some symbols which we discussed earlier.
While we are on the topic of linker symbol resolution, you should *really* take a few minutes to read Eli Bendersky's explanation of linking order in static libraries. In fact, his whole website is a gem which partially inspired this series.
What happens if the function where the program starts, main, is mistakenly named maib?
// mainb.c
#include <stdio.h>
int maib() {
printf("Hello, World!");
return 0;
}
Let's try to compile it.
$ clang mainb.c
/usr/bin/ld: /lib/aarch64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x1c): undefined reference to `main'
/usr/bin/ld: (.text+0x20): undefined reference to `main'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The linking fails because a mysterious object Scrt1.o
features a function _start
which calls main
. That's because the execution of a program does not really begin at main
. There are many things to set up before a program can run; among other things, the stack must be initialized and the program arguments prepared.
In our example the piece of assembly in charge of initialization is called Scrt1.s. Only when everything is ready does the function _start call main. Scrt1.s can also sometimes be found named crt0. In both cases, the name is derived from C RunTime.
Likewise, a program execution does not end after main returns. It is easy to verify using the atexit function, whose registered handlers are executed by the C runtime after main returns.
// atexit.c
#include <stdio.h>
#include <stdlib.h>

void bye(void) {
    puts("Goodbye, cruel world....");
}

int main(void) {
    atexit(bye);
    puts("This is the last function call");
    return 0;
}
Let's see the output.
$ clang atexit.c
$ ./a.out
This is the last function call
Goodbye, cruel world....
If you feel like going even deeper on the topic of C runtime, make sure to read the Tutorial on Creating Teensy ELF Executables.
Let's say we have a project with three source files. One of them holds a "singleton" char variable named c.
// main.c
#include <stdio.h>

char getChar();
void setChar(char ch);

int main() {
    setChar('a');
    putchar(getChar());
    return 0;
}

// static.c
char c = 'b';

char getChar() {
    return c;
}

// dynamic.c
extern char c;

void setChar(char ch) {
    c = ch;
}
We build the project as an object, a static library, and a dynamic library.
$ clang -o static.o -c static.c
$ ar rcs libmyStatic.a static.o
$ clang -o libmyShared.so -shared -lmyStatic dynamic.c
$ clang -o main -lmyShared -lmyStatic main.c
The dependency graph looks as follows.
What is the program going to display when it runs? Will it be a
, b
, or 42
?
$ ./main
b
main calls setChar to set the value of c to 'a' and then prints the very variable it just set. The expected output is therefore 'a'. But when we run it, we see 'b' being printed.
This happened because the static library was linked twice. There are two copies of the variable c in the final program: one that is read by getChar() and another one that is written by setChar(). If you are designing a complex project, try to stick to static libraries as much as possible.
Some errors originate at the compiler level but surface at the linker level. This is the case for the beginners' dreaded "duplicate symbol" (a.k.a. LNK2005 in the Windows/Visual Studio world). Here is a mini-project to show the problem.
// counter.h
#pragma once
int counter = 0;
int incCounter();

// counter.c
#include "counter.h"

int incCounter() {
    counter++;
}

// main.c
#include <stdio.h>
#include "counter.h"

int main() {
    incCounter();
    printf("%d\n", counter);
}
This is a simple program with a main part and a counter part. Each file compiles, but the program fails to link.
$ clang counter.c main.c
1 warning generated.
duplicate symbol '_counter' in:
/var/folders/sp/tmp/T/counter-c84ff0.o
/var/folders/sp/tmp/T/cmain-3e41f8.o
ld: 1 duplicate symbol for architecture x86_64
Let's inspect what is going on. First at the translation unit level and then at the symbol level.
$ clang -E -o counter.tu counter.c
$ cat counter.tu
int counter = 0;
int incCounter();
int incCounter() {
    counter++;
}

$ clang -E -o main.tu main.c
$ cat main.tu
int counter = 0;
int incCounter();
int main() {
    incCounter();
    printf("%d\n", counter);
}
Let's look at the symbols now.
$ clang -c -o counter.o counter.c
$ nm counter.o
0000000000000000 B counter
0000000000000000 T incCounter

$ clang -c -o main.o main.c
$ nm main.o
0000000000000000 B counter
0000000000000000 T main
                 U printf
Due to the siloed nature of the translation unit, the compiler will happily produce object files, only for the linker to scream bloody murder when it finds duplicate symbols (like counter in our example) without a way to know which one to use.
Avoid these kinds of errors by never defining anything in a header. Headers should only contain declarations, and only expose the strict minimum. If you need to share a storage symbol, use extern
.
There is a certain level of trust when the linker combines object files. For example there is no verification that imported and exported symbol types match.
// trick.c
#include <stdio.h>
extern short i;
int main() {
    printf("i=%d\n", i);
    return 0;
}

// i.c
const char* i = "a string!";
The defined type and the declared type of i do not match, but the linker happily combines the object files.
$ clang trick.c i.c
$ ./a.out
i=2034
In the compiler page, the "Section Management" part mentioned how to create one section per symbol. This is usually combined with linker flags so that only what is needed ends up in the final product. The linker flags are passed through the compiler driver.
$ clang -v -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,--as-needed main.c
clang -cc1 -o /tmp/main-476f21.o -x c main.c
ld --gc-sections --as-needed /tmp/main-476f21.o
The executable size reduction will vary depending on the structure of the project and of its translation units.
$ clang -v -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,--as-needed main.c
$ ll a.out
-rwxrwxr-x 1 leaf leaf 8840 Apr 4 22:53 a.out*
$ clang main.c
$ ll a.out
-rwxrwxr-x 1 leaf leaf 9064 Apr 4 22:56 a.out*
The output of the linker is configured by a linker script. It is a powerful mechanism allowing among other things to tell where each section should go in the output file and where they should be mapped in memory by the loader.
Linkers such as ld have a default script (visible with the command ld --verbose) and users don't have to worry about it. Using custom scripts is mandatory for toolchains targeting machines with exotic memory mapping.
Let's take the example of ccps
, a toolchain to compile for Capcom CPS-1 (arcade machines of the early 90s). The (partial) memory mapping expected by the hardware is as follows.
Address | Purpose |
---|---|
0x000000-0x3FFFFF | ROM |
0x900000-0x92FFFF | GFXRAM |
0xFF0000-0xFFFFFF | RAM |
ccps
achieves this mapping with the following linker script.
/* cps1 Linker Script */
OUTPUT_FORMAT("binary")
OUTPUT_ARCH(m68k)
ENTRY(_start)

MEMORY
{
  rom (rx)     : ORIGIN = 0x000000, LENGTH = 0x200000
  gfx_ram (rw) : ORIGIN = 0x900000, LENGTH = 0x2FFFF
  ram (rw)     : ORIGIN = 0xFF0000, LENGTH = 0xFFFF
}
First, three memory regions are created, each with an origin and a length. Then sections are mapped to memory regions.
SECTIONS
{
  .text : {
    *(.text)
    *(.text.*)
    . = ALIGN(4);
  } > rom

  .rodata : {
    *(.rodata)
    *(.rodata.*)
    . = ALIGN(4);
  } > rom

  .gfx_data : { } > gfx_ram

  .bss : {
    __bss_start = .;
    *(.bss)
    *(.bss.*)
    _end = .;
    . = ALIGN(4);
  } > ram

  .data : {
    *(.data)
    *(.data.*)
    . = ALIGN(4);
  } > ram
}