By Fabien Sanglard
May 3rd, 2023
The compiler stage is the most complicated element in the pipeline. Because of the purpose of these articles, this is going to be the simplest part. If you want to learn how compilers work inside out, refer to the Dragon book.
The goal of the compiler is to open a translation unit, parse it, optimize it and output an object file (except in the case of LTO which is discussed later). These object files are also sometimes called relocatable.
All compilers are structured the same way, with a frontend, an optimizer, and a backend.
gcc generates assembly and converts it to machine code with binutils's as, while clang has the assembling fully built-in. The machine code output is packaged into an object file format container.
Other languages follow the same structure, for example rustc, which is an LLVM frontend in charge of generating LLVM IR.
The input format, the translation unit, was studied in the previous section about the preprocessor. Let's focus on what the compiler has to output. The format is given to us via the tool file after requesting the driver to output a relocatable file instead of an executable.
// mult.c
int mul(int x, int y);

int pow(int x) {
  return mul(x, x);
}
Note how using the -c flag simply made the driver call itself in compiler mode (-cc1) and skip the linker stage.
$ clang -v -c mult.c -o mult.o
clang -cc1 mult.c -o mult.o
$ file mult.o
mult.o: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped
The relocatable files are commonly called "object" files and use a .o extension. Let's use binutils's readelf to peek inside one.
$ readelf -S -W mult.o
There are 9 section headers, starting at offset 0x1d8:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .strtab STRTAB 0000000000000000 0001b1 000071 00 0 0 1
[ 2] .text PROGBITS 0000000000000000 000040 000028 00 AX 0 0 4
[ 3] .rela.text RELA 0000000000000000 000180 000018 18 I 9 2 8
[ 4] .comment PROGBITS 0000000000000000 000068 000026 01 MS 0 0 1
[ 5] .note.GNU-stack PROGBITS 0000000000000000 00008e 000000 00 0 0 1
[ 6] .eh_frame PROGBITS 0000000000000000 000090 000030 00 A 0 0 8
[ 7] .rela.eh_frame RELA 0000000000000000 000198 000018 18 I 9 6 8
[ 8] .llvm_addrsig LOOS+0xfff4c03 0000000000000000 0001b0 000001 00 E 9 0 1
[ 9] .symtab SYMTAB 0000000000000000 0000c0 0000c0 18 1 6 8
The output is organized in named sections. The most important one to know is .text, where the functions' instructions are stored. We can experiment with the source code to see the two other most common sections.
// manySymbols.c
int myInitializedVar = 1;
int myUnitializedVar;

int add(int x, int y);

int mult(int x) {
  return add(x, x);
}
Let's compile to a relocatable object and peek inside again.
$ clang -c -o manySymbols.o manySymbols.c
$ readelf -S -W manySymbols.o
There are 12 section headers, starting at offset 0x2c0:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .strtab STRTAB 0000000000000000 000219 0000a5 00 0 0 1
[ 2] .text PROGBITS 0000000000000000 000040 000028 00 AX 0 0 4
[ 3] .rela.text RELA 0000000000000000 0001e8 000018 18 I 11 2 8
[ 4] .data PROGBITS 0000000000000000 000068 000004 00 WA 0 0 4
[ 5] .bss NOBITS 0000000000000000 00006c 000004 00 WA 0 0 4
[ 6] .comment PROGBITS 0000000000000000 00006c 000026 01 MS 0 0 1
[ 7] .note.GNU-stack PROGBITS 0000000000000000 000092 000000 00 0 0 1
[ 8] .eh_frame PROGBITS 0000000000000000 000098 000030 00 A 0 0 8
[ 9] .rela.eh_frame RELA 0000000000000000 000200 000018 18 I 11 8 8
[10] .llvm_addrsig LOOS+0xfff4c03 0000000000000000 000218 000001 00 E 11 0 1
[11] .symtab SYMTAB 0000000000000000 0000c8 000120 18 1 8 8
The addition of an initialized variable made the compiler use a .data section. The addition of an uninitialized variable made the compiler use a .bss section.
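In the listing above, .bss has type NOBITS and the next section starts at the same file offset (0x6c): the section records a size but stores no bytes in the object file. The memory is zero-filled when the program is loaded, which matches the C guarantee that objects with static storage duration and no initializer start at zero. A minimal sketch reusing the variable names from manySymbols.c (the two accessor functions are hypothetical, added only to observe the values):

```c
int myInitializedVar = 1;   /* non-zero initial value: its bytes are stored in .data */
int myUnitializedVar;       /* no initializer: only its size is recorded in .bss */

/* Hypothetical helpers, not part of the original example. */
int read_initialized(void)   { return myInitializedVar; }
int read_uninitialized(void) { return myUnitializedVar; }
```

Calling read_uninitialized() returns 0 even though the source never assigns a value.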
A relocatable lists both exported symbols and imported symbols. These lists are in the .symtab section, which refers to strings in the .strtab section.
// importExport.c
extern const int myConstant;
extern void foo(int x);

int myVar1;
int myVar2;

void bar() {
  foo(myConstant);
}
Let's look at the exported and imported symbols with nm.
$ clang -c importExport.c -o importExport.o
$ nm importExport.o
0000000000000000 T bar
U foo
U myConstant
0000000000000000 B myVar1
0000000000000004 B myVar2
As expected we find three symbols exported: a function bar (with an offset in .text of 0x0) and two uninitialized variables in the bss section. Variable myVar1 is at offset 0x0 and myVar2 is four bytes further at offset 0x4.
We also see two undefined (a.k.a. imported) symbols, foo and myConstant, with the U type. These obviously don't have an offset. The complete list of nm letter codes and their meaning is as follows.
A A global, absolute symbol.
B A global "bss" (uninitialized data) symbol.
C A "common" symbol, representing uninitialized data.
D A global symbol naming initialized data.
N A debugger symbol.
R A read-only data symbol.
T A global text symbol.
U An undefined symbol.
V A weak object.
W A weak reference.
a A local absolute symbol.
b A local "bss" (uninitialized data) symbol.
d A local data symbol.
r A local read-only data symbol.
t A local text symbol.
v A weak object that is undefined.
w A weak symbol that is undefined.
? None of the above.
We can write a rainbow source file which hits as many types of symbols as possible when compiled to object.
extern int undVar;                   // Should be U
int defVar;                          // Should be B
extern const int undConst;           // Should be U
const int defConst = 1;              // Should be R
extern int undInitVar;               // Should be U
int defInitVar = 1;                  // Should be D
static int staticVar;                // Should be b
static int staticInitVar=1;          // Should be d
static const int staticConstVar=1;   // Should be r
static void staticFun(int x) {}      // Should be t
extern void foo(int x);              // Should be U

void bar(int x) {                    // Should be T
  foo(undVar);
  staticFun(undConst);
}
Since we are using an OS with two great compilers available, we can compile with both gcc and clang to see the differences.
$ clang -c rainbow.c -o rainbow.o && nm rainbow.o
0000000000000000 T bar
0000000000000000 R defConst
0000000000000000 D defInitVar
0000000000000000 B defVar
U foo
000000000000003c t staticFun
U undConst
U undVar
$ gcc -c rainbow.c -o rainbow.o && nm rainbow.o
0000000000000014 T bar
0000000000000000 R defConst
0000000000000000 D defInitVar
0000000000000000 B defVar
U foo
0000000000000004 r staticConstVar
0000000000000000 t staticFun
0000000000000004 d staticInitVar
0000000000000004 b staticVar
U undConst
U undVar
nm output differentiates between local and global symbols. A local symbol is only visible within a relocatable unit. In C, this is achieved with the static storage class specifier.
Global symbols are visible to all relocatable units. This is revisited in the linker article.
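The letter-code table above can be turned into a small classifier. This is only a sketch covering the codes listed in that table; the function names are hypothetical and real nm builds may print additional letters.

```c
#include <ctype.h>
#include <stdbool.h>

/* True when the nm letter code denotes an undefined symbol, i.e. one
   that another relocatable has to provide. */
bool nm_code_is_undefined(char code) {
    return code == 'U' || code == 'v' || code == 'w';
}

/* True when the nm letter code denotes a local symbol.
   Lowercase usually means local, but 'v' and 'w' are listed in the
   table as weak undefined symbols, so they are excluded. */
bool nm_code_is_local(char code) {
    if (code == 'v' || code == 'w') {
        return false;
    }
    return islower((unsigned char)code) != 0;
}
```

With the gcc output shown earlier, staticFun (t) and staticVar (b) are classified as local, while bar (T) and foo (U) are not.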
nm output also differentiates between "strong" symbols (the default) and weak symbols.
A weak symbol can be overridden by a strong symbol.
// weak.c
#include "stdio.h"
extern int getNumber();
int main() {
printf("%d\n", getNumber());
}
// number1.c
int getNumber() {
return 1;
}
// number2.c
int getNumber() {
return 2;
}
By default all symbols are strong. In this example, the linker fails because it does not know which getNumber to pick when it is used in weak.c.
$ clang -o weak weak.c number1.c number2.c
/usr/bin/ld: number2.o: in function `getNumber':
number2.c:(.text+0x0): multiple definition of `getNumber'; number1.o:number1.c:(.text+0x0): first defined here
clang: error: linker command failed with exit code 1 (use -v to see invocation)
If we declare one of the duplicate functions as weak, the program compiles and runs normally, regardless of the compilation and linking order.
// weak.c
#include "stdio.h"
extern int getNumber();
int main() {
  printf("%d\n", getNumber());
}
// number1.c
__attribute__((weak)) int getNumber() {
return 1;
}
// number2.c
int getNumber() {
return 2;
}
$ clang -o weak weak.c number1.c number2.c
$ ./weak
2
$ clang -o weak weak.c number2.c number1.c
$ ./weak
2
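A common use of this mechanism is to ship a weak default that a user may replace with a strong definition in another file. The sketch below assumes gcc or clang, since __attribute__((weak)) is a compiler extension; number_plus_one is a hypothetical caller added for illustration.

```c
/* Weak default: used only when no strong getNumber is linked in. */
__attribute__((weak)) int getNumber(void) {
    return 1;
}

/* Hypothetical caller. It calls through the symbol, so it picks up
   whichever definition the linker ends up selecting. */
int number_plus_one(void) {
    return getNumber() + 1;
}
```

Linked on its own, number_plus_one() uses the weak default. Adding number2.c from the example above to the link makes the same call return the strong value plus one, without touching this file.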
Most libc implementations declare their functions "weak" so users can intercept them. This is not always as convenient as it seems. Let's look at how to intercept malloc.
// mymalloc.c
#define _GNU_SOURCE // Could have been defined with -D on command-line
#include "stddef.h"
#include "dlfcn.h"
#include "stdio.h"
#include "stdlib.h"

void* malloc(size_t sz) {
  void *(*libc_malloc)(size_t) = dlsym(RTLD_NEXT, "malloc");
  printf("malloced %zu bytes\n", sz);
  return libc_malloc(sz);
}

int main() {
  char* x = malloc(100);
  return 0;
}
This program will recurse endlessly until it segfaults. This is because dlsym calls malloc.
$ clang mymalloc.c
$ ./a.out
Segmentation fault (core dumped)
For such cases, GNU's libc used to provide special hooks such as __malloc_hook, but they have been deprecated. Now the best way is to intercept the calls (MITM) via the loader and LD_PRELOAD.
// mtrace.c
#include <stdio.h>
#include <dlfcn.h>

static void* (*real_malloc)(size_t) = NULL;

void *malloc(size_t size) {
  if (!real_malloc) {
    // Look up the next malloc in the search order (the libc one)
    real_malloc = dlsym(RTLD_NEXT, "malloc");
  }
  void *p = real_malloc(size);
  printf("malloc(%zu) = %p\n", size, p);
  return p;
}
$ clang -shared -fPIC -D_GNU_SOURCE -o mtrace.so mtrace.c
$ LD_PRELOAD=./mtrace.so ls
malloc(472) = 0xaaab24e4b2a0
malloc(120) = 0xaaab24e4b480
malloc(1024) = 0xaaab24e4b500
malloc(5) = 0xaaab24e4b910
...
$
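Logging from inside an allocation hook is delicate because the logging code may itself allocate and re-enter the hook. A frequently used precaution is a guard flag that skips the logging while it is already in progress. The sketch below shows only that guard pattern on a hypothetical wrapper named traced_malloc; it does not replace malloc, and the plain static flag is not thread-safe (a real interposer would likely need a thread-local one).

```c
#include <stdio.h>
#include <stdlib.h>

static int in_trace = 0;               /* set while the logging code runs */
static unsigned long trace_count = 0;  /* number of requests logged */

/* Hypothetical wrapper: allocates with malloc and logs the request,
   unless we are already inside the logging code. */
void *traced_malloc(size_t size) {
    void *p = malloc(size);
    if (!in_trace) {
        in_trace = 1;
        trace_count++;
        fprintf(stderr, "malloc(%zu) = %p\n", size, p);
        in_trace = 0;
    }
    return p;
}

/* Hypothetical accessor for the number of logged requests. */
unsigned long traced_malloc_count(void) {
    return trace_count;
}
```

The allocation always happens; only the logging is skipped on re-entry, so callers never observe a failed allocation because of the tracing.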
There is one further usage of weak symbols. When using STL templates, each relocatable receives a copy of instructions and symbols when instantiation is involved. As a result, two translation units using vector<int> end up with the same symbols.
// c++foo.cc
#include <vector>

void foo() {
  auto v = std::vector<int>();
}

// c++bar.cc
#include <vector>

void bar() {
  auto v = std::vector<int>();
}
nm confirms the duplicates in both object files.
$ clang -c -o c++foo.o c++foo.cc
$ nm c++foo.o | grep -E 'vector|bar|foo'
0000000000000000 T foo()
0000000000000000 W std::vector<int, std::allocator<int> >::vector()
0000000000000000 W std::vector<int, std::allocator<int> >::~vector()
$ clang -c -o c++bar.o c++bar.cc
$ nm c++bar.o | grep -E 'vector|bar|foo'
0000000000000000 T bar()
0000000000000000 W std::vector<int, std::allocator<int> >::vector()
0000000000000000 W std::vector<int, std::allocator<int> >::~vector()
When the linker sees several definitions of a symbol, it favors the "strong" one. However, if only "weak" ones are available, it picks any of them without throwing an error. This behavior can be exposed in an example using a template and -D.
// weak_main.cc
const char* foo();
const char* bar();

#include "stdio.h"

int main() {
  printf("%s\n", foo());
  printf("%s\n", bar());
}
// c++foo.cc
#define NAME "foo"
#include "template.h"

const char* foo() {
  return Name<const char*>{}.get();
}

// c++bar.cc
#define NAME "bar"
#include "template.h"

const char* bar() {
  return Name<const char*>{}.get();
}
// template.h
template<typename T>
struct Name {
  T get() const {
    return T{NAME};
  }
};
At first sight, the program above should print "foo" and then "bar" to the console, but it doesn't. Because of the C++ One Definition Rule (ODR), all these symbols are marked as weak, so a single one is picked, depending on the order in which the linker sees them.
$ clang++ -o main weak_main.cc c++foo.cc c++bar.cc
$ ./main
foo
foo
$ clang++ -o main weak_main.cc c++bar.cc c++foo.cc
$ ./main
bar
bar
The original illustration of this process was found here.
The symbol list shows the names of imports and exports. That is enough for the linker to understand what an object provides and needs, but it is not enough to relocate the relocatables. The linker also needs the exact location of each use of a symbol in an object. These are stored in relocation tables, which readelf can show us.
$ readelf -r mult.o
Relocation section '.rela.text' at offset 0x1d8 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000010 000800000137 R_AARCH64_ADR_GOT 0000000000000000 myConstant + 0
000000000014 000800000138 R_AARCH64_LD64_GO 0000000000000000 myConstant + 0
00000000001c 000900000113 R_AARCH64_ADR_PRE 0000000000000000 myVariable + 0
000000000020 00090000011d R_AARCH64_LDST32_ 0000000000000000 myVariable + 0
000000000024 000a0000011b R_AARCH64_CALL26 0000000000000000 add + 0
Relocation section '.rela.eh_frame' at offset 0x250 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
00000000001c 000200000105 R_AARCH64_PREL32 0000000000000000 .text + 0
Every single usage of an imported variable/function is present in the relocation table. It provides everything the linker needs like the section to patch, the offset, the type of usage, and of course the symbol name.
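The Info column packs two fields. In ELF64, the upper 32 bits hold the index of the symbol in .symtab and the lower 32 bits hold the relocation type. A minimal sketch of the decoding, using hypothetical function names (the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h> perform the same computation):

```c
#include <stdint.h>

/* Upper 32 bits of r_info: index of the symbol in the symbol table. */
uint32_t reloc_symbol_index(uint64_t info) {
    return (uint32_t)(info >> 32);
}

/* Lower 32 bits of r_info: the relocation type (architecture specific). */
uint32_t reloc_type(uint64_t info) {
    return (uint32_t)(info & 0xffffffffu);
}
```

Applied to the last .rela.text entry above, 000a0000011b decodes to symbol index 10 and type 0x11b, which readelf displays as R_AARCH64_CALL26.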
So far we have used examples in the C language, which results in simple symbol names: each function or variable produces a symbol of the same name. Things get more complicated when a language allows function overloading.
To illustrate mangling, instead of letting the driver detect the language, we can declare it ourselves and see what happens with the symbols table.
// sample.c
void foo() {};
Let's first compile sample.c as a C file (with -x c) and then as a C++ file (with -x c++).
$ clang -c -x c sample.c -o sample.o
$ nm sample.o
0000000000000000 T foo
$ clang -c -x c++ sample.c -o sample.o
$ nm sample.o
0000000000000000 T _Z3foov
Thanks to mangling, C++ allows several functions to share the same name: they get different symbol names because the parameter types are encoded in the symbol. This special encoding avoids name collisions, but it can also lead to linking issues.
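The symbol _Z3foov seen above can be read as a prefix _Z, the length of the name (3), the name itself (foo), and v for an empty parameter list. A minimal sketch that rebuilds such names, assuming the Itanium C++ ABI scheme used by clang and gcc on Linux and covering only functions at global scope that take no parameters (mangle_void_function is a hypothetical helper):

```c
#include <stdio.h>
#include <string.h>

/* Builds the symbol name of a C++ function `void name()` declared at
   global scope: "_Z" + length of the name + the name + "v".
   Returns 0 on success, -1 if the output buffer is too small. */
int mangle_void_function(const char *name, char *out, size_t out_size) {
    int written = snprintf(out, out_size, "_Z%zu%sv", strlen(name), name);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;
    }
    return 0;
}
```

Anything beyond this simplest case (parameters, namespaces, templates) needs many more encoding rules, which is why tools such as c++filt exist to decode the names.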
// bar.h
void bar();

// main.cpp
#include "bar.h"

int main() {
  bar();
  return 0;
}

// bar.c
void bar() {};
$ clang main.cpp bar.c -o main
/usr/bin/ld: /tmp/m-7f361c.o: in function `main':
main.cc:(.text+0x18): undefined reference to `bar()'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
The project won't link properly because the symbols for the function bar do not match (main.cpp was mangled as C++ but bar.c was mangled as C).
$ nm main.o
0000000000000000 T main
U _Z3barv
$ nm bar.o
0000000000000000 T bar
There is a simple solution. Just use the mangled names C++ expects to name your functions and variables in your C code.
// bar.h
void bar();

// main.cpp
#include "bar.h"

int main() {
  bar();
  return 0;
}

// bar.c
void _Z3barv() {};
It works, problem solved!
$ clang main.cpp bar.c -o main
$
A more serious and realistic solution is to tell the compiler that it should generate import symbol names without mangling them. This is done with an extern "C" linkage specification.
// bar.h
extern "C" {
  void bar();
}

// main.cpp
#include "bar.h"

int main() {
  bar();
  return 0;
}

// bar.c
void bar() {};
Compilation works, the export/import symbol tables have no mismatch.
$ clang main.cpp bar.c -o main
$ nm main.o
0000000000000000 T main
U bar
$ nm bar.o
0000000000000000 T bar
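One caveat: extern "C" is C++ syntax, so a header written this way can only be included from C++ files. A widespread idiom is to guard the linkage specification with the __cplusplus macro so that the same header also compiles as C. A minimal sketch, shown as a single listing for brevity (bar_call_count is a hypothetical helper added so the effect can be observed):

```c
/* bar.h, written so that both C and C++ translation units can include it. */
#ifdef __cplusplus
extern "C" {                 /* only C++ compilers see the linkage specification */
#endif

void bar(void);
int bar_call_count(void);    /* hypothetical helper, for illustration only */

#ifdef __cplusplus
}
#endif

/* bar.c */
static int calls = 0;

void bar(void) { calls++; }
int bar_call_count(void) { return calls; }
```

Compiled as C, the guard disappears and the declarations are plain C. Compiled as C++, the declarations are wrapped in extern "C" and the imported symbols stay unmangled.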
We have seen earlier how variables, constants, and functions end up in three sections (text, data, and bss), but the compiler can operate at a finer granularity.
Instead of generating huge sections, the compiler can generate one section per symbol. This later allows the linker to pick only what is useful and reduce the size of the executable.
// sections.c
int a = 0;
int b = 0;

int funcA() { return a; }
int funcB() { return b; }
$ clang -c -o sections.o sections.c
$ readelf -S -W sections.o
There are 11 section headers:
Section Headers:
[Nr] Name Type
[ 0] NULL
[ 1] .strtab STRTAB
[ 2] .text PROGBITS
[ 3] .rela.text RELA
[ 4] .bss NOBITS
[ 5] .comment PROGBITS
[ 6] .note.GNU-stack PROGBITS
[ 7] .eh_frame PROGBITS
[ 8] .rela.eh_frame RELA
[ 9] .llvm_addrsig LOOS+0xfff4c03
[10] .symtab SYMTAB
$ clang -c -o sections.o sections.c \
    -ffunction-sections -fdata-sections
$ readelf -S -W sections.o
There are 15 section headers:
Section Headers:
[Nr] Name Type
[ 0] NULL
[ 1] .strtab STRTAB
[ 2] .text PROGBITS
[ 3] .text.funcA PROGBITS
[ 4] .rela.text.funcA RELA
[ 5] .text.funcB PROGBITS
[ 6] .rela.text.funcB RELA
[ 7] .bss.a NOBITS
[ 8] .bss.b NOBITS
[ 9] .comment PROGBITS
[10] .note.GNU-stack PROGBITS
[11] .eh_frame PROGBITS
[12] .rela.eh_frame RELA
[13] .llvm_addrsig LOOS+0xfff4c03
[14] .symtab SYMTAB
By far the most important flag to pass the compiler is the level of optimization to apply to the IR before generating the instructions. By default, no optimizations are performed. It shows, even with a program doing almost nothing.
// do_nothing.c
void do_nothing() {
}

int main() {
  for (int i = 0; i < 1000000000; i++)
    do_nothing();
  return 0;
}
Let's build and measure how long it takes to do nothing.
$ clang do_nothing.c
$ time ./a.out
real 0m2.374s
user 0m2.104s
sys 0m0.015s
This program should have completed near instantly but because of the function call overhead, it took two seconds. Let's try again but this time, allowing optimization to occur.
$ clang do_nothing.c -O3
$ time ./a.out
real 0m0.224s
user 0m0.011s
sys 0m0.014s
While some optimization focuses on runtime, others focus on code size. They are listed here.
Let's keep iterating with the previous program that does nothing. Compiler optimization -O3 is awesome but it has its limitations because it only operates at the translation unit level. Let's see what happens when the do_nothing function is in a different source file.
// opt_main.c
extern void do_nothing();

int main() {
  for (int i = 0; i < 1000000000; i++)
    do_nothing();
}

// do_nothing_tu.c
void do_nothing() {
}

$ clang -O3 opt_main.c do_nothing_tu.c
$ time ./a.out
real 0m2.056s
user 0m1.824s
sys 0m0.018s
Even with optimization enabled, we are back to the poor performance of an un-optimized executable. Due to the siloed nature of translation unit processing, the compiler could not decide whether calls to do_nothing should be pruned and generated a callsite anyway.
The solution to this problem would be to perform optimization not at the translation unit level but at the program level. Since only the linker has a vision of all components (and it can only see sections and symbols), this is seemingly not possible.
The trick to make it work is called "artisanal LTO". It consists of creating a super translation unit containing all the source code of the program. We can do that with the preprocessor.
// all.c
#include "do_nothing.c"
#include "opt_main.c"

// opt_main.c
extern void do_nothing();

int main() {
  for (int i = 0; i < 1000000000; i++)
    do_nothing();
}

// do_nothing.c
void do_nothing() {
}
Now able to see that do_nothing is a no-op, the compiler is able to optimize it away.
$ clang all.c -O3
$ time ./a.out
real 0m0.163s
user 0m0.012s
sys 0m0.014s
Of course, the bigger and more complex the program, the less practical this trick becomes, which led to LTO.
Thankfully, the "artisanal LTO" trick is no longer needed. Compilers can output extra information in the relocatables for the linker to use. Both GNU's GCC and LLVM implement Link-Time Optimizations via the -flto flag, but they do it differently.
GCC compiler implements LTO in a way that lets the linker fail gracefully if it does not support it. The program will still be linked but without link-time optimizations.
To this effect, GCC generates fat objects which contain not only everything an object file (.o) should have but also GCC's intermediate representation (GIMPLE bytecode).
// opt_main.c
extern void do_nothing();

int main() {
  for (int i = 0; i < 1000000000; i++)
    do_nothing();
}

// do_nothing.c
void do_nothing() {
}

$ gcc -c opt_main.c -o opt_main.o
$ file opt_main.o
opt_main.o: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped
$ gcc -c opt_main.c -o opt_main.o -flto
$ file opt_main.o
opt_main.o: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped

$ gcc -flto opt_main.c do_nothing.c
$ time ./a.out
real 0m2.112s
user 0m2.107s
sys 0m0.004s
$ gcc -O3 -flto opt_main.c do_nothing.c
$ time ./a.out
real 0m0.002s
user 0m0.000s
sys 0m0.002s
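LLVM takes a different approach. With -flto, clang outputs LLVM bitcode instead of machine code, even though the file still uses the .o extension. This is inconvenient because if the linker does not know how to handle bitcode, the compilation process will fail.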
// opt_main.c
extern void do_nothing();

int main() {
  for (int i = 0; i < 1000000000; i++)
    do_nothing();
}

// do_nothing.c
void do_nothing() {
}

$ clang -c opt_main.c -o opt_main.o
$ file opt_main.o
opt_main.o: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped
$ clang -c opt_main.c -o opt_main.o -flto
$ file opt_main.o
opt_main.o: LLVM IR bitcode

$ clang -flto opt_main.c do_nothing.c
$ time ./a.out
real 0m2.112s
user 0m2.107s
sys 0m0.004s
$ clang -O3 -flto opt_main.c do_nothing.c
$ time ./a.out
real 0m0.002s
user 0m0.000s
sys 0m0.002s
C++ keeps on evolving. There was C++98, then C++03, then C++11, then C++14, then C++17, then C++20, and now C++23. The default dialect of the compiler keeps on evolving too. Check your compiler documentation and use the flag -std (e.g. -std=c++11) to make sure you are using the proper one.
-std=c++11 // clang
-std=gnu++11 // gcc
Likewise, C has been revised over the years. The language was first standardized as C89 (ANSI) and C90 (ISO), then updated by C95, C99, C11, C17, and lately C23. Use the -std flag to indicate which version is used.
-std=c99
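To check which revision a given -std flag actually selects, the predefined macro __STDC_VERSION__ can be inspected. The sketch below is only illustrative: the two function names are hypothetical, and compilers in pre-release modes may report intermediate values that fall into the previous bucket.

```c
/* Returns the value of __STDC_VERSION__, or 0 when the compiler
   does not define it (C89/C90 mode). */
long c_standard_version(void) {
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Maps a __STDC_VERSION__ value to the name of the matching revision. */
const char *c_standard_name(long version) {
    if (version >= 202311L) return "C23";
    if (version >= 201710L) return "C17";
    if (version >= 201112L) return "C11";
    if (version >= 199901L) return "C99";
    if (version >= 199409L) return "C95";
    return "C89/C90";
}
```

Printing c_standard_name(c_standard_version()) from a small program built with different -std values shows how the flag changes what the compiler accepts.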
The implementations of the C Standard Library, commonly called libc, are fairly consistent. Switching between BSD's libc, GNU's glibc, or Android's bionic should not yield surprises.
The STL is a different story. There are several implementations around and their implementations are always a source of discrepancies.
At the very least, try to use the STL of your toolkit (GNU's libstdc++, LLVM's libc++, or Microsoft's STL). If you are working on portable code meant to run on multiple OSes, try to use the same STL everywhere. That means using LLVM's libc++.
The allocation functions behind new, new[], delete, and delete[] are not "built into" the C++ language. They are declared in the header <new> and implemented by the C++ runtime library. Furthermore, if you peek inside that implementation, you will see that operator new is implemented with malloc.
_LIBCXXABI_WEAK
void *
operator new(std::size_t size) _THROW_BAD_ALLOC
{
    if (size == 0)
        size = 1;
    void* p;
    while ((p = ::malloc(size)) == 0)
    {
        std::new_handler nh = std::get_new_handler();
        if (nh)
            nh();
        else
            break;
    }
    return p;
}
For debuggers to work, you need to compile with debugging information. This is done with -g. Upon seeing this flag, the compiler will generate DWARF sections (for ELF binaries, get it?).
// hello.cc
#include <iostream>

int main() {
  std::cout << "Hello World!";
  return 0;
}

$ clang++ hello.cc
$ ll a.out
-rwxrwxr-x 1 leaf leaf 9576 Mar 22 19:14 a.out*
$ clang++ -g hello.cc
$ ll a.out
-rwxrwxr-x 1 leaf leaf 21912 Mar 22 19:15 a.out*
The size difference is significant but it is much more telling to compare with a real-world, large application such as git.
$ sudo apt-get install dh-autoreconf libcurl4-gnutls-dev libexpat1-dev gettext libz-dev libssl-dev
$ sudo apt-get install install-info
$ git clone git@github.com:git/git.git
$ cd git
$ make configure
$ ./configure --prefix=/usr
$ make all
$ ls -l --block-size=M git
-rwxrwxr-x 137 leaf leaf 18M Apr 2 18:04 git
git's Makefile builds in debug mode by default. Let's remove the flag -g and build again to see the difference.
$ git reset --hard
$ git clean -f -x
$ sed -i 's/-g //g' Makefile
$ make clean
$ make all
$ ls -l --block-size=M git
-rwxrwxr-x 137 leaf leaf 4M Apr 2 18:04 git
A binary without debug information is more than 4x smaller (-14 MiB)!
If you want to see what make is doing, you can run it in verbose mode to see every command it executes via the parameter V=1.
Compiling with debug sections logically increases the size of the output. The opposite of this operation is to strip sections which are not useful. It can further reduce the size of an ELF file.
$ clang -c -g -o hello.o hello.c
$ ll hello.o
-rw-rw-r-- 1 leaf leaf 3256 Apr 3 01:21 hello.o
$ readelf -S -W hello.o
There are 22 section headers, starting at offset 0x738:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .strtab STRTAB 0000000000000000 000611 000121 00 0 0 1
[ 2] .text PROGBITS 0000000000000000 000040 000034 00 AX 0 0 4
[ 3] .rela.text RELA 0000000000000000 000478 000048 18 I 21 2 8
[ 4] .rodata.str1.1 PROGBITS 0000000000000000 000074 00000e 01 AMS 0 0 1
[ 5] .debug_abbrev PROGBITS 0000000000000000 000082 000038 00 0 0 1
[ 6] .debug_info PROGBITS 0000000000000000 0000ba 000037 00 0 0 1
[ 7] .rela.debug_info RELA 0000000000000000 0004c0 000060 18 I 21 6 8
[ 8] .debug_str_offsets PROGBITS 0000000000000000 0000f1 00001c 00 0 0 1
[ 9] .rela.debug_str_offsets RELA 0000000000000000 000520 000078 18 I 21 8 8
[10] .debug_str PROGBITS 0000000000000000 00010d 000050 01 MS 0 0 1
[11] .debug_addr PROGBITS 0000000000000000 00015d 000010 00 0 0 1
[12] .rela.debug_addr RELA 0000000000000000 000598 000018 18 I 21 11 8
[13] .comment PROGBITS 0000000000000000 00016d 000026 01 MS 0 0 1
[14] .note.GNU-stack PROGBITS 0000000000000000 000193 000000 00 0 0 1
[15] .eh_frame PROGBITS 0000000000000000 000198 000030 00 A 0 0 8
[16] .rela.eh_frame RELA 0000000000000000 0005b0 000018 18 I 21 15 8
[17] .debug_line PROGBITS 0000000000000000 0001c8 00005f 00 0 0 1
[18] .rela.debug_line RELA 0000000000000000 0005c8 000048 18 I 21 17 8
[19] .debug_line_str PROGBITS 0000000000000000 000227 000022 01 MS 0 0 1
[20] .llvm_addrsig LOOS+0xfff4c03 0000000000000000 000610 000001 00 E 21 0 1
[21] .symtab SYMTAB 0000000000000000 000250 000228 18 1 21 8
Now let's strip the object.
$ strip hello.o
$ ll hello.o
-rw-rw-r-- 1 leaf leaf 816 Apr 3 01:22 hello.o
$ readelf -S -W hello.o
There are 8 section headers, starting at offset 0x130:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000000000 000040 000034 00 AX 0 0 4
[ 2] .rodata.str1.1 PROGBITS 0000000000000000 000074 00000e 01 AMS 0 0 1
[ 3] .comment PROGBITS 0000000000000000 000082 000026 01 MS 0 0 1
[ 4] .note.GNU-stack PROGBITS 0000000000000000 0000a8 000000 00 0 0 1
[ 5] .eh_frame PROGBITS 0000000000000000 0000a8 000030 00 A 0 0 8
[ 6] .llvm_addrsig LOOS+0xfff4c03 0000000000000000 0000d8 000001 00 E 0 0 1
[ 7] .shstrtab STRTAB 0000000000000000 0000d9 000051 00 0 0 1
Note that in this example, we used strip on an object file but since it works on ELF, it can be (and usually is) used on the linker output.