A peek at ELF
Having learnt that an object file has a .text section for code, a .data section for initialized data, a .bss section for uninitialized data, and so on, it is still unclear how these are laid out in a real ELF file. How does an ELF file represent this information? With objdump, we can peek into the contents of these sections.
objdump -s obj/boot/boot.out
The result looks like
obj/boot/boot.out: file format elf32-i386
Contents of section .text:
7c00 fafc31c0 8ed88ec0 8ed0e464 a80275fa ..1........d..u.
7c10 b0d1e664 e464a802 75fab0df e6600f01 ...d.d..u....`..
...
Contents of section .eh_frame:
7d80 14000000 00000000 017a5200 017c0801 .........zR..|..
7d90 1b0c0404 88010000 1c000000 1c000000 ................
...
Looking at this output directly can mislead a beginner: it seems as though the section contents are interleaved with the headers, but they are not. Below I map the raw bytes of the file to this interpretation. In fact, the headers sit at the beginning and the end of the file, and all the code and data of the various sections or segments are bunched together in the middle.
Tools
- readelf (greadelf on Mac, installed by brew install binutils)
- objdump (gobjdump on Mac)
- wxHexEditor
File layout
ELF header |
program header |
... |
program header |
code & data |
section header |
... |
section header |
Example
Take the object file boot.out as an example. You can use wxHexEditor to look at its raw hex. readelf does a little interpretation of these hex bytes.
readelf --all boot.out
The result looks like
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
...
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
...
ELF header
The first part of the file is the ELF header. Its most notable feature is the magic number 7f 45 4c 46, where 45 4c 46 reads as ELF in ASCII. These bytes are the start of the e_ident field defined by struct ElfN_Ehdr. For more detail, please refer to Wikipedia.
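As a minimal sketch of how those first bytes can be checked by hand, the snippet below decodes the 16 magic bytes that readelf printed for boot.out. The function name describe_ident is made up for this illustration; the meaning of bytes 4, 5 and 6 (class, data encoding, version) follows the e_ident layout described in the elf(5) man page.

```python
ELF_MAGIC = b"\x7fELF"


def describe_ident(e_ident: bytes) -> dict:
    """Decode the leading bytes of e_ident (hypothetical helper)."""
    if len(e_ident) < 16:
        raise ValueError("e_ident is 16 bytes long")
    if e_ident[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return {
        # EI_CLASS: 1 means 32-bit objects, 2 means 64-bit objects
        "class": {1: "ELF32", 2: "ELF64"}.get(e_ident[4], "unknown"),
        # EI_DATA: 1 means little-endian, 2 means big-endian
        "data": {1: "little-endian", 2: "big-endian"}.get(e_ident[5], "unknown"),
        # EI_VERSION
        "version": e_ident[6],
    }


# The "Magic" line from the readelf output above.
ident = bytes.fromhex("7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00")

print(ident[1:4].decode("ascii"))  # ELF
print(describe_ident(ident))
```

For a real file, the same bytes would come from reading the first 16 bytes of boot.out in binary mode.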
Program header and section header
The data and code of a program can be divided into pieces. In the linker's view, they are divided into sections, and the metadata of these sections is stored in the section header table (from the layout above we can see that it is located at the end of the file). In the kernel's (executor's) view, data and code are divided into segments, and their metadata is stored in the program header table.
Section header string table
Well, its name is a bit long. It is actually a section of the file, and it is one of the sources from which objdump learns the name of each section. First let's find it in the raw hex file.
readelf --all boot.out
and look at this
ELF Header:
...
Start of section headers: 4760 (bytes into file)
...
Size of section headers: 40 (bytes)
...
Section header string table index: 6
...
It says the section headers start at byte 4760, and using this, readelf can further interpret the section headers for us (continuing the previous output):
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
...
[ 6] .shstrtab STRTAB 00000000 001254 000043 00 0 0 1
...
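The three header values quoted above live at fixed positions in the 52-byte ELF32 header, so they can also be read without readelf. The sketch below assumes a little-endian Elf32_Ehdr as laid out in the elf(5) man page. Because boot.out is not available here, it packs a stand-in header: only 4760, 40 and 6 come from the readelf output, while every other value (type, machine, entry point, program header fields, section count) is a placeholder chosen for the example. The helper name table_locations is likewise made up.

```python
import struct

# Elf32_Ehdr, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx.
EHDR32 = struct.Struct("<16sHHIIIIIHHHHHH")


def table_locations(header: bytes) -> dict:
    """Return where the two header tables sit (hypothetical helper)."""
    f = EHDR32.unpack_from(header)
    return {
        "program_headers": {"offset": f[5], "entry_size": f[9], "count": f[10]},
        "section_headers": {"offset": f[6], "entry_size": f[11], "count": f[12]},
        "shstrndx": f[13],
    }


# Stand-in header. Only e_shoff=4760, e_shentsize=40 and e_shstrndx=6 are
# taken from the readelf output; all other values are placeholders.
ident = bytes.fromhex("7f454c46010101") + bytes(9)
sample = EHDR32.pack(ident, 2, 3, 1, 0x7C00, 52, 4760, 0, 52, 32, 1, 40, 9, 6)

loc = table_locations(sample)
print(loc["section_headers"]["offset"])      # 4760
print(loc["section_headers"]["entry_size"])  # 40
print(loc["shstrndx"])                       # 6
```

With the real file, sample would be the first 52 bytes of boot.out.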
We can actually find the .shstrtab entry ourselves. From the ELF header we know that the section headers start at 4760, that each one is 40 bytes long, and that the index of .shstrtab is 6, so the header entry for .shstrtab starts at $4760 + 6 \times 40 = 5000$.
That is indeed where it sits. The bytes $[5000, 5040)$ of boot.out map to a struct Elf32_Shdr (ref).
typedef struct {
    uint32_t   sh_name;
    uint32_t   sh_type;
    uint32_t   sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    uint32_t   sh_size;
    uint32_t   sh_link;
    uint32_t   sh_info;
    uint32_t   sh_addralign;
    uint32_t   sh_entsize;
} Elf32_Shdr;
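To make the arithmetic and the mapping concrete, here is a small sketch that computes the entry offset and unpacks one 40-byte entry. It assumes the 32-bit layout (Elf32_Shdr), in which all ten fields are 4 bytes wide, which is consistent with the 40-byte entry size reported by readelf. The sample entry is rebuilt from the .shstrtab row of the readelf output rather than read from boot.out, and the helper names are made up for the example.

```python
import struct

# Elf32_Shdr: ten 4-byte little-endian fields, 40 bytes in total.
SHDR32 = struct.Struct("<10I")
FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
          "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")


def entry_offset(shoff: int, shentsize: int, index: int) -> int:
    """File offset of the index-th section header entry."""
    return shoff + index * shentsize


def parse_shdr(raw: bytes) -> dict:
    """Map one 40-byte entry onto the Elf32_Shdr field names."""
    return dict(zip(FIELDS, SHDR32.unpack(raw)))


start = entry_offset(4760, 40, 6)
print(start, start + SHDR32.size)  # 5000 5040

# Entry rebuilt from the readelf row for .shstrtab:
# name index 0x11, type STRTAB (3), offset 0x1254, size 0x43, alignment 1.
raw = SHDR32.pack(0x11, 3, 0, 0, 0x1254, 0x43, 0, 0, 1, 0)
entry = parse_shdr(raw)
print(hex(entry["sh_name"]), hex(entry["sh_offset"]), hex(entry["sh_size"]))
```

With the real file, raw would be the bytes of boot.out from offset 5000 up to, but not including, 5040.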
The first field, at offset 5000, holds 0x11, and it must somehow be interpreted as the section's name, i.e. .shstrtab. How? The official explanation is:
sh_name This member specifies the name of the section. Its value is an index into the section header string table section, giving the location of a null-terminated string.
So there is a section, .shstrtab, that records the names of all sections. Where is it? The clue is given by this very table entry. Let's look at the version interpreted by readelf.
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
...
[ 6] .shstrtab STRTAB 00000000 001254 000043 00 0 0 1
...
The offset is 0x1254 and the size is 0x43, so we look at [0x1254, 0x1297).
From the right-hand side of the hex view, we can see that we have found the right place. Then what does 0x11 mean? This index is a bit confusing: it is simply a byte offset into that character sequence. $0x1254 + 0x11 = 0x1265$, so let's look at the string that starts at 0x1265.
That's correct: it is .shstrtab, which is what we were looking for.
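The lookup itself is short enough to sketch: start at the sh_name offset inside the string table and read up to the next null byte. The table bytes below are a stand-in, not the actual contents of boot.out. They are laid out so that .shstrtab begins at offset 0x11, in order to mirror the example. The function name name_at is made up for the illustration.

```python
def name_at(strtab: bytes, index: int) -> str:
    """Read the null-terminated string that starts at `index`."""
    end = strtab.index(b"\x00", index)
    return strtab[index:end].decode("ascii")


# Stand-in string table (an assumption, not the real bytes of boot.out).
# A string table starts with a null byte, followed by null-terminated names.
strtab = b"\x00.symtab\x00.strtab\x00.shstrtab\x00.text\x00"

print(name_at(strtab, 0x11))  # .shstrtab

# File offset of that string, given the section's offset 0x1254.
print(hex(0x1254 + 0x11))  # 0x1265
```

With the real file, strtab would be the bytes [0x1254, 0x1297) of boot.out, and the same function could resolve the sh_name of every other section header.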
Conclusion
This post looked into some aspects of the ELF file format. An example illustrated how objdump -s can learn the locations and names of the sections from the file itself.
References
- https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
- http://man7.org/linux/man-pages/man5/elf.5.html
- http://www.opensecuritytraining.info/LifeOfBinaries.html