Some Reminders from the Previous Post:
The first 64 bytes of the file say it all :)
They tell us where the section table is, where the program header table is, how many entries each one has, and how many bytes each entry takes.
I use 64 here because this is a 64-bit ELF file; on a 32-bit ELF the header we read is 52 bytes.
That 52 looks a bit weird, right? This is why extern struct is used: C ABI compliance. Now you know the reason. We are telling the compiler: keep the C layout, do not reorder anything and do not add any surprise padding. Padding is really dangerous: if the compiler inserts even
one byte between fields, every offset calculation is wrong and you are
reading garbage. And now you know why I have been crying for 2 weeks (alignment is one more reason).
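The 64 versus 52 difference can be checked with a few lines of Python. This is only a sketch: the two format strings are my own transcription of the header fields, and `<` asks the `struct` module for little-endian data with no implicit padding, which is the layout the header is expected to have on disk.

```python
import struct

# "<" = little-endian, standard sizes, NO padding inserted between fields.
# Field order: e_ident, e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx. (Format strings are an assumption made for this sketch.)
ELF64_EHDR = "<16sHHIQQQIHHHHHH"  # e_entry / e_phoff / e_shoff are 8 bytes
ELF32_EHDR = "<16sHHIIIIIHHHHHH"  # the same three fields are 4 bytes

size64 = struct.calcsize(ELF64_EHDR)
size32 = struct.calcsize(ELF32_EHDR)
print(size64, size32)  # 64 52
```

The only fields that change width are e_entry, e_phoff and e_shoff (3 x 4 bytes), which accounts for the 12-byte difference.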
How We MUST Read
The first 16 bytes belong to e_ident, so I will refer to them as e_ident here.
As I said above, the first part of the ELF is e_ident[0..4]; here we saw \x7FELF, so okay, it is an ELF file.
e_ident[4] is the class byte (EI_CLASS): 1 = ELF32, 2 = ELF64.
e_ident[5] is the endianness (EI_DATA): 1 is little endian and 2 is big endian.
e_ident[6] is EI_VERSION and is always 1. I do not know why; the spec only defines version 1 here.
e_ident[7] is EI_OSABI; this is generally 0 = System V (most common).
The remaining bytes are e_ident[8] (EI_ABIVERSION) and padding.
Let's read it with: Format-Hex .\limon -Offset 0x00 -Count 16
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000000 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 ▪ELF············
As you see, we read limon here. The first 4 bytes are 7F 45 4C 46, which in ASCII is DEL E L F.
Byte 4 is 02, so 64-bit; byte 5 is 01, little endian; byte 6 is 01 (as always); and byte 7 is 00, System V.
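The same checks can be sketched in Python, using the 16 bytes from the dump above. The function name `parse_ident` and the lookup tables are made up for this example; only the byte positions and values come from the spec.

```python
# The 16 e_ident bytes exactly as shown in the hex dump of limon.
ident = bytes.fromhex("7F454C46020101000000000000000000")

EI_CLASS = {1: "ELF32", 2: "ELF64"}
EI_DATA = {1: "LittleEndian", 2: "BigEndian"}

def parse_ident(ident: bytes) -> dict:
    """Decode the first 8 bytes of e_ident (hypothetical helper)."""
    if len(ident) < 16 or ident[0:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {
        "class": EI_CLASS.get(ident[4], "invalid"),
        "data": EI_DATA.get(ident[5], "invalid"),
        "version": ident[6],   # EI_VERSION
        "osabi": ident[7],     # EI_OSABI, 0 = System V
    }

info = parse_ident(ident)
print(info)
```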
Quick note: on PE files (Windows) this check takes 2 steps. What do I mean? The first bytes are MZ; when you see MZ, go to 0x3C. Why? Because the offset of the PE\0\0 signature is stored there.
MZ is kept for backward compatibility, so normally we also need to check PE\0\0. ELF files start with ASCII DEL, and it is almost impossible for another file type to start with this byte followed by ELF. So for ELF one check
is okay, but MZ can also be at the start of other file types, sooo 2 steps are needed for PE files. The explanation is a bit weird, but it will be clearer in the PE files part, don't worry :)
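A minimal sketch of the two-step PE check, assuming the 4-byte little-endian offset stored at 0x3C. The function name `looks_like_pe` is hypothetical, and the buffer is synthetic, built only for this demo; it is not a real executable.

```python
import struct

def looks_like_pe(data: bytes) -> bool:
    # Step 1: the file must start with the MZ magic.
    if len(data) < 0x40 or data[0:2] != b"MZ":
        return False
    # Step 2: the u32 at 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"

# Synthetic 0x84-byte buffer: MZ at 0, signature offset 0x80 stored at 0x3C.
fake = bytearray(0x84)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\x00\x00"

print(looks_like_pe(bytes(fake)))          # True
print(looks_like_pe(b"MZ" + bytes(0x82)))  # False: MZ alone is not enough
```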
I already have the first 16 bytes.
Let's continue with the other fields of Elf32Header or Elf64Header.
Let's run: Format-Hex .\limon -Offset 16 -Count 16
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000010 02 00 3E 00 01 00 00 00 00 DC 01 01 00 00 00 00 ▪ > ▪ Ü▪▪············
At address 0x10 we have 02: it is e_type, ET_EXEC.
At address 0x12 we have 3E: it is e_machine; 3E means x86-64 (AMD64).
At address 0x14 we have 01: it is e_version; 01 means ELF version 1.
At address 0x18 we have 00 DC 01 01 00 00 00 00: it is e_entry. Read as little endian it is 0x0101DC00, the address where execution of the application starts.
But be careful: it is not 0x00DC0101. We read little endian, so the last byte is the most significant one and the value is 0x0101DC00. And how many bytes do we need to read here? On 32-bit this field is u32, on 64-bit it is u64.
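To see the byte-order point concretely, here is a small Python sketch that decodes the 8 bytes at 0x18 both ways. The variable names are only for illustration.

```python
# The 8 bytes at offsets 0x18..0x1F in the dump (e_entry on ELF64).
raw = bytes.fromhex("00DC010100000000")

entry_le = int.from_bytes(raw, "little")  # correct for this file (EI_DATA = 1)
entry_be = int.from_bytes(raw, "big")     # what a wrong byte order would give

print(hex(entry_le))  # 0x101dc00
print(hex(entry_be))  # 0xdc010100000000

# On ELF32 the same field is only 4 bytes wide (u32).
entry32 = int.from_bytes(raw[:4], "little")
```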
I will describe the other fields the same way. For the section and program tables I will not go byte by byte, but the logic is the same: read the bytes, check in the spec what they mean, that is all.
Okay, let's continue with the rest of the fields:
Let's run: Format-Hex .\limon -Offset 32 -Count 32 and this gets all the remaining part of the header.
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000020 40 00 00 00 00 00 00 00 58 7A 3E 00 00 00 00 00 @···········
0000000000000030 00 00 00 00 40 00 38 00 09 00 40 00 16 00 14 00 ··@········
Offset  Hex                      Field        Value
────────────────────────────────────────────────────────────────────
0x20    40 00 00 00 00 00 00 00  e_phoff      0x40: program header table @ byte 64
0x28    58 7A 3E 00 00 00 00 00  e_shoff      0x3E7A58: section header table offset
0x30    00 00 00 00              e_flags      0 (no arch flags)
0x34    40 00                    e_ehsize     64 bytes: ELF header size
0x36    38 00                    e_phentsize  56 bytes: per program header entry
0x38    09 00                    e_phnum      9 segments
0x3A    40 00                    e_shentsize  64 bytes: per section header entry
0x3C    16 00                    e_shnum      22 sections
0x3E    14 00                    e_shstrndx   20: string table index
And here is some explanation:
e_phoff / e_shoff: there are two different tables: the Program Header Table (for runtime, the kernel reads this) and the Section Header Table (for linking/debugging). These two fields give their byte offsets within the file. The PHT starts immediately after the ELF header (0x40 = 64, the exact end of the header), while the SHT is much farther away (0x3E7A58 ≈ 4 MB).
e_flags: There are no x86-64 specific flags, so it's 0. Architectures like ARM can have things like ABI version, hard/soft float, etc. here.
e_ehsize: How many bytes is the ELF header itself? 64. This is a constant, always 64 in 64-bit ELF.
e_phentsize / e_phnum: Each entry in the PHT is 56 bytes, and there are 9 in total. So the PHT = 9 × 56 = 504 bytes. The kernel reads these to determine which segment to load where and which are executable/readable/writable.
Note 1:
Here the value of e_phentsize is 0x38 and the offset of e_phnum is also 0x38. e_phentsize is not an offset; the two numbers match only by chance!!!
Note 2:
The 56 in e_phentsize / e_phnum does not come from the offset table: it is the value of e_phentsize, so we multiply by 56. This value differs between 32-bit and 64-bit, which is why getting the class value from the file is important!!!
e_shentsize / e_shnum: Each entry in the SHT is 64 bytes, with a total of 22 sections. Sections like .text, .data, .bss, .symtab, .rodata are listed here. The kernel doesn't look at these; they are used by tools like ld / objdump / gdb.
e_shstrndx: The names of the 22 sections (strings like .text, .data) are stored in a separate section. This field answers the question "Which section index holds the string table?": the section at index 20 is .shstrtab.
So when I want to learn the name of a section, I go to index 20. Why 20? Because 0x14 is 20 in decimal.
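Putting the three dumps together, the whole 64-byte header can be decoded in one call. This is a sketch under the assumption that the file is ELF64 and little endian (as e_ident says); the namedtuple and variable names are mine, and the bytes are copied from the dumps above.

```python
import struct
from collections import namedtuple

Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)

# Offsets 0x00..0x3F of limon, one dump row per line.
header_bytes = bytes.fromhex(
    "7F454C46020101000000000000000000"
    "02003E000100000000DC010100000000"
    "4000000000000000587A3E0000000000"
    "00000000400038000900400016001400"
)

hdr = Elf64Header._make(struct.unpack("<16sHHIQQQIHHHHHH", header_bytes))

pht_size = hdr.e_phentsize * hdr.e_phnum  # bytes occupied by the program header table
sht_size = hdr.e_shentsize * hdr.e_shnum  # bytes occupied by the section header table
# File offset of the section header that describes .shstrtab.
shstr_hdr_off = hdr.e_shoff + hdr.e_shstrndx * hdr.e_shentsize

print(hex(hdr.e_entry), hex(hdr.e_phoff), hex(hdr.e_shoff))
print(pht_size, sht_size)  # 504 1408
print(hex(shstr_hdr_off))  # 0x3e7f58
```

Note that shstr_hdr_off is where the section header entry for index 20 lives, not the string data itself; the entry then says where the name strings are.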
Last note for this part :)
These offsets can be a bit difficult, so I wrote a small application that shows the header sizes and prints the field offsets, so you can check the offset values directly.
The Windows app is here and the Linux one is here; when you run the app you will see all sizes and offsets for these headers...
The output looks like this:
elf.Elf32Header
SIZE : 52 bytes
ALIGN : 4
-----------------------------
e_ident offset=0x00 size=16
e_type offset=0x10 size=2
e_machine offset=0x12 size=2
e_version offset=0x14 size=4
e_entry offset=0x18 size=4
e_phoff offset=0x1C size=4
e_shoff offset=0x20 size=4
e_flags offset=0x24 size=4
e_ehsize offset=0x28 size=2
e_phentsize offset=0x2A size=2
e_phnum offset=0x2C size=2
e_shentsize offset=0x2E size=2
e_shnum offset=0x30 size=2
e_shstrndx offset=0x32 size=2
elf.Elf64Header
SIZE : 64 bytes
ALIGN : 8
-----------------------------
e_ident offset=0x00 size=16
e_type offset=0x10 size=2
e_machine offset=0x12 size=2
e_version offset=0x14 size=4
e_entry offset=0x18 size=8
e_phoff offset=0x20 size=8
e_shoff offset=0x28 size=8
e_flags offset=0x30 size=4
e_ehsize offset=0x34 size=2
e_phentsize offset=0x36 size=2
e_phnum offset=0x38 size=2
e_shentsize offset=0x3A size=2
e_shnum offset=0x3C size=2
e_shstrndx offset=0x3E size=2
. . .
Sooo this way you know from which offset to read, how many bytes, and what you will get!
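For readers who do not want to run the app, the same kind of table can be approximated in Python. This is not the app itself, only a sketch: it sums field sizes in order and assumes no padding between fields, which holds for these two headers. `FIELDS`, `CODES32`, `CODES64` and `layout` are names invented for the example.

```python
import struct

FIELDS = ["e_ident", "e_type", "e_machine", "e_version", "e_entry",
          "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
          "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx"]

# struct format codes per field: H = u16, I = u32, Q = u64.
CODES32 = ["16s", "H", "H", "I", "I", "I", "I", "I", "H", "H", "H", "H", "H", "H"]
CODES64 = ["16s", "H", "H", "I", "Q", "Q", "Q", "I", "H", "H", "H", "H", "H", "H"]

def layout(codes):
    """Return ([(name, offset, size), ...], total_size) for a packed layout."""
    rows, offset = [], 0
    for name, code in zip(FIELDS, codes):
        size = struct.calcsize("<" + code)
        rows.append((name, offset, size))
        offset += size
    return rows, offset

for label, codes in (("Elf32Header", CODES32), ("Elf64Header", CODES64)):
    rows, total = layout(codes)
    print(f"{label}  SIZE : {total} bytes")
    for name, off, size in rows:
        print(f"  {name:<12} offset=0x{off:02X} size={size}")
```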
And I am continuing to write Biber; on my side its output looks like this:
CLASS     ELF64
DATA      LittleEndian
TYPE      ET_EXEC (executable)
MACHINE   x86_64
Entry     0x0101DC00
P_OFFSET  0x40 (9 entries x 56 bytes = 504 bytes)
S_OFFSET  0x3E7A58 (22 entries x 64 bytes = 1408 bytes)
FLAGS     0x0
Program Table
By now you already know what is what, so I will explain a little faster...
OKAY, let's start:
Format-Hex .\limon -Offset 0x40 -Count 504
Offset Bytes Ascii
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
------ ----------------------------------------------- -----
0000000000000040 06 00 00 00 04 00 00 00 40 00 00 00 00 00 00 00 ▸seg0 PT_PHDR R
0000000000000050 40 00 00 01 00 00 00 00 40 00 00 01 00 00 00 00
0000000000000060 F8 01 00 00 00 00 00 00 F8 01 00 00 00 00 00 00
0000000000000070 08 00 00 00 00 00 00 00 01 00 00 00 04 00 00 00 ▸seg1 PT_LOAD R
0000000000000080 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00
0000000000000090 00 00 00 01 00 00 00 00 FC CB 01 00 00 00 00 00
00000000000000A0 FC CB 01 00 00 00 00 00 00 10 00 00 00 00 00 00
00000000000000B0 01 00 00 00 05 00 00 00 00 CC 01 00 00 00 00 00 ▸seg2 PT_LOAD R+X ← code/entry
00000000000000C0 00 DC 01 01 00 00 00 00 00 DC 01 01 00 00 00 00
00000000000000D0 39 94 06 00 00 00 00 00 39 94 06 00 00 00 00 00
00000000000000E0 00 10 00 00 00 00 00 00 01 00 00 00 06 00 00 00 ← seg2 p_align=0x1000 | ▸seg3 p_type=PT_LOAD p_flags=R+W
00000000000000F0 40 60 08 00 00 00 00 00 40 80 08 01 00 00 00 00
0000000000000100 40 80 08 01 00 00 00 00 08 00 00 00 00 00 00 00 seg3 p_paddr | p_filesz=0x8 ← .bss
0000000000000110 C0 0F 00 00 00 00 00 00 00 10 00 00 00 00 00 00
0000000000000120 01 00 00 00 06 00 00 00 48 60 08 00 00 00 00 00 ▸seg4 PT_LOAD R+W ← .bss
0000000000000130 48 90 08 01 00 00 00 00 48 90 08 01 00 00 00 00
0000000000000140 90 4D 00 00 00 00 00 00 B8 F0 00 00 00 00 00 00
0000000000000150 00 10 00 00 00 00 00 00 07 00 00 00 04 00 00 00 ▸seg5 PT_TLS R
0000000000000160 40 60 08 00 00 00 00 00 40 70 08 01 00 00 00 00
0000000000000170 40 70 08 01 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000180 1C 00 04 00 00 00 00 00 08 00 00 00 00 00 00 00
0000000000000190 52 E5 74 64 04 00 00 00 40 60 08 00 00 00 00 00 ▸seg6 PT_GNU_RELRO R
00000000000001A0 40 80 08 01 00 00 00 00 40 80 08 01 00 00 00 00
00000000000001B0 08 00 00 00 00 00 00 00 C0 0F 00 00 00 00 00 00
00000000000001C0 01 00 00 00 00 00 00 00 50 E5 74 64 04 00 00 00 ▸seg7 PT_GNU_EH_FRAME R
00000000000001D0 C0 84 01 00 00 00 00 00 C0 84 01 01 00 00 00 00
00000000000001E0 C0 84 01 01 00 00 00 00 8C 0A 00 00 00 00 00 00
00000000000001F0 8C 0A 00 00 00 00 00 00 04 00 00 00 00 00 00 00
0000000000000200 51 E5 74 64 06 00 00 00 00 00 00 00 00 00 00 00 ▸seg8 PT_GNU_STACK R+W (no X = NX stack)
0000000000000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0000000000000220 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00
0000000000000230 00 00 00 00 00 00 00 00
Every ELF64 program header entry has these fields, in this order:
- p_type u32 4 bytes: segment type
- p_flags u32 4 bytes: permissions (R/W/X) (1)
- p_offset u64 8 bytes: where the segment is in the file
- p_vaddr u64 8 bytes: where it will be loaded in memory
- p_paddr u64 8 bytes: physical address
- p_filesz u64 8 bytes: how many bytes to read from the file
- p_memsz u64 8 bytes: how much memory the segment occupies
- p_align u64 8 bytes: alignment
(1) I actually said this in part 1, but again: these are not file permissions, they are permissions on the memory area!!
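The field list above maps directly onto one `struct.unpack` call. A sketch, assuming ELF64 little endian: the 56 bytes are the entry that starts at 0xB0 in the dump, and `Elf64Phdr`, `PT_NAMES` and `flags_str` are names I made up for the example.

```python
import struct
from collections import namedtuple

Elf64Phdr = namedtuple(
    "Elf64Phdr",
    "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align",
)

PT_NAMES = {
    1: "LOAD", 6: "PHDR", 7: "TLS",
    0x6474E550: "GNU_EH_FRAME", 0x6474E551: "GNU_STACK", 0x6474E552: "GNU_RELRO",
}

def flags_str(flags: int) -> str:
    # PF_R = 4, PF_W = 2, PF_X = 1
    return (("R" if flags & 4 else "-")
            + ("W" if flags & 2 else "-")
            + ("X" if flags & 1 else "-"))

# Bytes 0xB0..0xE7 of limon: one 56-byte program header entry.
entry_bytes = bytes.fromhex(
    "010000000500000000CC010000000000"
    "00DC01010000000000DC010100000000"
    "39940600000000003994060000000000"
    "0010000000000000"
)

ph = Elf64Phdr._make(struct.unpack("<IIQQQQQQ", entry_bytes))
print(PT_NAMES.get(ph.p_type, hex(ph.p_type)), flags_str(ph.p_flags),
      hex(ph.p_offset), hex(ph.p_vaddr), hex(ph.p_filesz), hex(ph.p_memsz))
```

On ELF32 the entry is smaller and p_flags sits in a different position, so this format string only applies to the 64-bit class.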
For example, the entry starting at address 00000000000000B0 is segment 2. When this file runs, the kernel reads this entry as:
read 0x69439 bytes from offset 0x1CC00 of the file and map them at memory address 0x0101DC00 with R+X permissions.
At this point I need to say p_memsz >= p_filesz. What does it mean? It is the .bss issue. For example, you have an array of [256]u8, which is 256 bytes, but in the binary this size is only a placeholder. At run time the app needs this area, so memory has to be reserved for it, and p_memsz ends up bigger than p_filesz. Look at segment 3:
p_filesz is only 8 bytes but p_memsz is 4032 bytes (0xFC0); the difference comes from .bss.
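What p_memsz > p_filesz means for a loader can be sketched like this: copy p_filesz bytes from the file, then fill the rest of the segment with zeros. This is a simplified illustration, not how a real kernel maps pages; `load_segment` and the tiny fake file are hypothetical, and only the sizes 8 and 0xFC0 come from segment 3.

```python
def load_segment(file_bytes: bytes, p_offset: int, p_filesz: int, p_memsz: int) -> bytes:
    """Build the in-memory image of one segment (simplified sketch)."""
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must be >= p_filesz")
    data = file_bytes[p_offset:p_offset + p_filesz]
    # Everything past p_filesz is not in the file: it is zero-filled (.bss).
    return data + bytes(p_memsz - p_filesz)

# Synthetic 16-byte "file", only for the demo.
fake_file = bytes(range(16))

# Same sizes as segment 3: 8 bytes from the file, 0xFC0 bytes in memory.
image = load_segment(fake_file, p_offset=4, p_filesz=8, p_memsz=0xFC0)
zero_filled = len(image) - 8

print(len(image), zero_filled)  # 4032 4024
```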
In Biber I read this part just like this:
CLASS     ELF64
DATA      LittleEndian
TYPE      ET_EXEC (executable)
MACHINE   x86_64
Entry     0x0101DC00
P_OFFSET  0x40 (9 entries x 56 bytes)
S_OFFSET  0x3E7A58 (22 entries x 64 bytes)
FLAGS     0x0
PROGRAM HEADER
Type          Flg  VAddr               PAddr               FileSize      MemSize
--------------------------------------------------------------------------------------
PHDR          R--  0x0000000001000040  0x0000000001000040  0x00000001F8  0x00000001F8
LOAD          R--  0x0000000001000000  0x0000000001000000  0x000001CBFC  0x000001CBFC
LOAD          R-X  0x000000000101DC00  0x000000000101DC00  0x0000069439  0x0000069439  ← entry
LOAD          RW-  0x0000000001088040  0x0000000001088040  0x0000000008  0x0000000FC0  ← .bss
LOAD          RW-  0x0000000001089048  0x0000000001089048  0x0000004D90  0x000000F0B8  ← .bss
TLS           R--  0x0000000001087040  0x0000000001087040  0x0000000000  0x000004001C  ← thread local
GNU_RELRO     R--  0x0000000001088040  0x0000000001088040  0x0000000008  0x0000000FC0  ← ro after load
GNU_EH_FRAME  R--  0x00000000010184C0  0x00000000010184C0  0x0000000A8C  0x0000000A8C  ← exception info
GNU_STACK     RW-  0x0000000000000000  0x0000000000000000  0x0000000000  0x0001000000  ← NX stack
Now let's check the output:
P_OFFSET is 0x40, so the program header table starts there.
Entry is 0x0101DC00, so that is the starting point of our application; our code starts running here.
What do I mean? This memory area must have the execute right, so I check: the second LOAD address is 0x000000000101DC00 and its permission is R-X, READ and EXECUTE, as expected.
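The same sanity check can be automated: find the LOAD segment that contains the entry address and confirm it is executable. A sketch using the values from the Biber table above; the tuple layout and the helper `segment_for` are assumptions made for this example.

```python
# (type, flags, vaddr, memsz) copied from the program header table above.
segments = [
    ("PHDR", "R--", 0x01000040, 0x1F8),
    ("LOAD", "R--", 0x01000000, 0x1CBFC),
    ("LOAD", "R-X", 0x0101DC00, 0x69439),
    ("LOAD", "RW-", 0x01088040, 0xFC0),
    ("LOAD", "RW-", 0x01089048, 0xF0B8),
]

def segment_for(addr: int):
    """Return the LOAD segment whose [vaddr, vaddr + memsz) range holds addr."""
    for seg in segments:
        seg_type, _flags, vaddr, memsz = seg
        if seg_type == "LOAD" and vaddr <= addr < vaddr + memsz:
            return seg
    return None

entry = 0x0101DC00
hit = segment_for(entry)
entry_is_executable = hit is not None and "X" in hit[1]

print(hit[1], entry_is_executable)  # R-X True
```

If the entry address landed in a segment without X, or in no LOAD segment at all, that would be a strong hint that a field was read with the wrong size or byte order.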
For this part, this is probably enough to explain how you can check that the output is correct, and of course read the spec all the time!
It is already a bit late; I hope to add part 3 tomorrow. See you soon :)
References
Everything above can be verified against these primary sources.
When in doubt, go to the spec — not a blog post.
This is only what I understand; it can be wrong!
ELF
- System V ABI — official ELF specification
- linux/elf.h — struct definitions in the Linux kernel
- x86-64 System V ABI
- OSDev Wiki — ELF