TamgaOS (yula)
Project: github.com/hrasityilmaz/TamgaOs
GDT, protected mode, Zig vs C comparison, SSE triple fault debugging, and adding a serial port. These notes are the story of things working, then breaking, then working again.
Short summary: Kernel OK came through in both → added GDT → worked in C, triple fault in Zig → it was an SSE issue → found it with Biber → fixed with Assembly → added serial port so I never have to struggle like this again.
KERNEL OK — Starting in Two Languages
Both the Zig and C kernels were written with the same Multiboot2 header and entry address. Both kernels could boot and write to VGA. This part was relatively clean.
Zig Kernel
Entry: 0x00101000
Multiboot2 header: ✓
.text, .eh_frame_hdr, .eh_frame
Program header: 4 entries
KERNEL OK → VGA ✓
C Kernel
Entry: 0x00101000
Multiboot2 header: ✓
.multiboot, .text, .rodata, .eh_frame
Program header: 5 entries
KERNEL OK → VGA ✓
I inspected both binaries with Biber. The Multiboot2 magic values were exactly where they should be.
    Biber -f zig-out/bin/kernel -dis 0x1000 32
    00001000  D6 50 52 E8 00 00 00 00 18 00 00 00 12 AF AD 17
    00001010  00 00 00 00 08 00 00 00 01 1B 03 3B 10 00 00 00

    Biber -f zig-out/bin/c_kernel -dis 0x1000 32
    00001000  D6 50 52 E8 00 00 00 00 18 00 00 00 12 AF AD 17
    00001010  00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00
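The first 16 bytes of that dump can also be checked programmatically. This is a minimal sketch, assuming the standard Multiboot2 header layout (magic, architecture, header length, checksum, all little-endian 32-bit fields whose sum must be zero modulo 2^32). The names `read_le32`, `mb2_header_ok` and `sample_header` are made up for this illustration and are not part of Biber or TamgaOS.

```c
#include <stddef.h>
#include <stdint.h>

#define MB2_MAGIC 0xE85250D6u

/* First 16 bytes copied from the dump above (hypothetical name). */
static const uint8_t sample_header[16] = {
    0xD6, 0x50, 0x52, 0xE8, 0x00, 0x00, 0x00, 0x00,
    0x18, 0x00, 0x00, 0x00, 0x12, 0xAF, 0xAD, 0x17,
};

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if buf starts with a Multiboot2 header whose magic matches
 * and whose four fields sum to zero modulo 2^32, otherwise 0. */
static int mb2_header_ok(const uint8_t *buf, size_t len) {
    if (len < 16) return 0;
    uint32_t magic    = read_le32(buf);
    uint32_t arch     = read_le32(buf + 4);
    uint32_t hdr_len  = read_le32(buf + 8);
    uint32_t checksum = read_le32(buf + 12);
    if (magic != MB2_MAGIC) return 0;
    return (uint32_t)(magic + arch + hdr_len + checksum) == 0u;
}
```

Both kernels share these 16 bytes, so the same check applies to either binary.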
The ELF layout was different. While the C kernel kept .multiboot, .text, and .rodata in separate LOAD segments, the Zig kernel placed .eh_frame data into the first read-only LOAD segment, right after .multiboot. This difference turned out to be an important clue in the problem that followed.
    SECTION HEADER
    Nr  Name           Type      Address     Offset    Size  Flg
    ---------------------------------------------------------------
    1   .multiboot     PROGBITS  0x00100000  0x001000  0x18  A
    2   .text          PROGBITS  0x00101000  0x002000  0x33  AX
    3   .rodata        PROGBITS  0x00102000  0x003000  0x10  AMS
    4   .eh_frame_hdr  PROGBITS  0x00102010  0x003010  0x14  A
What is the GDT?
On a BIOS machine the CPU always starts in Real Mode. In Real Mode you're running 16-bit code, there's no ring system, no memory protection, and the maximum addressable space is 1MB. To run in protected mode, the CPU needs to know what each segment is: where it starts, how large it is, and who may use it. The GDT exists to define all of that.

    BIOS → Real Mode (16-bit)
        ↓
    GDT is set up
        ↓
    Protected Mode activates (32-bit)
        ↓
    Kernel takes over

The GDT's primary job is to make the transition into protected mode possible, because it contains descriptors that define what each segment is. We can compare this to ELF: program headers and section headers describe to the loader which part of the binary is what. The logic is the same.
| Real Mode | Protected Mode |
|---|---|
| 16-bit | 32-bit (or 64-bit) |
| No memory protection | Segment descriptor protection |
| No ring system | Ring0 / Ring3 separation |
| Max 1MB | 4GB (32-bit) |
| GDT unnecessary | GDT required |
GDT Entry Anatomy
Each GDT entry is 8 bytes. What does it look like in RAM?
    codesegment:
        base   = 0x00000000
        limit  = 0xFFFFF
        access = 0x9A
        gran   = 0xCF

    In RAM → FF FF 00 00 00 9A CF 00
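To see how those four values end up as `FF FF 00 00 00 9A CF 00`, here is a minimal sketch of the packing, assuming the standard 8-byte x86 segment descriptor layout. The function name `gdt_encode` is hypothetical and only serves this example.

```c
#include <stdint.h>

/* Pack base, 20-bit limit, access byte and granularity byte into the
 * 8-byte descriptor layout. Only the upper nibble of gran is taken as
 * flags; the lower nibble is filled from limit bits 19:16. */
static void gdt_encode(uint8_t out[8], uint32_t base, uint32_t limit,
                       uint8_t access, uint8_t gran) {
    out[0] = (uint8_t)(limit & 0xFF);          /* limit 7:0   */
    out[1] = (uint8_t)((limit >> 8) & 0xFF);   /* limit 15:8  */
    out[2] = (uint8_t)(base & 0xFF);           /* base 7:0    */
    out[3] = (uint8_t)((base >> 8) & 0xFF);    /* base 15:8   */
    out[4] = (uint8_t)((base >> 16) & 0xFF);   /* base 23:16  */
    out[5] = access;                           /* access byte */
    out[6] = (uint8_t)((gran & 0xF0) | ((limit >> 16) & 0x0F));
    out[7] = (uint8_t)((base >> 24) & 0xFF);   /* base 31:24  */
}
```

Calling it with base 0, limit 0xFFFFF, access 0x9A and gran 0xCF reproduces the byte sequence shown above, which is why the limit and base look scattered in memory.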
The reason for this byte ordering comes from the historical structure of the x86 descriptor format — the limit and base are split across the entry. But what matters most are the access byte and the granularity byte.
Access Byte: 0x9A and 0x92
0x9A — Code Segment
0x9A in binary is 10011010.
| Bit 7 | Bit 6-5 | Bit 4 | Bit 3 | Bit 2 | Bit 1 | Bit 0 |
|---|---|---|---|---|---|---|
| 1 | 00 | 1 | 1 | 0 | 1 | 0 |
| P | DPL | S | Exec | Conform | Read | Accessed |

    10011010
    ||||||||
    |||||||+-- Accessed   = 0   not yet accessed
    ||||||+--- Readable   = 1   readable
    |||||+---- Conforming = 0   non-conforming
    ||||+----- Executable = 1   code segment
    |||+------ S          = 1   Code/Data segment
    |++------- DPL        = 00  Ring0 = kernel only
    +--------- Present    = 1   present in RAM

    → Present, Ring0, Code Segment, Readable
0x92 — Data Segment
0x92 in binary is 10010010.
    10010010
    ||||||||
    |||||||+-- Accessed   = 0
    ||||||+--- Writable   = 1   writable
    |||||+---- Direction  = 0   grows upward
    ||||+----- Executable = 0   data segment!
    |||+------ S          = 1
    |++------- DPL        = 00
    +--------- Present    = 1

    → Present, Ring0, Data Segment, Writable

Summary: 0x9A → code segment (readable). 0x92 → data segment (writable). The selector tells the CPU which entry to use; the access byte of that descriptor tells it what kind of segment the entry is.
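The two bit breakdowns can be reproduced with a small decoder. This is only a sketch for checking values by hand; `struct access_bits` and `decode_access` are names invented here, and the field meanings follow the bit layout described above.

```c
#include <stdbool.h>
#include <stdint.h>

struct access_bits {
    bool    present;       /* bit 7 */
    uint8_t dpl;           /* bits 6-5: ring level */
    bool    code_or_data;  /* bit 4 (S) */
    bool    executable;    /* bit 3: 1 = code, 0 = data */
    bool    dc;            /* bit 2: conforming (code) / direction (data) */
    bool    rw;            /* bit 1: readable (code) / writable (data) */
    bool    accessed;      /* bit 0 */
};

/* Split an access byte into its individual fields. */
static struct access_bits decode_access(uint8_t a) {
    struct access_bits b;
    b.present      = (a >> 7) & 1;
    b.dpl          = (uint8_t)((a >> 5) & 3);
    b.code_or_data = (a >> 4) & 1;
    b.executable   = (a >> 3) & 1;
    b.dc           = (a >> 2) & 1;
    b.rw           = (a >> 1) & 1;
    b.accessed     = a & 1;
    return b;
}
```

Decoding 0x9A and 0x92 gives the same fields except `executable`, which is the only bit that separates the code entry from the data entry.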
About the ring system: Ring0 = kernel only. Ring3 = user applications. This is why you see so many syscalls in your applications — the user space to kernel space transition comes from here. When an application crashes, only that app dies; the kernel stays up. That's the benefit of this separation.
Granularity Byte: 0xCF
0xCF in binary is 11001111.
    11001111
    ||||||||
    ||||++++-- limit high = 1111
    |||+------ AVL = 0   not special for us
    ||+------- L   = 0   not 64-bit long mode
    |+-------- D/B = 1   32-bit segment
    +--------- G   = 1   granularity = 4KB

    G=1 means: limit value is interpreted as 4KB blocks, not bytes
    → (0xFFFFF + 1) × 4KB = 4GB → entire 32-bit address space
| Bit | Name | Value | Meaning |
|---|---|---|---|
| G | Granularity | 1 | Limit in 4KB units (→ 4GB access) |
| D/B | Default/Big | 1 | 32-bit operands |
| L | Long mode | 0 | 64-bit long mode disabled |
| AVL | Available | 0 | Unused |
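The effect of the G bit on the limit can be written out as arithmetic. The sketch below assumes the usual rule that with G=1 the 20-bit limit is shifted left by 12 and the low 12 bits are treated as ones; `effective_limit` is a hypothetical helper name.

```c
#include <stdint.h>

/* Return the highest valid byte offset for a segment, given the 20-bit
 * limit and the granularity byte (bit 7 = G). */
static uint32_t effective_limit(uint32_t limit20, uint8_t gran) {
    limit20 &= 0xFFFFFu;
    if (gran & 0x80u) {
        /* G=1: limit counts 4KB pages, low 12 bits are all ones. */
        return (limit20 << 12) | 0xFFFu;
    }
    /* G=0: limit counts bytes. */
    return limit20;
}
```

With limit 0xFFFFF and gran 0xCF this yields 0xFFFFFFFF, the last byte of the 4GB space; with G cleared the same limit would only cover 1MB.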
Selector with CS / DS / SS
How does the CPU know which GDT entry is code and which is data? From the selector.
    ; Load GDT and far jump to the code segment
    lgdt [ecx]
    ljmp 0x08:protected_mode   ; 0x08 >> 3 = 1 → GDT[1] = 0x9A code entry

    ; Load data segment registers
    mov ax, 0x10               ; 0x10 >> 3 = 2 → GDT[2] = 0x92 data entry
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
| Register | Role |
|---|---|
| CS | The segment where code executes |
| DS | The segment for normal data reads |
| SS | Stack operations |
| EIP | Address of the next instruction |
| ESP | Top of the stack |
| CR0 | CPU control register (protected mode bit lives here) |
So the 0x08 far jump goes to GDT[1], and the 0x10 load goes to GDT[2] (GDT[0] is the mandatory null descriptor). 0x9A and 0x92 are not "special CPU-reserved values"; they're simply values written into the access byte. The CPU finds the entry through the selector and reads what it is from the access byte.
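The `>> 3` in the comments comes from the selector layout: the low two bits are the requested privilege level, bit 2 picks GDT or LDT, and the rest is the table index. A minimal sketch, with `struct selector` and `decode_selector` as made-up names:

```c
#include <stdint.h>

struct selector {
    uint16_t index;  /* bits 15-3: entry number in the table */
    uint8_t  ti;     /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;    /* bits 1-0: requested privilege level */
};

/* Split a 16-bit segment selector into its three fields. */
static struct selector decode_selector(uint16_t sel) {
    struct selector s;
    s.index = (uint16_t)(sel >> 3);
    s.ti    = (uint8_t)((sel >> 2) & 1);
    s.rpl   = (uint8_t)(sel & 3);
    return s;
}

/* Byte offset of the selected descriptor inside the table (8 bytes each). */
static uint32_t descriptor_offset(uint16_t sel) {
    return (uint32_t)(sel >> 3) * 8u;
}
```

Because each entry is 8 bytes and the low three bits are zero for ring 0 GDT selectors, 0x08 and 0x10 are at the same time the selector values and the byte offsets of entries 1 and 2.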
Triple Fault — Sudden Reboot
The nightmare begins
The GDT worked in the C kernel. In the Zig kernel, the machine reset immediately after calling gdt.init(). No error message, no panic screen. Just a reboot.
Without GDT initialization, the kernel booted normally. So the first suspect was the GDT setup itself. QEMU exception logging was enabled:
qemu-system-i386 -cdrom TamgaOS.iso -boot d -d int,cpu_reset
    check_exception old: 0xffffffff new 0x6   ; #UD - Invalid Opcode
    check_exception old: 0xffffffff new 0xd   ; #GP - General Protection Fault
    check_exception old: 0xd new 0xd          ; #DF - Double Fault (no IDT)
    Triple fault
    IP=0018:00101013
Without an IDT set up, the CPU couldn't handle the exception, so it chain-reacted into a triple fault and reset. But the interesting part was the first exception: #UD — Invalid Opcode. The CPU wasn't complaining about the segment descriptors. It was trying to execute an instruction it didn't understand.
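When reading these logs it helps to translate the `new 0x..` field into an exception name. The lookup below is a small sketch covering only a handful of standard x86 vectors; `exception_name` is a hypothetical helper, not something QEMU or Biber provides.

```c
#include <stdint.h>

/* Map a few x86 exception vector numbers to their mnemonic and name.
 * Vectors not listed here return "unknown". */
static const char *exception_name(uint32_t vector) {
    switch (vector) {
    case 0x00: return "#DE Divide Error";
    case 0x06: return "#UD Invalid Opcode";
    case 0x08: return "#DF Double Fault";
    case 0x0D: return "#GP General Protection";
    case 0x0E: return "#PF Page Fault";
    default:   return "unknown";
    }
}
```

Applied to the log above, 0x6 is the interesting line: it is the first exception raised, and everything after it is the CPU failing to deliver a handler.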
This part is intentionally short because I don't want to relive it. I still don't know why I didn't think of sending logs to the QEMU console earlier (probably because I was completely exhausted). Anyway, the solution is coming now :) So what was the problem?
The Real Problem: An SSE Instruction
The faulting address was 0x00101013. Biber was used to inspect the .text section:
    .text Address = 0x00101000
    .text Offset  = 0x002000

    0x00101013 - 0x00101000 + 0x002000 = 0x2013

So we need to look at offset 0x2013 inside the binary.
    00002010  55 89 E5 0F 28 05 00 20 10 00 ...

    0F 28 → movaps xmm0, [0x102000]   ← SSE instruction!
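The translation from the faulting address to a file offset is simple enough to put in a helper. This is a sketch of the arithmetic used above; `vaddr_to_offset` is a made-up name, and the section size 0x63 in the usage note is taken from the Zig kernel's `.text` program header.

```c
#include <stdint.h>

/* Translate a virtual address inside a section to its file offset.
 * Returns 1 and stores the offset on success, or 0 if the address
 * does not fall inside [sec_addr, sec_addr + sec_size). */
static int vaddr_to_offset(uint32_t vaddr, uint32_t sec_addr,
                           uint32_t sec_off, uint32_t sec_size,
                           uint32_t *file_off) {
    if (vaddr < sec_addr || vaddr - sec_addr >= sec_size) {
        return 0;
    }
    *file_off = vaddr - sec_addr + sec_off;
    return 1;
}
```

For the fault at 0x00101013 with `.text` at address 0x00101000, offset 0x2000 and size 0x63, the result is 0x2013, which is the byte `0F` in the dump.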
The GDT code was perfectly fine. The problem was the Zig compiler emitting SSE instructions while initializing GDT structs. At that point during boot:
| State | Value |
|---|---|
| Protected mode | Active ✓ |
| SSE enabled? | No ✗ |
| CR0/CR4 configured for SSE? | No ✗ |
When the CPU saw movaps xmm0, ... it threw #UD. With no IDT, it escalated to #GP, then to #DF, and finally to a triple fault reset.
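For reference, the #UD condition can be expressed in terms of the control register bits. The sketch below only computes values; it does not touch CR0 or CR4, and it is not the fix used in TamgaOS. It assumes the architectural bit positions CR0.MP (bit 1), CR0.EM (bit 2), CR4.OSFXSR (bit 9) and CR4.OSXMMEXCPT (bit 10); the function names are hypothetical.

```c
#include <stdint.h>

#define CR0_MP         (1u << 1)   /* monitor coprocessor */
#define CR0_EM         (1u << 2)   /* x87 emulation; must be 0 for SSE */
#define CR4_OSFXSR     (1u << 9)   /* OS supports FXSAVE/FXRSTOR and SSE */
#define CR4_OSXMMEXCPT (1u << 10)  /* OS handles SIMD FP exceptions */

/* SSE instructions raise #UD while CR0.EM is set or CR4.OSFXSR is clear
 * (CPUID support is assumed here and not checked). */
static int sse_usable(uint32_t cr0, uint32_t cr4) {
    return (cr0 & CR0_EM) == 0 && (cr4 & CR4_OSFXSR) != 0;
}

/* CR0 value with emulation off and coprocessor monitoring on. */
static uint32_t cr0_for_sse(uint32_t cr0) {
    return (cr0 & ~CR0_EM) | CR0_MP;
}

/* CR4 value with the two SSE-related OS bits set. */
static uint32_t cr4_for_sse(uint32_t cr4) {
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}
```

A kernel that never writes these bits is in the "not usable" state, so any `movaps` the compiler emits before that point faults exactly as in the log.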
Why did it work in C? The C compiler didn't emit SSE instructions. Zig, on the other hand, applied SIMD optimizations during struct initialization. Same logic, different output.
As a temporary fix, I disabled the SSE features in the Zig build configuration. The real fix comes later :)
Debugging Workflow with Biber
Biber was originally written for learning binary formats. But this time it was used to find a bug in the kernel itself.
IP=0018:00101013 → 0x2013 → 0F 28 → movaps → SSE instruction identified

Fix: SSE-related CPU features were disabled in the Zig build configuration. After a rebuild, the disassembly changed:

    Before:
    00002010  55 89 E5 0F 28 05 ...        ← movaps (SSE)

    After:
    00002010  55 89 E5 B8 00 70 10 00      ← no SSE

    GDT loading code:
    lgdt [ecx]
    ljmp 0x08:0x00101074
    mov ax, 0x10
    mov ds, ax
    ; and the rest...
And the kernel booted.
Zig vs C — ELF Layout Comparison
Both kernels boot with the same Multiboot2 header and entry address, but their ELF layouts differ. Biber output:
C kernel:

    PROGRAM HEADER
    Type          Flg  VAddr       FileSize
    LOAD          R--  0x00100000  0x18      ← .multiboot
    LOAD          R-X  0x00101000  0x33      ← .text
    LOAD          R--  0x00102000  0x5C      ← .rodata
    GNU_EH_FRAME  R--  0x00102010  0x14
    GNU_STACK     RW-  0x00000000  0x00

Zig kernel:

    PROGRAM HEADER
    Type          Flg  VAddr       FileSize
    LOAD          R--  0x00100000  0x64      ← .multiboot + .eh_frame_hdr + .eh_frame
    LOAD          R-X  0x00101000  0x63      ← .text (larger)
    GNU_EH_FRAME  R--  0x00100018  0x14
    GNU_STACK     RW-  0x00000000  0x00
While the C kernel keeps .multiboot, .text, and .rodata in separate LOAD segments, the Zig kernel places .eh_frame data into the first read-only LOAD segment. That's why the Zig .text section is larger and the program header count is different.
Important note: There was no need to disable SSE features on the C kernel side. The C compiler simply didn't emit SSE instructions. The problem was entirely caused by Zig's struct initialization optimizations.
GDT with Assembly — A Permanent Fix
To make sure the compiler couldn't sneak SSE behind my back, I decided to write the critical GDT section directly in assembly. No matter what the compiler does, this code won't change.
    ; Load GDT pointer
    lgdt [gdt_descriptor]

    ; Switch to protected mode (set CR0 PE bit)
    mov eax, cr0
    or eax, 1
    mov cr0, eax

    ; Far jump to code segment (flushes the pipeline)
    jmp 0x08:protected_mode_entry

    protected_mode_entry:
    ; Load data segment registers
    mov ax, 0x10
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
When using assembly, the compiler can't interfere and can't generate SSE instructions. The GDT code executes exactly what I wrote. This is especially critical in the early stages of boot code — you can't execute SSE instructions before SSE is enabled.
    // Zig inline asm uses LLVM/AT&T constraint syntax
    asm volatile ("lgdt (%[ptr])"
        :
        : [ptr] "r" (&gdt_descriptor),
    );

    // A separate asm block for the far jump.
    // Zig does not concatenate adjacent string literals, so the
    // template is written as a single multiline string literal.
    asm volatile (
        \\ljmpl $0x08, $1f
        \\1:
        \\movw $0x10, %%ax
        \\movw %%ax, %%ds
        \\movw %%ax, %%es
        \\movw %%ax, %%fs
        \\movw %%ax, %%gs
        \\movw %%ax, %%ss
        :
        :
        : "ax", "memory"
    );
Why a Serial Port?
I'm still working on the serial port implementation, but it should be done soon :)
IMPORTANT 1: I'm linking the 16550 datasheet here to explain the logic of the serial port. In particular, check the figure on page 9 and the register descriptions that follow it; otherwise it can be hard to understand!
IMPORTANT 2: https://wiki.osdev.org/Serial_Ports is also a very important page, so check it as well.
Using Biber for my kernel development was honestly great, and it felt really good when it worked. But to shorten debugging sessions like this one, I decided to add a serial port :)
In particular, Biber does not yet decode hex bytes into instructions, which cost me an entire day. I need to add that functionality to Biber.
I spent an entire day on the triple fault. QEMU logs, address translation with Biber, disassembly... I needed an earlier debug channel so I'd never have to go through all that again.
To write to VGA, the kernel needs to have booted, the GDT needs to be working, and you need to have transitioned into protected mode. A serial port can be initialized at a much earlier stage. When something breaks, I at least want to know what happened up to that point.
Simple logic: serial is active → write something → you see it in QEMU. If there's a problem in the GDT, you can write "GDT init started" before the crash even happens.
COM Ports and the BIOS Data Area
| COM Port | IO Port |
|---|---|
| COM1 | 0x3F8 |
| COM2 | 0x2F8 |
| COM3 | 0x3E8 |
| COM4 | 0x2E8 |
| COM5 | 0x5F8 |
| COM6 | 0x4F8 |
| COM7 | 0x5E8 |
| COM8 | 0x4E8 |
The COM1-COM4 values can be read from the BIOS Data Area: their IO port addresses are stored at address 0x0400 (one word each; zero if not present).

    Address  Size  Contents
    0x0400   8     COM1-COM4 IO ports (1 word each)
    0x0408   6     LPT1-LPT3 parallel ports
    0x040E   2     EBDA base address >> 4
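Reading those four words is a short loop. The sketch below works on a byte buffer so it can be tried on a host; in a kernel the pointer would refer to physical address 0x0400 instead. `bda_com_ports` is a hypothetical name, and the example buffer in the usage note is invented test data, not a real BDA dump.

```c
#include <stdint.h>

/* Read the four COM port base addresses from the start of a BIOS Data
 * Area image (little-endian words). Stores all four values in out[]
 * and returns how many of them are non-zero, i.e. present. */
static unsigned bda_com_ports(const uint8_t *bda, uint16_t out[4]) {
    unsigned present = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint16_t port = (uint16_t)(bda[2 * i] | (bda[2 * i + 1] << 8));
        out[i] = port;
        if (port != 0) {
            present++;
        }
    }
    return present;
}
```

Given a buffer starting with `F8 03 F8 02 00 00 00 00`, it reports two ports, 0x3F8 and 0x2F8, and zero for COM3 and COM4.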
IER and LCR Registers
To understand serial port registers, you need to know what the DLAB bit does. The same offset maps to a different register depending on the DLAB value.
    Base = COM1 = 0x3F8

    Offset  DLAB=0              DLAB=1
    ------  ------------------  ------------------
    +0      RBR / THR           DLL (Divisor Low)
    +1      IER (Int. Enable)   DLM (Divisor High)
    +2      IIR/FCR             IIR/FCR
    +3      LCR                 LCR
    +4      MCR                 MCR
    +5      LSR                 LSR
    +6      MSR                 MSR
    +7      SCR                 SCR
IER — Interrupt Enable Register
    [7]DMA TX | [6]DMA RX | [5]0 | [4]0 | [3]Modem Status | [2]Line Status | [1]THR Empty | [0]Data Ready

    Set all bits to 0 → disable all interrupts.
    outb(COM1 + 1, 0x00);   ← writes to IER since DLAB=0
LCR — Line Control Register
    [7]DLAB | [6]Set Break | [5]Force Parity | [4]Even Parity | [3]Parity En | [2]Stop Bits | [1:0]Word Len

    Bit 7 (DLAB) = 1 → allows writing to DLL/DLM (baud rate setup)
    outb(COM1 + 3, 0x80);   ← 10000000b → DLAB ON

    Baud rate divisor calculation:
    DLL=0x01 → 115200 / 1 = 115200 baud
    DLL=0x02 → 115200 / 2 = 57600 baud
    DLL=0x03 → 115200 / 3 = 38400 baud

    Then turn DLAB off and configure 8N1:
    outb(COM1 + 3, 0x03);   ← DLAB OFF + 8bit, no parity, 1 stop bit
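The divisor and the LCR value can both be derived instead of memorized. This is a minimal sketch assuming the usual 115200 base rate from the calculation above and the LCR bit layout just shown; `uart_divisor` and `uart_lcr` are made-up helper names.

```c
#include <stdint.h>

#define UART_BASE_BAUD 115200u

/* Compute DLL/DLM for a requested baud rate. Returns 1 on success, or 0
 * if the rate is zero, does not divide 115200 evenly, or needs a divisor
 * that does not fit in 16 bits. */
static int uart_divisor(uint32_t baud, uint8_t *dll, uint8_t *dlm) {
    if (baud == 0 || UART_BASE_BAUD % baud != 0) {
        return 0;
    }
    uint32_t divisor = UART_BASE_BAUD / baud;
    if (divisor > 0xFFFFu) {
        return 0;
    }
    *dll = (uint8_t)(divisor & 0xFF);  /* low byte  → offset +0, DLAB=1 */
    *dlm = (uint8_t)(divisor >> 8);    /* high byte → offset +1, DLAB=1 */
    return 1;
}

/* Build an LCR value with DLAB off: word length 5..8 bits in bits 1:0,
 * stop bits in bit 2, parity enable in bit 3. */
static uint8_t uart_lcr(unsigned word_bits, int two_stop_bits,
                        int parity_enable) {
    uint8_t lcr = (uint8_t)((word_bits - 5u) & 0x3u);
    if (two_stop_bits) lcr |= (uint8_t)(1u << 2);
    if (parity_enable) lcr |= (uint8_t)(1u << 3);
    return lcr;
}
```

`uart_lcr(8, 0, 0)` gives 0x03, the 8N1 value written above, and a slow rate such as 300 baud shows why DLM exists: the divisor 384 no longer fits in one byte.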
Serial Init Code
    const COM1: u16 = 0x3F8;

    pub fn init() void {
        outb(COM1 + 1, 0x00); // IER: disable all interrupts
        outb(COM1 + 3, 0x80); // LCR: DLAB ON (for baud rate setup)
        outb(COM1 + 0, 0x01); // DLL: divisor low → 115200 baud
        outb(COM1 + 1, 0x00); // DLM: divisor high → 0
        outb(COM1 + 3, 0x03); // LCR: DLAB OFF + 8bit, no parity, 1 stop
        outb(COM1 + 2, 0xC7); // FCR: FIFO enable, clear, 14-byte threshold
        outb(COM1 + 4, 0x0B); // MCR: DTR + RTS + OUT2 (IRQ enable)
    }

Port 0x3F8 is 1016 in decimal, which is greater than 255, so it has to be passed through the DX register. The outb function handles this with the N{dx} constraint. The N matters here: if the port value fits in 8 bits (0-255) it can be encoded as an immediate, otherwise it is loaded into DX, so the instruction is valid in both cases.

    fn outb(port: u16, value: u8) void {
        asm volatile ("outb %[val], %[port]"
            :
            : [val] "{al}" (value),
              [port] "N{dx}" (port),
        );
    }

    // Why N{dx}?
    // Port address 0-255     → can be written as a direct immediate
    // Port address above 255 → must be read from the DX register
    // The "N" constraint handles this optimization automatically
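Once init has run, writing a character means waiting for LSR bit 5 (transmit holding register empty) and then writing to offset +0. The sketch below shows that polling loop in C with port I/O passed in as function pointers so it can run on a host; everything here (`struct port_io`, `serial_putc`, `serial_puts`, and the `fake_*` stand-ins) is hypothetical and not taken from TamgaOS.

```c
#include <stdint.h>

#define COM1          0x3F8
#define LSR_THR_EMPTY 0x20   /* LSR bit 5: transmitter can accept a byte */

/* In a kernel these would wrap the real in/out instructions. */
struct port_io {
    uint8_t (*in8)(uint16_t port);
    void    (*out8)(uint16_t port, uint8_t value);
};

/* Busy-wait until the UART can take a byte, then send it. */
static void serial_putc(const struct port_io *io, char c) {
    while ((io->in8(COM1 + 5) & LSR_THR_EMPTY) == 0) {
        /* spin */
    }
    io->out8(COM1 + 0, (uint8_t)c);
}

/* Send a NUL-terminated string one character at a time. */
static void serial_puts(const struct port_io *io, const char *s) {
    while (*s != '\0') {
        serial_putc(io, *s++);
    }
}

/* Host-side stand-in for the UART: always ready, records written bytes. */
static char     fake_log[64];
static unsigned fake_len;

static uint8_t fake_in8(uint16_t port) {
    return port == COM1 + 5 ? LSR_THR_EMPTY : 0;
}

static void fake_out8(uint16_t port, uint8_t value) {
    if (port == COM1 && fake_len < sizeof fake_log - 1) {
        fake_log[fake_len++] = (char)value;
    }
}
```

With this in place, a call like `serial_puts(&io, "GDT init started")` right before the GDT setup would leave a trace even if the machine resets a moment later.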
AT&T vs Intel Syntax
The standalone assembly files are assembled with NASM, so they use Intel syntax.
Zig inline assembly uses LLVM-style AT&T constraint syntax. GCC and Clang also traditionally use AT&T syntax. To avoid mixing them up:
| | AT&T | Intel |
|---|---|---|
| Operand order | src, dst | dst, src |
| Register prefix | %eax | eax |
| Immediate prefix | $5 | 5 |
| Memory | (%eax) | [eax] |
    Intel: out dx, al        ; destination first
    AT&T:  outb %al, %dx     ; source first

    ; Port greater than 255 → DX required:
    mov $0x3F8, %dx
    outb %al, %dx

    ; Port up to 255 → immediate can be written directly:
    outb %al, $0x80
Summary
It was a long road. Starting with a kernel booting in two languages, we ended up with a kernel armed with a serial port and a GDT fixed in assembly.
IP=0018:00101013 → #UD Invalid Opcode → 0F 28 → movaps → SSE found

I wanted to write my own kernel, but that turned out to be impossible without knowing binary formats, so I first wrote Biber to learn them. This time I used Biber to find a bug in my own kernel. That's a different kind of satisfaction.
See you in my next post :) I still have a lot of docs to read; this is only the beginning.
References
When in doubt, go to the spec — not a blog post.
These notes reflect only my own understanding and may be wrong!