PC Boot Process and Boot Block

CS 301 Lecture, Dr. Lawlor

How a PC boots

The boot sequence on an IBM PC is initiated by the BIOS (Basic Input/Output System) stored in the computer's ROM (Read-Only Memory). 

Actually, everything in that sentence is partially a lie:
Er, so back to the booting process. 
  1. You push the power button.  This grounds the green wire on your PC power supply, causing it to turn on.
  2. Power flows into the CPU.
  3. The CPU is wired to begin executing code out of its own internal ROM (or flash memory).  This is called "microcode", and is set up at the CPU factory.  The CPU initializes itself (e.g., clears out its registers and cache, and puts itself into a known good state), and performs a self-test.
  4. Once the CPU self-test passes, the CPU jumps to a known location in memory, which is hardwired on your motherboard to point to your motherboard's ROM (er, flash memory).
  5. Your motherboard's ROM contains a piece of software called the "BIOS" that has just one purpose: to load up the *real* operating system.  This is harder than it sounds, because the operating system might be stored on:
  6. Once the BIOS finds a bootable disk, it loads up the "boot sector" (a disk sector, or disk block, of just 512 bytes) and executes it in 16-bit mode.  Why?  Because the original 8086 IBM PC back in 1981 executed the boot sector in 16-bit mode, that's why.  It was good enough for 1981, why isn't it good enough today?  Huh?
In a perfect world, the boot sector would actually be the kernel (more on that in a minute) of the operating system.  Sadly:
The standard thing to do from inside a boot sector is to load up the *real* OS loader ("bootloader") from disk.  The first 64 sectors of the disk are reserved for this, so the real bootloader can be up to 32K.  Known bootloaders include:
The bootloader then loads up the real OS.  So the boot sequence is overall the following amazing cascade:

Writing a Boot Block

A PC Master Boot Sector is just 512 bytes of machine code that the BIOS loads at bootup.  It's basically unchanged since 1981 with the original IBM PC, although there is a newer, different standard called EFI that is just now catching on.  With the original interface:
Here's an example:
BITS 16 ; everything here is 16 bit code

; Code gets loaded by the PC BIOS into address 0x7C00 and executed.
mov al,'H'
mov ah,0x0e ; print command
int 0x10 ; talk to video card

mov al,'i'
mov ah,0x0e ; print command
int 0x10 ; talk to video card

hang:
jmp hang

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
You assemble your little boot block with:
    nasm -f bin -o boot.bin yourcode.S
This should make a 512 byte file full of machine code.  If you point a virtual machine emulator at this as a tiny hard drive image, it should boot and run! 
    qemu-system-x86_64 boot.bin
On a Linux machine, you can also copy this onto a flash drive or floppy using "dd":
    sudo dd if=boot.bin of=/dev/sdX
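This is scary, since dd will happily overwrite the first sector (including the partition table) of whatever device you name, so double-check what sdX really is.  But it will actually boot any machine using a pre-EFI BIOS!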
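
The two-letter example above repeats the same three instructions for every character.  A minimal sketch of printing a whole string in a loop might look like the following.  The "org 0x7C00" line, the segment and stack setup, and the labels (start, print_loop, message) are my additions for illustration, and the sketch assumes the BIOS leaves the screen in its usual 80x25 text mode.

```nasm
BITS 16
org 0x7C00              ; assumption: tell NASM the code runs at 0x7C00, so labels get the right addresses

start:
    xor ax, ax
    mov ds, ax          ; ds = 0, so ds:si matches the org-based address of "message"
    mov ss, ax
    mov sp, 0x7C00      ; stack grows down from just below the boot sector
    cld                 ; make lodsb walk forward through the string
    mov si, message

print_loop:
    lodsb               ; al = [ds:si], then si = si + 1
    test al, al
    jz hang             ; a zero byte marks the end of the string
    mov ah, 0x0e        ; print command
    mov bx, 0x0007      ; bh = display page 0, bl = color (only used in graphics modes)
    int 0x10            ; talk to video card
    jmp print_loop

hang:
    jmp hang

message: db 'Hello from the boot sector!', 0

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
```

This assembles and runs with the same nasm and qemu commands shown above.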

Interfacing with the BIOS

To get work done from a boot sector, you can either find and command the hardware directly, or call BIOS interrupts, which work a lot like Linux system calls. Ralf Brown's Interrupt List (indexed by interrupt number) is the definitive reference for all BIOS (as well as MS-DOS) interrupt functions. 

For example, interrupt 0x10 with ah==0x0e is "output a character to the screen".  So above, I printed "H" using:
    mov al,'H'
    mov ah,0x0e ; print command
    int 0x10 ; talk to video card
Interrupt 0x10: Video I/O, such as printing characters with ah=0x0e, int 0x10.
Interrupt 0x13: Disk I/O, such as ah=0x02, int 0x13 to read data from disk.
Interrupt 0x16: Keyboard I/O, such as reading characters with ah=0x00, int 0x16.
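
Here is a rough sketch that uses all three interrupts together: the boot sector reads the next disk sector with int 0x13, jumps into it, and that second sector echoes keypresses with int 0x16 and int 0x10.  The labels (start, stage2, echo_loop) and the two-sector image layout are made up for illustration.  It assumes the BIOS left the boot drive number in dl (the usual convention) and that the drive accepts a cylinder/head/sector read of sector 2.

```nasm
BITS 16
org 0x7C00              ; assumption: labels are computed relative to the BIOS load address

start:
    xor ax, ax
    mov ds, ax
    mov es, ax          ; es:bx is where int 0x13 puts the data
    mov ss, ax
    mov sp, 0x7C00      ; stack grows down from just below the boot sector

    mov ah, 0x02        ; int 0x13, ah=0x02: read sectors
    mov al, 1           ; number of sectors to read
    mov ch, 0           ; cylinder 0
    mov cl, 2           ; sector 2 (sectors count from 1; sector 1 is this boot sector)
    mov dh, 0           ; head 0; dl still holds the boot drive number from the BIOS
    mov bx, stage2      ; destination offset, 0x7E00, right after the boot sector
    int 0x13
    jc hang             ; carry flag set means the read failed
    jmp stage2          ; run the code we just loaded

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa

; ---- second sector on disk, loaded at 0x7E00 ----
stage2:
    mov bx, 0x0007      ; bh = display page 0, bl = color (only used in graphics modes)
echo_loop:
    mov ah, 0x00        ; int 0x16, ah=0x00: wait for a keypress
    int 0x16            ; returns al = ASCII code, ah = scan code
    mov ah, 0x0e        ; int 0x10, ah=0x0e: print the character in al
    int 0x10
    jmp echo_loop

times 1024-($-$$) db 0  ; pad the second sector, for a 1024 byte image
```

A real bootloader does the same thing on a bigger scale, reading many sectors and retrying on errors rather than just hanging.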

To get pretty graphics onto the screen, you first switch the hardware into graphics mode, typically VGA mode 13h:
    mov ah, 0         ; set video mode 13h - 320x200 
    mov al, 13h
    int 10h
You can then write directly to memory at segment 0xA000 (see below) to draw pixels onscreen: in mode 13h each pixel is one byte, so the pixel at (x,y) lives at offset y*320+x.
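
As a hedged sketch of drawing in mode 13h, this boot sector fills the screen with a repeating color pattern and then sets one pixel near the middle.  The labels (fill, hang) and the choice of colors are mine, and it assumes the default VGA palette, where color 15 is white.

```nasm
BITS 16

    mov ax, 0x0013      ; ah=0: set video mode, al=0x13: 320x200 with 256 colors
    int 0x10

    mov ax, 0xA000
    mov es, ax          ; es = segment of the mode 13h pixel memory
    xor di, di          ; offset 0 is the top left pixel
    cld                 ; make stosb move forward through memory

    mov cx, 320*200     ; one byte per pixel, 64000 pixels total
fill:
    mov ax, di          ; al = low byte of the pixel's offset, used as its color
    stosb               ; [es:di] = al, then di = di + 1
    loop fill           ; decrement cx, repeat until it hits zero

    ; one pixel at x=160, y=100: offset = y*320 + x
    mov BYTE [es:100*320+160], 15

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
```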

Segmented Memory, and Memory-Mapped I/O

In 16-bit mode, pointers actually have two parts: the "segment" is the general area of memory, and the "offset" is the location inside that segment.  Both segment and offset are 16-bit values, so it looks like you could access 32 bits' worth of memory, but the CPU actually combines them into a 20-bit address (the 8086 only had 20 address lines, for 1MB total) like this:
   actual address = (segment<<4) + offset;

This means many different segment:offset pairs name the same byte.  For example, 0x0000:0x1230 is the same location as 0x0123:0x0000, since both work out to address 0x1230.
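
One way to check the aliasing for yourself is to write a byte through one segment:offset pair and read it back through the other.  This is only a sketch: it assumes address 0x1230 is free conventional memory that the BIOS isn't using, and the "hang" label and the choice of 'Y' are arbitrary.

```nasm
BITS 16

    xor ax, ax
    mov ds, ax
    mov BYTE [ds:0x1230], 'Y'   ; write through 0x0000:0x1230 -> (0x0000<<4)+0x1230 = 0x1230

    mov ax, 0x0123
    mov es, ax
    mov al, [es:0x0000]         ; read through 0x0123:0x0000 -> (0x0123<<4)+0x0000 = 0x1230

    mov ah, 0x0e                ; print command: shows the byte we read back
    mov bx, 0x0007              ; bh = display page 0, bl = color (only used in graphics modes)
    int 0x10                    ; talk to video card

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
```

If both pairs really do name the same byte, the character that shows up onscreen is the 'Y' that was written through the other pair.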

For example, the VGA text mode display starts at segment 0xB800.  I can print "Hi!" at the top left corner of the screen by directly modifying the data there:
    mov ax,0xB800 ; this is the segment where VGA text mode data is stored
    mov es,ax ; can't mov a constant straight into a segment register, so go through ax
    mov BYTE [es:0x0000],'H' ; shows up at top left corner of screen
    mov BYTE [es:0x0002],'i'
    mov BYTE [es:0x0004],'!'
"es" is the "Extra Segment" register.  You can't load a segment register with a constant directly, so the value goes through a general-purpose register like ax.  "es:0x0002" means use segment es, and offset 2.  In text mode, each character is at an even address, and the attribute byte (color and font) for that character is stored at the corresponding odd address.
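
To make the even/odd layout concrete, here is a small sketch that sets the attribute bytes too.  It assumes the standard 80x25 color text mode, where the attribute's low four bits are the foreground color and the next three bits are the background color.  The particular colors and the "hang" label are just for illustration.

```nasm
BITS 16

    mov ax, 0xB800
    mov es, ax                          ; segment where VGA text mode data is stored

    mov BYTE [es:0x0000], 'H'           ; character at the even address
    mov BYTE [es:0x0001], 0x1F          ; attribute at the odd address: white (0xF) on blue (1)

    ; character and attribute in one 16-bit store: x86 is little-endian,
    ; so the low byte (the character) lands at the even address
    mov WORD [es:0x0002], 0x4F00 | 'i'  ; white (0xF) on red (4)

    ; offset of row r, column c is (r*80 + c)*2 in 80-column text mode
    mov BYTE [es:(1*80+0)*2], '!'       ; row 1, column 0
    mov BYTE [es:(1*80+0)*2+1], 0x0A    ; light green (0xA) on black (0)

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
```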