Implementation of an Operating System - Week 7
Hello all!
This is the seventh article in our blog series about implementing an operating system. This week I will explain virtual memory and paging.
Paging
Paging translates linear addresses (the addresses produced by segmentation) into physical addresses, and determines access rights and how the memory should be cached.
Why Paging?
Paging is the most common technique used in x86 to enable virtual memory. Virtual memory through paging means that each process gets the impression that the available memory range is 0x00000000–0xFFFFFFFF, even though the actual size of the memory might be much smaller. It also means that when a process addresses a byte of memory, it uses a virtual (linear) address instead of a physical one.
Paging in x86
Paging in x86 consists of a page directory (PDT) that can contain references to 1024 page tables (PT), each of which can point to 1024 sections of physical memory called page frames (PF). Each page frame is 4096 bytes in size. In a virtual (linear) address, the highest 10 bits specify the offset of a page directory entry (PDE) in the current PDT, and the next 10 bits specify the offset of a page table entry (PTE) within the page table pointed to by that PDE. The lowest 12 bits of the address are the offset within the page frame to be addressed.
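To make the 10/10/12 split concrete, here is a minimal sketch in C. The function names are hypothetical and are not part of the kernel code in this series; only the bit layout comes from the description above.

```c
#include <stdint.h>

/* Hypothetical helpers: split a 32-bit virtual (linear) address into
 * the three fields used by x86 paging with 4 KB pages. */

/* Highest 10 bits: index of the PDE in the page directory. */
uint32_t pde_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Next 10 bits: index of the PTE in the selected page table. */
uint32_t pte_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3FFu;
}

/* Lowest 12 bits: byte offset inside the 4096-byte page frame. */
uint32_t page_offset(uint32_t vaddr)
{
    return vaddr & 0xFFFu;
}
```

For example, the address 0xC0100000 selects page directory entry 768 and page table entry 256, with an offset of 0 inside the frame.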
Enabling Paging
Create a file called paging_enable.s and add the following assembly code to it.
global load_page_directory
global enable_paging
global boot_page_directory
PAGING_PRESENT equ 1b
PAGING_WRITABLE equ 10b
PAGING_USER_ACCESSIBLE equ 100b
PAGING_SIZE_4MB equ 10000000b
; identity map 0x00000000 - 0x00400000 (first 4MB) which includes kernel
; and paging data structures
section .data
align 4096
boot_page_directory:
pde_frame_addr equ 0x0
dd (pde_frame_addr & 0xffc00000) + (PAGING_PRESENT | PAGING_WRITABLE | PAGING_SIZE_4MB) ; a 4 MB frame address occupies bits 31-22
times 0x3ff dd 0 ; allocate remaining page directory entries
; align 4096
; boot_page_table:
; %assign frame_addr 0
; %rep 0x300
; dd frame_addr | (PAGING_PRESENT | PAGING_WRITABLE | PAGING_USER_ACCESSIBLE)
; %assign frame_addr frame_addr+0x1000
; %endrep
; times 0x100 dd 0
section .text
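load_page_directory: ; put &boot_page_directory in high 20 bits of cr3 register
mov eax, [esp+4] ; first argument: address of the page directory
mov ecx, cr3 ; use ecx (caller-saved in cdecl); ebx would have to be preserved
and ecx, 0xfff ; keep the low 12 bits, zero out existing 20 high bits
and eax, 0xfffff000 ; keep only the 4 KB aligned part of the address
or ecx, eax
mov cr3, ecx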
ret
enable_paging:
; enable 4MB paging
mov eax, cr4
or eax, 0x10
mov cr4, eax
; enable paging (PG bit)
mov eax, cr0
or eax, 0x80000001 ; set PE (bit 0) and PG (bit 31)
mov cr0, eax
ret
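The bit arithmetic in the assembly above can be checked on an ordinary machine with a small C sketch. The function and macro names below are assumptions made for illustration. The functions only compute the values that would end up in the page directory entry and in the control registers, and they do not touch any hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PAGING_PRESENT   0x01u        /* bit 0: entry is valid */
#define PAGING_WRITABLE  0x02u        /* bit 1: writes allowed */
#define PAGING_SIZE_4MB  0x80u        /* bit 7 (PS): PDE maps a 4 MB page */
#define CR4_PSE          0x10u        /* bit 4 of cr4: enable 4 MB pages */
#define CR0_PE           0x00000001u  /* bit 0 of cr0: protected mode */
#define CR0_PG           0x80000000u  /* bit 31 of cr0: paging */

/* Build a page directory entry for a 4 MB page (hypothetical helper). */
uint32_t make_pde_4mb(uint32_t frame_addr, uint32_t flags)
{
    assert((frame_addr & 0x003FFFFFu) == 0); /* must be 4 MB aligned */
    return (frame_addr & 0xFFC00000u) | flags | PAGING_SIZE_4MB;
}

/* Value written to cr3: keep the low 12 bits, replace the high 20. */
uint32_t cr3_with_directory(uint32_t old_cr3, uint32_t pd_phys)
{
    return (old_cr3 & 0x00000FFFu) | (pd_phys & 0xFFFFF000u);
}

/* Value written to cr4 to allow 4 MB pages. */
uint32_t cr4_with_pse(uint32_t old_cr4)
{
    return old_cr4 | CR4_PSE;
}

/* Value written to cr0 to switch paging on. */
uint32_t cr0_with_paging(uint32_t old_cr0)
{
    return old_cr0 | CR0_PG | CR0_PE;
}
```

With a frame address of 0 and the present and writable flags, make_pde_4mb yields 0x83, which is the first entry of boot_page_directory.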
Paging and the Kernel
This section describes how paging affects the OS kernel. We encourage you to run your OS using identity paging before trying to implement a more advanced paging setup, since it can be hard to debug a malfunctioning page table that is set up via assembly code.
Higher-half Linker Script
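The linker script below is the one used so far: it links the kernel at 1 MB and exports the labels kernel_start and kernel_end. For a higher-half kernel, the location counter would instead start at 0xC0100000, and each section would need a load address 0xC0000000 lower (for example with AT(ADDR(.text) - 0xC0000000)) so that GRUB still loads the kernel at 1 MB: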
ENTRY(loader) /* the name of the entry label */
SECTIONS {
. = 0x00100000; /* the code should be loaded at 1 MB */
kernel_start = .; /* exported label: start of the kernel image */
.text ALIGN (0x1000) : /* align at 4 KB */
{
*(.text) /* all text sections from all files */
}
.rodata ALIGN (0x1000) : /* align at 4 KB */
{
*(.rodata*) /* all read-only data sections from all files */
}
.data ALIGN (0x1000) : /* align at 4 KB */
{
*(.data) /* all data sections from all files */
}
.bss ALIGN (0x1000) : /* align at 4 KB */
{
*(COMMON) /* all COMMON sections from all files */
*(.bss) /* all bss sections from all files */
}
kernel_end = .; /* exported label: end of the kernel image */
}
Entering the Higher Half
When GRUB jumps to the kernel code, paging is not yet enabled and no page table exists. References to 0xC0100000 + X are therefore not translated to the correct physical address: at best they cause a general protection exception (GPE), and if the computer has more than 3 GB of memory, it will simply crash.
Therefore, assembly code that doesn’t use relative jumps or relative memory addressing must be used to do the following:
- Set up a page table.
- Add identity mapping for the first 4 MB of the virtual address space.
- Add an entry for 0xC0100000 that maps to 0x00100000.
If you skip the identity mapping for the first 4 MB, the CPU will generate a page fault immediately after paging is enabled, when it tries to fetch the next instruction from memory. After the table has been created, a jump can be made to a label so that eip points to a virtual address in the higher half:
; assembly code executing at around 0x00100000
; enable paging for both actual location of kernel
; and its higher-half virtual location
lea ebx, [higher_half] ; load the address of the label in ebx
jmp ebx ; jump to the label
higher_half:
; code here executes in the higher half kernel
; eip is larger than 0xC0000000
; can continue kernel initialisation, calling C code, etc.
The register eip will now point to a memory location just after 0xC0100000, so all the code can execute as if it were located at 0xC0100000, in the higher half. The entry mapping the first 4 MB of virtual memory to the first 4 MB of physical memory can now be removed from the page table, and its corresponding entry in the TLB invalidated with invlpg [0].
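The sequence described in this section (map the first 4 MB twice, jump, then drop the identity mapping) can be sketched in C as plain array manipulation. This is only a model under stated assumptions: it uses 4 MB pages, all names are hypothetical, and translate_4mb is a software imitation of the lookup the CPU performs. In a real kernel, the directory must be 4 KB aligned and removing entry 0 must be followed by invlpg [0].

```c
#include <stdint.h>

#define PDE_PRESENT         0x01u
#define PDE_WRITABLE        0x02u
#define PDE_SIZE_4MB        0x80u
#define KERNEL_VIRTUAL_BASE 0xC0000000u

/* Map physical 0..4 MB at virtual 0 (identity) and at 0xC0000000. */
void map_kernel_4mb(uint32_t pd[1024])
{
    uint32_t entry = 0x00000000u | PDE_PRESENT | PDE_WRITABLE | PDE_SIZE_4MB;
    for (int i = 0; i < 1024; i++) {
        pd[i] = 0; /* not present */
    }
    pd[0] = entry;                          /* identity mapping */
    pd[KERNEL_VIRTUAL_BASE >> 22] = entry;  /* higher-half mapping, index 768 */
}

/* Called once eip is in the higher half. */
void drop_identity_mapping(uint32_t pd[1024])
{
    pd[0] = 0; /* a real kernel would now execute invlpg [0] */
}

/* Software model of the lookup for 4 MB pages.
 * Returns 1 and stores the physical address on success, 0 on a fault. */
int translate_4mb(const uint32_t pd[1024], uint32_t vaddr, uint32_t *paddr)
{
    uint32_t pde = pd[vaddr >> 22];
    if (!(pde & PDE_PRESENT) || !(pde & PDE_SIZE_4MB)) {
        return 0; /* would raise a page fault */
    }
    *paddr = (pde & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
    return 1;
}
```

While both entries are present, 0x00100000 and 0xC0100000 translate to the same physical address 0x00100000. After drop_identity_mapping, only the higher-half address still resolves.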
Running in the Higher Half
There are a few more details you must deal with when using a higher-half kernel. We must be careful when using memory-mapped I/O that uses specific memory locations. For example, the frame buffer is located at 0x000B8000, but since there is no entry in the page table for the address 0x000B8000 any longer, the address 0xC00B8000 must be used, since the virtual address 0xC0000000 maps to the physical address 0x00000000.
Any explicit references to addresses within the multiboot structure need to be changed to reflect the new virtual addresses as well.
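Since the higher-half mapping is a fixed offset, the conversion for the frame buffer and for pointers found in the multiboot structure can be written as two small helpers. This is a sketch with hypothetical names, and it is valid only for physical addresses that fall inside the region actually mapped at 0xC0000000 (the first 4 MB in this setup).

```c
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xC0000000u
#define FRAMEBUFFER_PHYS    0x000B8000u

/* Physical address -> virtual address in the higher-half mapping. */
uint32_t phys_to_virt(uint32_t paddr)
{
    return paddr + KERNEL_VIRTUAL_BASE;
}

/* Higher-half virtual address -> physical address. */
uint32_t virt_to_phys(uint32_t vaddr)
{
    return vaddr - KERNEL_VIRTUAL_BASE;
}
```

The frame buffer pointer would then be built from phys_to_virt(FRAMEBUFFER_PHYS), which is 0xC00B8000.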
Mapping 4 MB pages for the kernel is simple, but wastes memory. Creating a higher-half kernel mapped in as 4 KB pages saves memory but is harder to set up. Memory for the page directory and one page table can be reserved in the .data section, but one needs to configure the mappings from virtual to physical addresses at run-time. The size of the kernel can be determined by exporting labels from the linker script [37], which we’ll need to do later anyway when writing the page frame allocator.
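A run-time setup with 4 KB pages could look roughly like the sketch below. The function names are hypothetical. In a real kernel, the start and end values would come from the exported linker labels (for example, declared as extern char kernel_start[], kernel_end[]), whereas here they are plain parameters so the arithmetic can be tried in isolation. The sketch assumes the kernel fits inside a single page table, that is, within one 4 MB region.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Number of 4 KB pages needed to cover [start, end). */
uint32_t kernel_page_count(uint32_t start, uint32_t end)
{
    uint32_t first = start & ~(PAGE_SIZE - 1u);                   /* round down */
    uint32_t last = (end + PAGE_SIZE - 1u) & ~(PAGE_SIZE - 1u);   /* round up */
    return (last - first) / PAGE_SIZE;
}

/* Fill the entries of one page table so that page_count pages starting
 * at virt_start map to consecutive frames starting at phys_start. */
void map_range_4kb(uint32_t pt[1024], uint32_t virt_start,
                   uint32_t phys_start, uint32_t page_count, uint32_t flags)
{
    uint32_t first_index = (virt_start >> 12) & 0x3FFu;
    assert(first_index + page_count <= 1024u); /* stay inside one table */
    for (uint32_t i = 0; i < page_count; i++) {
        pt[first_index + i] = ((phys_start + i * PAGE_SIZE) & 0xFFFFF000u) | flags;
    }
}
```

A kernel occupying 0x00100000 to 0x00103001 needs four pages. Mapped at 0xC0100000, they land in page table entries 256 to 259.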
Virtual Memory Through Paging
Paging enables two things that are good for virtual memory. First, it allows for fine-grained access control to memory. You can mark pages as read-only, read-write, only for PL0 etc. Second, it creates the illusion of contiguous memory. User mode processes, and the kernel, can access memory as if it were contiguous, and the contiguous memory can be extended without the need to move data around in memory. We can also allow the user mode programs access to all memory below 3 GB, but unless they actually use it, we don’t have to assign page frames to the pages. This allows processes to have code located near 0x00000000 and the stack at just below 0xC0000000, and still not require more than two actual pages.
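The access control part can be illustrated with a simplified model of the check made on each memory access. This is a sketch and not the full hardware rule: the function name is hypothetical, it looks at a single entry (the real CPU combines the flags of the PDE and the PTE), and it assumes that kernel-mode writes are always allowed (the CR0.WP bit is clear).

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  0x1u  /* bit 0: page is mapped to a frame */
#define PTE_WRITABLE 0x2u  /* bit 1: read-write if set, read-only if clear */
#define PTE_USER     0x4u  /* bit 2: accessible from user mode if set */

/* Returns true if the access would succeed, false if it would page fault. */
bool access_allowed(uint32_t entry, bool user_mode, bool is_write)
{
    if (!(entry & PTE_PRESENT)) {
        return false; /* no page frame assigned to this page */
    }
    if (!user_mode) {
        return true;  /* assumption: CR0.WP = 0, kernel may read and write */
    }
    if (!(entry & PTE_USER)) {
        return false; /* kernel-only page */
    }
    if (is_write && !(entry & PTE_WRITABLE)) {
        return false; /* read-only page */
    }
    return true;
}
```

A page that has not been given a frame yet simply has the present bit clear. The resulting page fault is the point where the kernel can assign a frame on demand.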
Thank you for reading, and see you soon with another article.