Build your own OS (week 5)

Aug 20, 2021

Hello everyone!

Welcome to the fifth installment of my OS implementation series, in which I show you how to create your own operating system from the ground up.

We successfully integrated output into our operating system (article 3), and now it’s time to add input. Keyboard input is delivered through interrupts, so we first need to know what an interrupt is and how the operating system handles it. The goal for today is to add interrupt handling to our operating system so that it can accept keyboard input.

Interrupt Handlers

Interrupts are managed via the Interrupt Descriptor Table (IDT). The IDT assigns a handler to each interrupt. The interrupts are numbered (0–255), and the handler for interrupt i is defined at the table’s ith position. There are three different types of interrupt handlers:

Task handler
Interrupt handler
Trap handler

The task handlers will not be discussed because they are specific to the Intel version of x86. An interrupt handler differs from a trap handler only in that it disables interrupts, so you can’t get an interrupt while handling one. The IDT entries created in this article are interrupt gates, so interrupts are disabled automatically while a handler runs; with trap handlers you would have to disable interrupts manually when needed.

Creating an Entry in the IDT

The IDT entry for an interrupt handler is 64 bits long. The highest 32 bits are laid out as follows:

Bit:    |31       16|15|14 13|12|11|10 9 8|7 6 5|4 3 2 1 0|
Content:|offset high| P| DPL | 0| D| 1 1 0|0 0 0| reserved|

The lowest 32 bits are shown in the figure below:

Bit:     |   31        16   | 15       0 |
Content: | segment selector | offset low |

The table below has a description for each name:

--------------------------------------------------------------------
Name             |    Description
--------------------------------------------------------------------
offset high      |    The 16 highest bits of the 32-bit address in the segment.
offset low       |    The 16 lowest bits of the 32-bit address in the segment.
P                |    Whether the handler is present in memory (1 = present, 0 = not present).
DPL              |    Descriptor Privilege Level, the privilege level the handler can be called from (0, 1, 2, 3).
D                |    Size of gate (1 = 32 bits, 0 = 16 bits).
segment selector |    The offset in the GDT.
reserved         |    Reserved.

The offset is a pointer to code (preferably an assembly code label). For example, the following two 32-bit values (highest 32 bits first) could be used to create an entry for a handler whose code starts at 0xDEADBEEF and runs at privilege level 0 (using the same code segment selector as the kernel):

0xDEAD8E00

0x0008BEEF

The following code might be used to register the above example as the handler for interrupt 0 (divide-by-zero) if the IDT is represented as an array unsigned int idt[512]. Because x86 is little-endian, the lowest 32 bits of the entry come first in memory:

idt[0] = 0x0008BEEF;

idt[1] = 0xDEAD8E00;
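To make the bit layout concrete, here is a minimal sketch that builds the two 32-bit halves of an entry from a handler address. The helper names idt_entry_high and idt_entry_low are hypothetical and not part of this series' code; the sketch assumes a present, 32-bit gate with the bit pattern shown in the diagram above.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the article's code. */

/* Highest 32 bits: offset high | P | DPL | 0 | D | 1 1 0 | 0 0 0 | reserved */
static uint32_t idt_entry_high(uint32_t handler, uint32_t dpl)
{
    uint32_t present  = 1u << 15;            /* P = 1: handler is present */
    uint32_t dpl_bits = (dpl & 0x3u) << 13;  /* DPL occupies bits 14..13  */
    uint32_t gate32   = 1u << 11;            /* D = 1: 32-bit gate        */
    uint32_t type     = 0x6u << 8;           /* bits 10..8 = 1 1 0        */

    return (handler & 0xFFFF0000u) | present | dpl_bits | gate32 | type;
}

/* Lowest 32 bits: segment selector | offset low */
static uint32_t idt_entry_low(uint32_t handler, uint16_t selector)
{
    return ((uint32_t)selector << 16) | (handler & 0xFFFFu);
}
```

Calling idt_entry_high(0xDEADBEEF, 0) and idt_entry_low(0xDEADBEEF, 0x08) reproduces the two values from the example above.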

I recommend using packed structures instead of raw bytes (or unsigned integers) to make the code more readable, as outlined in the second part of this series, “Implement using C.”

Handling an Interrupt

When an interrupt occurs, the CPU pushes some information about the interrupt onto the stack, then looks up the appropriate interrupt handler in the IDT and jumps to it. The stack will look like this at the time of the interrupt:

[esp + 12]    eflags
[esp + 8]     cs 
[esp + 4]     eip 
[esp]         error code?

Because not all interrupts generate an error code, the error code is marked with a question mark. The CPU interrupts that push an error code onto the stack are 8, 10, 11, 12, 13, 14, and 17. The interrupt handler can use the error code to find out more about what went wrong. It’s also worth noting that the interrupt number isn’t pushed onto the stack. We can only tell which interrupt has happened by knowing which code is running: if the handler for interrupt 17 is running, then interrupt 17 has happened.
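The list of interrupts that push an error code can be captured in a small predicate. This is only a sketch; the function name interrupt_pushes_error_code is hypothetical and does not appear elsewhere in this series.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true for the CPU interrupts listed above
   that push an error code onto the stack. */
static bool interrupt_pushes_error_code(unsigned int interrupt)
{
    switch (interrupt) {
    case 8:
    case 10:
    case 11:
    case 12:
    case 13:
    case 14:
    case 17:
        return true;
    default:
        return false;
    }
}
```

A predicate like this is handy as a reminder of which assembly stub an interrupt needs: one that pushes a placeholder error code, or one that relies on the code the CPU already pushed.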

When the interrupt handler is finished, it returns with the iret instruction. The iret instruction expects the stack to be the same as it was at the time of the interrupt (see the figure above). As a result, all values pushed onto the stack by the interrupt handler must be popped. Before returning, iret restores eflags by popping the value off the stack, and then jumps to cs:eip as given by the values on the stack.

The interrupt handler must be written in assembly code, because all registers the handler uses must be preserved by pushing them onto the stack. This is because the interrupted code is unaware of the interrupt and so expects its registers to remain unchanged. Writing all of the interrupt handler logic in assembly code would be tedious. It’s a good idea to write an assembly handler that saves the registers, calls a C function, restores the registers, and then executes iret.

The following code shows how I declared the C handler in interrupts.h:

#ifndef INCLUDE_INTERRUPTS
#define INCLUDE_INTERRUPTS

struct IDT 
{
	unsigned short size;
	unsigned int address;
} __attribute__((packed));

struct IDTDescriptor {
	/* The lowest 32 bits */
	unsigned short offset_low; // offset bits 0..15
	unsigned short segment_selector; // a code segment selector in GDT or LDT
	
	/* The highest 32 bits */
	unsigned char reserved; // Just 0.
	unsigned char type_and_attr; // type and attributes
	unsigned short offset_high; // offset bits 16..31
} __attribute__((packed));

void interrupts_install_idt();

// Wrappers around ASM.
void load_idt(unsigned int idt_address);
void interrupt_handler_33();
void interrupt_handler_14();

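/* Registers as laid out on the stack by common_interrupt_handler.
   The last register pushed (edi) sits at the lowest address, so it comes first. */
struct cpu_state {
	unsigned int edi;
	unsigned int esi;
	unsigned int ebp;
	unsigned int edx;
	unsigned int ecx;
	unsigned int ebx;
	unsigned int eax;
} __attribute__((packed));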

struct stack_state {
	unsigned int error_code;
	unsigned int eip;
	unsigned int cs;
	unsigned int eflags;
} __attribute__((packed));

void interrupt_handler(struct cpu_state cpu, unsigned int interrupt, struct stack_state stack);


#endif /* INCLUDE_INTERRUPTS */
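Since the hardware dictates the exact byte layout of an IDT entry, it can be worth checking the packed structure at compile time. The sketch below mirrors the descriptor above under a hypothetical name (idt_descriptor_sketch), uses fixed-width types from stdint.h instead of the article's unsigned short and unsigned char, and assumes a compiler that supports _Static_assert and __attribute__((packed)), such as GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the descriptor layout; the name is hypothetical. */
struct idt_descriptor_sketch {
    uint16_t offset_low;        /* offset bits 0..15         */
    uint16_t segment_selector;  /* code segment selector     */
    uint8_t  reserved;          /* always 0                  */
    uint8_t  type_and_attr;     /* P, DPL, D and gate type   */
    uint16_t offset_high;       /* offset bits 16..31        */
} __attribute__((packed));

/* Fail the build if the compiler inserted padding or reordered anything. */
_Static_assert(sizeof(struct idt_descriptor_sketch) == 8,
               "an IDT entry must be exactly 64 bits");
_Static_assert(offsetof(struct idt_descriptor_sketch, type_and_attr) == 5,
               "type_and_attr must be byte 5 of the entry");
_Static_assert(offsetof(struct idt_descriptor_sketch, offset_high) == 6,
               "offset_high must occupy bytes 6..7 of the entry");

/* Reassembles the 32-bit handler address from its two halves. */
static uint32_t descriptor_handler_address(const struct idt_descriptor_sketch *d)
{
    return ((uint32_t)d->offset_high << 16) | d->offset_low;
}
```

If one of the assertions fails, the build stops instead of producing a kernel that loads a malformed table.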

Creating a Generic Interrupt Handler

Because the CPU does not push the interrupt number onto the stack, creating a generic interrupt handler is a little tricky. This section shows how to use macros to accomplish that. Instead of writing one version by hand for each interrupt, it’s easier to use NASM’s macro feature. Because not all interrupts generate an error code, the value 0 is pushed as the “error code” for interrupts that do not generate one. An example of how this may be done is shown in the code below:

;Generic Interrupt Handler
;
extern interrupt_handler

%macro no_error_code_interrupt_handler 1
global interrupt_handler_%1
interrupt_handler_%1:
	push	dword 0                     ; push 0 as error code
	push	dword %1                    ; push the interrupt number
	jmp	common_interrupt_handler    ; jump to the common handler
%endmacro

%macro error_code_interrupt_handler 1
global interrupt_handler_%1
interrupt_handler_%1:
	push    dword %1                    ; push the interrupt number
	jmp     common_interrupt_handler    ; jump to the common handler
%endmacro

common_interrupt_handler:               ; the common parts of the generic interrupt handler
	; save the registers
	push    eax
	push    ebx
	push	ecx
	push	edx
	push	ebp
	push	esi
	push	edi

        ; call the C function
        call    interrupt_handler

        ; restore the registers
	pop	edi
	pop	esi
	pop	ebp
	pop	edx
	pop	ecx
	pop	ebx
        pop     eax

	; restore the esp
	add     esp, 8

	; return to the code that got interrupted
	iret

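no_error_code_interrupt_handler	33	; create handler for interrupt 33 (keyboard, IRQ 1 after remapping)
error_code_interrupt_handler	14	; create handler for interrupt 14 (paging, the CPU pushes an error code)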

The common_interrupt_handler does the following:

>Push the registers on the stack.
>Call the C function interrupt_handler.
>Pop the registers from the stack.
>Add 8 to esp (because of the error code and the interrupt number pushed earlier).
>Execute iret to return to the interrupted code.
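To see why the C handler receives its arguments in the order cpu state, interrupt number, stack state, the pushes described above can be replayed on a simulated stack. This is purely an illustration that runs on any host; the sim_ names and the placeholder register values are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of a downward-growing 32-bit stack, for illustration only. */
#define SIM_STACK_WORDS 32

struct sim_stack {
    uint32_t mem[SIM_STACK_WORDS];
    int top; /* index of the most recently pushed word (the simulated esp) */
};

static void sim_push(struct sim_stack *s, uint32_t value)
{
    s->mem[--s->top] = value; /* a push decrements esp, then stores */
}

/* Returns the word at [esp + 4 * index]. */
static uint32_t sim_peek(const struct sim_stack *s, int index)
{
    return s->mem[s->top + index];
}

/* Replays the pushes for a keyboard interrupt (33, no CPU error code). */
static void sim_keyboard_interrupt(struct sim_stack *s)
{
    s->top = SIM_STACK_WORDS;

    /* Pushed by the CPU (placeholder values). */
    sim_push(s, 0x202);     /* eflags */
    sim_push(s, 0x08);      /* cs     */
    sim_push(s, 0x100000);  /* eip    */

    /* Pushed by the no_error_code_interrupt_handler macro. */
    sim_push(s, 0);         /* placeholder error code */
    sim_push(s, 33);        /* interrupt number       */

    /* Pushed by common_interrupt_handler: eax, ebx, ecx, edx, ebp, esi, edi. */
    for (uint32_t r = 1; r <= 7; r++) {
        sim_push(s, 0xA0 + r); /* eax = 0xA1 ... edi = 0xA7 */
    }
}
```

After the replay, sim_peek(&s, 0) is the edi placeholder, sim_peek(&s, 7) is the interrupt number, and sim_peek(&s, 8) onward is the error code followed by eip, cs, and eflags. In the real handler, the call instruction additionally pushes a return address, so the C function finds the same sequence starting at [esp + 4].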

Since the macros declare global labels, the addresses of the interrupt handlers can be accessed from C or assembly code when creating the IDT.

Interrupt.c (building the IDT and handling interrupts):

#include "interrupts.h"
#include "pic.h"
#include "io.h"

#include "serial.h"
#include "keyboard.h"

#define INTERRUPTS_DESCRIPTOR_COUNT 256 
#define INTERRUPTS_KEYBOARD 33 
#define INTERRUPTS_PAGING 14 

struct IDTDescriptor idt_descriptors[INTERRUPTS_DESCRIPTOR_COUNT];
struct IDT idt;

void interrupts_init_descriptor(int index, unsigned int address)
{
	idt_descriptors[index].offset_high = (address >> 16) & 0xFFFF; // offset bits 16..31
	idt_descriptors[index].offset_low = (address & 0xFFFF); // offset bits 0..15

	idt_descriptors[index].segment_selector = 0x08; // The second (code) segment selector in GDT: one GDT entry is 8 bytes (64 bits).
	idt_descriptors[index].reserved = 0x00; // Reserved.

	/*
	   Bit:     | 31              16 | 15 | 14 13 | 12 | 11     10 9 8   | 7 6 5 | 4 3 2 1 0 |
	   Content: | offset high        | P  | DPL   | S  | D and  GateType | 0 0 0 | reserved
		P	If the handler is present in memory or not (1 = present, 0 = not present). Set to 0 for unused interrupts or for Paging.
		DPL	Descriptor Privilege Level, the privilege level the handler can be called from (0, 1, 2, 3).
		S	Storage Segment. Set to 0 for interrupt gates.
		D	Size of gate, (1 = 32 bits, 0 = 16 bits).
	*/
	idt_descriptors[index].type_and_attr =	(0x01 << 7) |			// P
						(0x00 << 6) | (0x00 << 5) |	// DPL
						0xe;				// 0b1110=0xE 32-bit interrupt gate
}

void interrupts_install_idt()
{
	interrupts_init_descriptor(INTERRUPTS_KEYBOARD, (unsigned int) interrupt_handler_33);
	interrupts_init_descriptor(INTERRUPTS_PAGING, (unsigned int) interrupt_handler_14);


	idt.address = (unsigned int) &idt_descriptors;
	idt.size = sizeof(struct IDTDescriptor) * INTERRUPTS_DESCRIPTOR_COUNT - 1; // lidt expects the table size in bytes minus one
	load_idt((unsigned int) &idt);

	pic_remap(PIC_1_OFFSET, PIC_2_OFFSET);
}


/* Interrupt handlers ********************************************************/

void interrupt_handler(__attribute__((unused)) struct cpu_state cpu, unsigned int interrupt, __attribute__((unused)) struct stack_state stack)
{
	unsigned char scan_code;
	unsigned char ascii;

	switch (interrupt){
		case INTERRUPTS_KEYBOARD:

			scan_code = keyboard_read_scan_code();

			if (scan_code <= KEYBOARD_MAX_ASCII) {
				ascii = keyboard_scan_code_to_ascii(scan_code);
				serial_configure_baud_rate(SERIAL_COM1_BASE, 4);
				serial_configure_line(SERIAL_COM1_BASE);
				char str[1];
				str[0] = ascii;
				serial_write(str, 1);
			}

			pic_acknowledge(interrupt);

			break;
		
	
		default:
			break;
    }
}

Loading the IDT

The IDT is loaded with the lidt assembly instruction, which takes the address of a small structure holding the table’s size and the address of its first element (struct IDT in the header above). The simplest way is to wrap this instruction and call it from C.

global  load_idt

; load_idt - Loads the interrupt descriptor table (IDT).
; stack: [esp + 4] the address of the structure describing the IDT (size and address)
;        [esp    ] the return address

load_idt:
        mov eax, [esp + 4]
        lidt [eax]
        ret
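The structure that lidt reads is six bytes long: a 16-bit limit followed by a 32-bit base address, where the limit is the table size in bytes minus one. The following sketch fills such a structure; the names idt_pointer_sketch and make_idt_pointer are hypothetical, and the example table address is arbitrary.

```c
#include <stdint.h>

/* Hypothetical mirror of the 6-byte structure that lidt reads. */
struct idt_pointer_sketch {
    uint16_t limit; /* size of the table in bytes, minus one */
    uint32_t base;  /* linear address of the first IDT entry */
} __attribute__((packed));

/* Builds the structure for a table of entry_count 8-byte entries. */
static struct idt_pointer_sketch make_idt_pointer(uint32_t table_address,
                                                  unsigned int entry_count)
{
    struct idt_pointer_sketch pointer;

    pointer.limit = (uint16_t)(entry_count * 8u - 1u);
    pointer.base  = table_address;
    return pointer;
}
```

For a full table of 256 entries the limit comes out as 2047 (0x7FF).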

Programmable Interrupt Controller (PIC)

Before you can use hardware interrupts, you must first configure the Programmable Interrupt Controller (PIC). The PIC maps hardware signals to interrupts. The reasons for configuring the PIC are as follows:

•Remap the interrupts. The PIC uses interrupts 0 through 15 for hardware interrupts by default, which conflict with the CPU interrupts. As a result, the PIC interrupts must be remapped to another range.

•Select which interrupts to receive. You probably don’t want to receive interrupts from all devices, since you don’t have code that handles them.

•Choose the proper PIC mode.

Originally, there was only one PIC (PIC 1) and eight interrupts. As more hardware was added, eight interrupts were no longer enough. The solution was to chain a second PIC (PIC 2) onto the first one (see interrupt 2 on PIC 1 in the table below).

The following table lists the hardware interrupts:

--------------------------------------------------------------------
  PIC 1   |  Hardware         |    PIC 2   |  Hardware
--------------------------------------------------------------------
       0  |  Timer            |         8  |  Real Time Clock
       1  |  Keyboard         |         9  |  General I/O
       2  |  PIC 2            |        10  |  General I/O
       3  |  COM 2            |        11  |  General I/O
       4  |  COM 1            |        12  |  General I/O
       5  |  LPT 2            |        13  |  Coprocessor
       6  |  Floppy disk      |        14  |  IDE Bus
       7  |  LPT 1            |        15  |  IDE Bus
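Once the PICs are remapped, each hardware line in the table corresponds to a CPU interrupt number. The sketch below shows the arithmetic, assuming the offsets 0x20 for PIC 1 and 0x28 for PIC 2 that this article uses when remapping; the function name irq_to_interrupt is hypothetical.

```c
/* Offsets assumed to match the remapping used in this article. */
#define SKETCH_PIC_1_OFFSET 0x20
#define SKETCH_PIC_2_OFFSET 0x28

/* Hypothetical helper: maps a hardware line (0..15) from the table above
   to its interrupt number after remapping, or -1 if the line is invalid. */
static int irq_to_interrupt(unsigned int irq)
{
    if (irq < 8) {
        return SKETCH_PIC_1_OFFSET + (int)irq;        /* lines on PIC 1 */
    }
    if (irq < 16) {
        return SKETCH_PIC_2_OFFSET + (int)(irq - 8);  /* lines on PIC 2 */
    }
    return -1;
}
```

The keyboard sits on line 1 of PIC 1, so it arrives as interrupt 0x20 + 1 = 33, which is why the keyboard handler in this article is registered at index 33.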

A good guide for setting up the PIC can be found on the SigOPS website, so that information is not repeated here. Every PIC interrupt must be acknowledged, which means sending the PIC a message confirming that the interrupt has been handled. The PIC will stop generating interrupts if this is not done. The interrupt is acknowledged by sending the byte 0x20 to the PIC that raised it. Because PIC 2 is chained through PIC 1, an interrupt from PIC 2 must also be acknowledged to PIC 1.

The PIC constants, the pic_acknowledge function, and the remapping routine can be implemented as follows:

/*                      I/O port */
#define PIC_1		0x20		/* IO base address for master PIC */
#define PIC_2		0xA0		/* IO base address for slave PIC */
#define PIC_1_COMMAND	PIC_1
#define PIC_1_DATA	(PIC_1+1)
#define PIC_2_COMMAND	PIC_2
#define PIC_2_DATA	(PIC_2+1)

#define PIC_1_OFFSET 0x20
#define PIC_2_OFFSET 0x28
#define PIC_2_END (PIC_2_OFFSET + 7)

#define PIC_1_COMMAND_PORT 0x20
#define PIC_2_COMMAND_PORT 0xA0
#define PIC_ACKNOWLEDGE 0x20

#define PIC_ICW1_ICW4            0x01	/* ICW4 (not) needed */
#define PIC_ICW1_SINGLE          0x02	/* Single (cascade) mode */
#define PIC_ICW1_INTERVAL4       0x04	/* Call address interval 4 (8) */
#define PIC_ICW1_LEVEL           0x08	/* Level triggered (edge) mode */
#define PIC_ICW1_INIT            0x10	/* Initialization - required! */

#define PIC_ICW4_8086            0x01	/* 8086/88 (MCS-80/85) mode */
#define PIC_ICW4_AUTO            0x02	/* Auto (normal) EOI */
#define PIC_ICW4_BUF_SLAVE       0x08	/* Buffered mode/slave */
#define PIC_ICW4_BUF_MASTER      0x0C	/* Buffered mode/master */
#define PIC_ICW4_SFNM            0x10	/* Special fully nested (not) */

void pic_remap(int offset1, int offset2);
void pic_acknowledge(unsigned int interrupt);

void pic_acknowledge(unsigned int interrupt)
{
	if (interrupt < PIC_1_OFFSET || interrupt > PIC_2_END) {
		return;
	}

	if (interrupt >= PIC_2_OFFSET) {
		/* PIC 2 is cascaded through PIC 1, so both must be acknowledged. */
		outb(PIC_2_COMMAND_PORT, PIC_ACKNOWLEDGE);
	}
	outb(PIC_1_COMMAND_PORT, PIC_ACKNOWLEDGE);
}

void pic_remap(int offset1, int offset2)
{
	outb(PIC_1_COMMAND, PIC_ICW1_INIT + PIC_ICW1_ICW4);	// starts the initialization sequence (in cascade mode)
	outb(PIC_2_COMMAND, PIC_ICW1_INIT + PIC_ICW1_ICW4);
	outb(PIC_1_DATA, offset1);				// ICW2: Master PIC vector offset
	outb(PIC_2_DATA, offset2);				// ICW2: Slave PIC vector offset
	outb(PIC_1_DATA, 4);					// ICW3: tell Master PIC that there is a slave PIC at IRQ2 (0000 0100)
	outb(PIC_2_DATA, 2);					// ICW3: tell Slave PIC its cascade identity (0000 0010)

	outb(PIC_1_DATA, PIC_ICW4_8086);
	outb(PIC_2_DATA, PIC_ICW4_8086);

        // Setup Interrupt Mask Register (IMR)
	outb(PIC_1_DATA, 0xFD); // 1111 1101 - Enable IRQ 1 only (keyboard).
	outb(PIC_2_DATA, 0xFF);

	asm("sti"); // Enable interrupts.
}
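The value 0xFD written to the mask register of PIC 1 follows from a simple rule: a set bit masks (disables) a line and a cleared bit enables it. The helpers below are a sketch of that bit arithmetic only; their names are hypothetical and they do not touch any hardware.

```c
#include <stdint.h>

/* Hypothetical helper: mask that enables exactly one line (0..7) on a PIC
   and disables the other seven. */
static uint8_t pic_mask_allow_only(unsigned int line)
{
    return (uint8_t)~(1u << (line & 7u));
}

/* Hypothetical helper: clears one more bit in an existing mask,
   enabling that line in addition to those already enabled. */
static uint8_t pic_mask_allow(uint8_t mask, unsigned int line)
{
    return (uint8_t)(mask & ~(1u << (line & 7u)));
}
```

pic_mask_allow_only(1) gives 0xFD, the keyboard-only mask used above. Enabling the timer as well, pic_mask_allow(0xFD, 0), would give 0xFC.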

Reading Input from the Keyboard

The keyboard generates scan codes rather than ASCII characters. A scan code identifies a key, both when it is pressed and when it is released. The scan code for the most recently pressed key can be read from the keyboard’s data I/O port, which has the address 0x60.

All of this is implemented by the following C code in the Keyboard.c file:

#include "io.h"
#define KEYBOARD_MAX_ASCII 83 
#define KEYBOARD_DATA_PORT 0x60

unsigned char keyboard_read_scan_code(void)
{
	return inb(KEYBOARD_DATA_PORT);
}

unsigned char keyboard_scan_code_to_ascii(unsigned char scan_code)
{
	static const unsigned char ascii[256] = // built once instead of on every call
	{
		0x0, 0x0, '1', '2', '3', '4', '5', '6',		// 0 - 7
		'7', '8', '9', '0', '-', '=', 0x0, 0x0,		// 8 - 15
		'q', 'w', 'e', 'r', 't', 'y', 'u', 'i',		// 16 - 23
		'o', 'p', '[', ']', '\n', 0x0, 'a', 's',	// 24 - 31
		'd', 'f', 'g', 'h', 'j', 'k', 'l', ';',		// 32 - 39
		'\'', '`', 0x0, '\\', 'z', 'x', 'c', 'v',	// 40 - 47
		'b', 'n', 'm', ',', '.', '/', 0x0, '*',		// 48 - 55
		0x0, ' ', 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,		// 56 - 63
		0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, '7',		// 64 - 71
		'8', '9', '-', '4', '5', '6', '+', '1',		// 72 - 79
		'2', '3', '0', '.'				// 80 - 83
	};

	return ascii[scan_code];
}

The keyboard_scan_code_to_ascii function above converts a scan code to an ASCII character. If you wish to map the scan codes to ASCII characters as on an American keyboard, Andries Brouwer provides a nice guide. Since the keyboard interrupt is raised by the PIC, you must call pic_acknowledge at the end of the keyboard interrupt handler. Also, the keyboard will not send you any more interrupts until you read the scan code from it.
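Key releases can be told apart from key presses. The sketch below assumes scan code set 1, where a release is reported as the press code with the highest bit set; that assumption matches the lookup table above (for example, 0x1E maps to 'a'). The function names are hypothetical, and extended codes that start with a 0xE0 prefix are not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: scan code set 1, where bit 7 marks a key release. */
#define SCAN_CODE_RELEASE_BIT 0x80

/* Hypothetical helper: true if the scan code reports a key release. */
static bool scan_code_is_release(uint8_t scan_code)
{
    return (scan_code & SCAN_CODE_RELEASE_BIT) != 0;
}

/* Hypothetical helper: strips the release bit, leaving the key's press code. */
static uint8_t scan_code_key(uint8_t scan_code)
{
    return (uint8_t)(scan_code & ~SCAN_CODE_RELEASE_BIT);
}
```

Pressing the 'a' key yields 0x1E and releasing it yields 0x9E. This is also why comparing the scan code against KEYBOARD_MAX_ASCII in the interrupt handler filters out releases: every release code is at least 0x80.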

I hope you have successfully added interrupt handling and keyboard input to your OS. See you in the next article.

Thank you!

Written by Asitha Nuwan