|CS 321 Spring 2013 > Lecture Notes for Monday, January 28, 2013|
The kernel is the core code of an operating system. It includes the lowest-level code, and it provides the basic abstractions that all other code requires.
Modern processors can execute in various privilege modes. At the very least, a processor must allow for user mode and supervisor mode (also called kernel mode). The modes differ in the privileges allowed. Essentially, code in supervisor mode can do anything the processor is capable of, while code in user mode, or in another lower-privilege mode, is limited to certain address spaces and operations.
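The distinction between the two modes can be illustrated with a toy model. This is only a sketch: the class `ToyCPU` and the operation names in `PRIVILEGED_OPS` are invented for illustration and do not correspond to any real instruction set.

```python
from enum import Enum


class Mode(Enum):
    USER = "user"
    SUPERVISOR = "supervisor"


class PrivilegeError(Exception):
    """Raised when user-mode code attempts a privileged operation."""


class ToyCPU:
    # Hypothetical operation names; a real processor defines its own
    # set of privileged instructions.
    PRIVILEGED_OPS = {"halt", "set_page_table", "disable_interrupts"}

    def __init__(self):
        # Code starts out in the less-privileged mode.
        self.mode = Mode.USER

    def execute(self, op):
        # Supervisor mode may do anything; user mode is restricted.
        if op in self.PRIVILEGED_OPS and self.mode != Mode.SUPERVISOR:
            raise PrivilegeError(f"{op!r} is not allowed in user mode")
        return f"executed {op}"
```

In this model, an ordinary operation such as `"add"` succeeds in either mode, while `"halt"` raises `PrivilegeError` unless `mode` has been set to `Mode.SUPERVISOR`.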
As one might expect, the basic idea is that an OS kernel executes in kernel mode, and other code, including application code, executes in user mode. Different OS designs differ in where this user-kernel boundary is drawn: what are the responsibilities of the kernel, and what code lies in it.
The portion of an OS that runs in user mode is often called its userland. (This term can also be used to refer to all code that runs in user mode.)
The first, and, historically, most common, kernel design is that of a monolithic kernel. In this design, essentially all non-application code resides in the kernel and executes at high privilege: interrupt handlers, system call implementation, memory management, process scheduling, device drivers. The kernel is a single large executable, and all of this code is linked together.
Today, the primary example of a monolithic design is the Linux kernel. This kernel is the core of the various OSs known as either “GNU/Linux” or (somewhat incorrectly) just “Linux”. It is also used in the Android OS.
Because monolithic kernels have a large amount of code linked together, and executing at high privilege, they can suffer in terms of maintainability. Another problem is that, since so much code executes at high privilege, bugs can have severe effects. Thus, security and robustness can also be issues.
The disadvantages of a monolithic kernel can be mitigated somewhat using a layered design. Such a kernel is divided into logical layers, each of which provides abstractions for the layers above it. For example, the lowest layer might handle memory management and process scheduling; it would provide the abstractions of process and address space to all the layers above it.
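As a rough sketch of layering, the lowest layer below provides processes and address spaces, and a higher layer is written purely in terms of those abstractions. The names `CoreLayer` and `LoaderLayer`, and the choice of a program loader as the upper layer, are assumptions made for illustration.

```python
class CoreLayer:
    """Lowest layer: hands out processes, each with its own address space."""

    def __init__(self):
        self._next_pid = 1
        self._address_spaces = {}  # pid -> {address: value}

    def create_process(self):
        pid = self._next_pid
        self._next_pid += 1
        self._address_spaces[pid] = {}
        return pid

    def write(self, pid, address, value):
        self._address_spaces[pid][address] = value

    def read(self, pid, address):
        return self._address_spaces[pid][address]


class LoaderLayer:
    """Hypothetical higher layer: uses only CoreLayer's public interface."""

    def __init__(self, core):
        self._core = core

    def load(self, program_bytes):
        # Create a process, then copy the program into its address space.
        pid = self._core.create_process()
        for address, byte in enumerate(program_bytes):
            self._core.write(pid, address, byte)
        return pid
```

The point of the design is that `LoaderLayer` never touches the internal tables of `CoreLayer`; it relies only on the abstractions that the lower layer exports.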
In order to deal with the disadvantages of monolithic kernels, in the 1980s the idea was developed that a kernel should be very small, with most OS code outside the kernel. Such a minimalist kernel is a microkernel. Typically, a microkernel does memory management, process scheduling, and interprocess communication. All other OS (and non-OS) code executes at a lower privilege level, including device drivers and system call implementation. Thus, most of what a microkernel does is message passing; the kernel functions primarily as an intermediary between modules running outside of it.
Two examples of microkernels are the GNU Hurd kernel and its ancestor the Mach kernel (developed as a research project at Carnegie Mellon University).
Historically, a major problem with microkernels has been performance, due largely to the high number of switches between user mode and kernel mode that are required. Recent research has reportedly mitigated this problem, however.
When we use a microkernel, it is helpful to organize the non-kernel code using a client-server paradigm. A client is code that requires a service to be performed. A server is the code that provides the service. The microkernel then acts as a go-between, ensuring that the proper servers are called.
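The go-between role can be sketched as follows. This is a minimal model, not how any particular microkernel works: `ToyMicrokernel`, the service name `"fs"`, and the in-memory `FILES` table are all hypothetical, and an ordinary function call stands in for real interprocess message passing.

```python
class ToyMicrokernel:
    """Routes requests to registered servers; provides no services itself."""

    def __init__(self):
        self._servers = {}  # service name -> handler function

    def register(self, service_name, handler):
        self._servers[service_name] = handler

    def send(self, service_name, request):
        # Look up the proper server and pass the message along.
        handler = self._servers.get(service_name)
        if handler is None:
            raise LookupError(f"no server registered for {service_name!r}")
        return handler(request)


# Hypothetical file server running outside the kernel.
FILES = {"/etc/motd": "hello"}


def file_server(request):
    op, path = request
    if op == "read":
        return FILES.get(path)
    raise ValueError(f"unsupported operation {op!r}")


def client_read(kernel, path):
    # The client never calls the server directly; it asks the kernel.
    return kernel.send("fs", ("read", path))


kernel = ToyMicrokernel()
kernel.register("fs", file_server)
print(client_read(kernel, "/etc/motd"))  # prints "hello"
```

Every request crosses the kernel, which is also why the number of user-kernel mode switches matters so much for microkernel performance.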
Note: This idea is not discussed in the text.
There is a continuum between the monolithic kernel and the microkernel. Drawing the user-kernel boundary at a low level gives us a small kernel with few responsibilities, and makes security and maintainability easier. Drawing the boundary at a high level gives us a large kernel that can be more efficient. Perhaps we want to draw the boundary somewhere in the middle; the result is a hybrid kernel.
Two prominent examples of hybrid kernels are the XNU kernel used in NeXTSTEP and Mac OS X/Darwin (XNU is a derivative of the Mach kernel, a microkernel), and the kernels used in the various Windows OSs.
A virtual machine is a complete abstraction of the processor, possibly with higher-level abstractions included.
So, for example, a virtual machine might allow one operating system to run within another. Or it might allow one computer to simulate another.
Virtual machines are a very successful and fruitful idea in the modern OS world. For example, the Java programming language was designed to run on top of a specially designed virtual machine, the Java Virtual Machine (JVM), which provides OS-like services to applications. The idea is that a Java program could be compiled once, and then run on any machine, under any OS. The JVM has turned out to be a success even apart from the Java programming language. Some programming languages (Scala, Clojure, Groovy) have been specifically targeted at the JVM, while other languages have a JVM-based implementation (JRuby, Jython).
Another direction virtual machines have taken is simply to simulate ordinary hardware, allowing one OS to be run on top of another. Popular packages allowing for this include various VMware products and VirtualBox.
Of course, virtual machines can have serious performance problems. On the other hand, improving the performance of a virtual machine improves the performance of every program, in every language, that runs on it. Virtual machines also allow for highly secure and consistent programming environments. In some cases, a virtual machine can be stopped, saved, and then restarted later, perhaps on a different machine.
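To make the stop, save, and restart idea concrete, here is a toy stack-based virtual machine. It is a sketch only: the instruction set (`push`, `add`, `mul`) and the JSON save format are invented for illustration and have nothing to do with the JVM's actual bytecode or with how VMware or VirtualBox store machine state.

```python
import json


class ToyVM:
    """A tiny stack machine. Each instruction is an (opcode, argument) pair."""

    def __init__(self, program):
        self.program = program
        self.pc = 0       # index of the next instruction to execute
        self.stack = []

    def step(self):
        """Execute one instruction; return False once the program is done."""
        if self.pc >= len(self.program):
            return False
        op, arg = self.program[self.pc]
        self.pc += 1
        if op == "push":
            self.stack.append(arg)
        elif op == "add":
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a + b)
        elif op == "mul":
            b, a = self.stack.pop(), self.stack.pop()
            self.stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
        return True

    def run(self):
        while self.step():
            pass
        return self.stack[-1] if self.stack else None

    def save(self):
        # The entire machine state fits in one string.
        return json.dumps(
            {"program": self.program, "pc": self.pc, "stack": self.stack}
        )

    @classmethod
    def load(cls, saved):
        state = json.loads(saved)
        vm = cls(state["program"])
        vm.pc = state["pc"]
        vm.stack = state["stack"]
        return vm


# Computes (2 + 3) * 4.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]

vm = ToyVM(program)
vm.step()
vm.step()                   # stop after two instructions
saved = vm.save()           # this string could be moved to another machine
resumed = ToyVM.load(saved)
print(resumed.run())        # prints 20
```

Because the program sees only the virtual machine, the same `program` list runs unchanged wherever a `ToyVM` implementation exists, which is the "compile once, run anywhere" idea in miniature.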
Today, the idea of a microkernel seems to have been largely abandoned for production OSs. However, small kernels are still an active research area. The idea of a microkernel has been carried even further, producing kernels with names like “nanokernel” and “picokernel”.
Perhaps the most extreme effort in this direction is exokernel, a research project at MIT. This project envisions a kernel as doing nothing other than controlling access to resources. An exokernel thus answers two questions:
All other functionality lies outside the kernel.
Such an extremely minimal kernel provides almost no abstractions to the code that uses it. An advantage of this is that code that depends on very different abstractions is able to run together. For example, a single exokernel might support multiple virtual machines.
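A kernel that does nothing but control access to resources might look like the following sketch. It is an assumption-laden illustration, not the MIT design: `ToyExokernel`, the notion of numbered "blocks", and the client names are all hypothetical.

```python
class ToyExokernel:
    """Tracks which client owns each block; offers no higher abstractions."""

    def __init__(self, num_blocks):
        self._owner = [None] * num_blocks  # block index -> owning client

    def allocate(self, client):
        # Hand the first free block to the requesting client.
        for block, owner in enumerate(self._owner):
            if owner is None:
                self._owner[block] = client
                return block
        raise MemoryError("no free blocks")

    def may_access(self, client, block):
        return self._owner[block] == client

    def release(self, client, block):
        if not self.may_access(client, block):
            raise PermissionError(f"{client!r} does not own block {block}")
        self._owner[block] = None
```

Everything else, such as what a block means or how files or virtual machines are built out of blocks, is left to the code running outside the kernel. Two clients with very different abstractions can therefore share one `ToyExokernel`.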