Linux Versus Classic Unix Kernels

Owing to their common ancestry and same API, modern Unix kernels share various design traits. With few exceptions, a Unix kernel is typically a monolithic static binary. That is, it exists as a large single-executable image that runs in a single address space. Unix systems typically require a system with a paged memory-management unit; this hardware enables the system to enforce memory protection and to provide a unique virtual address space to each process.

See the bibliography for my favorite books on the design of the classic Unix kernels.

Monolithic Kernel Versus Microkernel Designs

Operating kernels can be divided into two main design camps: the monolithic kernel and the microkernel. (A third camp, exokernel, is found primarily in research systems but is gaining ground in real-world use.)

Monolithic kernels involve the simpler design of the two, and all kernels were designed in this manner until the 1980s. Monolithic kernels are implemented entirely as single large processes running entirely in a single address space. Consequently, such kernels typically exist on disk as single static binaries. All kernel services exist and execute in the large kernel address space. Communication within the kernel is trivial because everything runs in kernel mode in the same address space: The kernel can invoke functions directly, as a user-space application might. Proponents of this model cite the simplicity and performance of the monolithic approach. Most Unix systems are monolithic in design.

Microkernels, on the other hand, are not implemented as single large processes. Instead, the functionality of the kernel is broken down into separate processes, usually called servers. Idealistically, only the servers absolutely requiring such capabilities run in a privileged execution mode. The rest of the servers run in user-space. All the servers, though, are kept separate and run in different address spaces. Therefore, direct function invocation as in monolithic kernels is not possible. Instead, communication in microkernels is handled via message passing: An interprocess communication (IPC) mechanism is built into the system, and the various servers communicate and invoke "services" from each other by sending messages over the IPC mechanism. The separation of the various servers prevents a failure in one server from bringing down another.

Likewise, the modularity of the system allows one server to be swapped out for another. Because the IPC mechanism involves quite a bit more overhead than a trivial function call, however, and because a context switch from kernel-space to user-space or vice versa may be involved, message passing includes a latency and throughput hit not seen on monolithic kernels with simple function invocation. Consequently, all practical microkernel-based systems now place most or all the servers in kernel-space, to remove the overhead of frequent context switches and potentially allow for direct function invocation. The Windows NT kernel and Mach (on which part of Mac OS X is based) are examples of microkernels. Neither Windows NT nor Mac OS X run any microkernel servers in user-space in their latest versions, defeating the primary purpose of microkernel designs altogether.

Linux is a monolithic kernelthat is, the Linux kernel executes in a single address space entirely in kernel mode. Linux, however, borrows much of the good from microkernels: Linux boasts a modular design with kernel preemption, support for kernel threads, and the capability to dynamically load separate binaries (kernel modules) into the kernel. Conversely, Linux has none of the performance-sapping features that curse microkernel designs: Everything runs in kernel mode, with direct function invocationnot message passingthe method of communication. Yet Linux is modular, threaded, and the kernel itself is schedulable. Pragmatism wins again.

As Linus and other kernel developers contribute to the Linux kernel, they decide how best to advance Linux without neglecting its Unix roots (and more importantly, the Unix API). Consequently, because Linux is not based on any specific Unix, Linus and company are able to pick and choose the best solution to any given problemor at times, invent new solutions! Here is an analysis of characteristics that differ between the Linux kernel and other Unix variants:

Linux supports the dynamic loading of kernel modules. Although the Linux kernel is monolithic, it is capable of dynamically loading and unloading kernel code on demand.
Linux has symmetrical multiprocessor (SMP) support. Although many commercial variants of Unix now support SMP, most traditional Unix implementations did not.
The Linux kernel is preemptive. Unlike traditional Unix variants, the Linux kernel is capable of preempting a task even if it is running in the kernel. Of the other commercial Unix implementations, Solaris and IRIX have preemptive kernels, but most traditional Unix kernels are not preemptive.
Linux takes an interesting approach to thread support: It does not differentiate between threads and normal processes. To the kernel, all processes are the samesome just happen to share resources.
Linux provides an object-oriented device model with device classes, hotpluggable events, and a user-space device filesystem (sysfs).
Linux ignores some common Unix features that are thought to be poorly designed, such as STREAMS, or standards that are brain dead.
Linux is free in every sense of the word. The feature set Linux implements is the result of the freedom of Linux's open development model. If a feature is without merit or poorly thought out, Linux developers are under no obligation to implement it. To the contrary, Linux has adopted an elitist attitude toward changes: Modifications must solve a specific real-world problem, have a sane design, and have a clean implementation. Consequently, features of some other modern Unix variants, such as pageable kernel memory, have received no consideration.

Despite any differences, Linux remains an operating system with a strong Unix heritage.