Because the kernel is preemptive, a process in the kernel can stop running at any instant to allow a process of higher priority to run. This means a task can begin running in the same critical region as a task that was preempted. To prevent this, the kernel preemption code uses spin locks as markers of nonpreemptive regions. If a spin lock is held, the kernel is not preemptive. Because the concurrency issues with kernel preemption and SMP are the same, and the kernel is already SMP-safe, this simple change makes the kernel preempt-safe, too.
Or so we hope. In reality, some situations do not require a spin lock, but do need kernel preemption disabled. The most frequent of these situations is per-processor data. If the data is unique to each processor, there may be no need to protect it with a lock because only that one processor can access the data. If no spin locks are held, the kernel is preemptive, and it would be possible for a newly scheduled task to access this same variable, as shown here:
task A manipulates per-procesor variable foo, which is not protected by a lock task A is preempted task B is scheduled task B manipulates variable foo task B completes task A is rescheduled task A continues manipulating variable foo
Consequently, even if this were a uniprocessor computer, the variable could be accessed pseudo-concurrently by multiple processes. Normally, this variable would require a spin lock (to prevent true concurrency on multiprocessing machines). If this were a per-processor variable, however, it might not require a lock.
To solve this, kernel preemption can be disabled via preempt_disable(). The call is nestable; you may call it any number of times. For each call, a corresponding call to preempt_enable() is required. The final corresponding call to preempt_enable()re-enables preemption. For example:
preempt_disable(); /* preemption is disabled ... */ preempt_enable();
The preemption count stores the number of held locks and preempt_disable() calls. If the number is zero, the kernel is preemptive. If the value is one or greater, the kernel is not preemptive. This count is incredibly usefulit is a great way to do atomicity and sleep debugging. The function preempt_count() returns this value. See Table 9.9 for a listing of kernel preemptionrelated functions.
As a cleaner solution to per-processor data issues, you can obtain the processor number (which presumably is used to index into the per-processor data) via get_cpu(). This function disables kernel preemption prior to returning the current processor number:
int cpu; /* disable kernel preemption and set "cpu" to the current processor */ cpu = get_cpu(); /* manipulate per-processor data ... */ /* reenable kernel preemption, "cpu" can change and so is no longer valid */ put_cpu();