
CPU Partitioning

CPUs can be partitioned to separate the resources used by tasks and interrupts that serve different purposes. In a real-time system, CPU partitioning can be used to dedicate a set of CPUs to real-time tasks and their corresponding interrupts.

The base technology for CPU partitioning is CPU affinity. On top of this mechanism further Linux kernel facilities for CPU partitioning are implemented. User space tooling is available as well.

This article gives a short overview of the facilities and tools. Follow the links for detailed information.

Affinity

The processing of tasks or interrupts can be restricted to a specified set of CPUs by setting the affinity. The task CPU affinity affects the scheduler and makes sure that the task is executed only on the CPUs in the task's affinity set. The IRQ affinity specifies the CPUs to which an interrupt is allowed to be routed.

CPU affinity of tasks

In an SMP system, CPU affinity is the property by which the OS scheduler binds a process or task to one or more processors. Several operating systems offer a way to override how the scheduler assigns processes or tasks to a particular set of processors. The idea is to say “always run this process/task on processor one” or “run these processes/tasks on all processors but processor zero”. The scheduler then places the processes/tasks only on the CPUs contained in the affinity set.

Task affinity can be managed via the following mechanisms:

cgroups: cpusets

CPU isolation: see CONFIG_CPU_ISOLATION and command line parameter isolcpus.

System calls: sched_[get/set]affinity

Tools: taskset
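As a minimal sketch of the system-call route, the following Python snippet uses the standard library wrappers os.sched_getaffinity() and os.sched_setaffinity(), which call the Linux system calls listed above. The helper name pin_to_cpus is hypothetical and chosen only for this illustration.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict `pid` (0 selects the caller) to the given CPUs.

    Returns the affinity set the kernel reports after the change.
    """
    cpus = set(cpus)
    if not cpus:
        raise ValueError("the affinity set must contain at least one CPU")
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    before = os.sched_getaffinity(0)
    # Pin to the lowest-numbered CPU we are currently allowed to use,
    # then restore the previous affinity set.
    print("pinned to", sorted(pin_to_cpus({min(before)})))
    os.sched_setaffinity(0, before)
```

The same effect can be obtained from the shell with taskset. The system calls are mainly useful when an application wants to pin itself or its own threads.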

The CPU affinity of per-CPU threads like ksoftirqd/n and kworker/n (where n is the core number) is not settable. Other threads like kswapd/n are per-NUMA node and can only be pinned within the cores of their node.
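To see which CPUs such kernel threads are bound to, one option is to read the Cpus_allowed_list field of /proc/<pid>/status. The sketch below assumes that field uses the usual kernel CPU list notation (for example "0-3,8"). The helper names parse_cpulist and allowed_cpus are hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Convert a kernel CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def allowed_cpus(status_text):
    """Extract the Cpus_allowed_list field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpulist(line.split(":", 1)[1])
    raise ValueError("no Cpus_allowed_list field found")


if __name__ == "__main__":
    # Print the affinity of every ksoftirqd thread visible in /proc.
    for comm in Path("/proc").glob("[0-9]*/comm"):
        try:
            name = comm.read_text().strip()
            if name.startswith("ksoftirqd/"):
                status = (comm.parent / "status").read_text()
                print(name, sorted(allowed_cpus(status)))
        except OSError:
            continue  # the task exited while we were scanning
```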

CPU affinity and kworkers

Kworker threads and the workqueue tasks which they perform are a special case. While it is possible to rely on taskset and sched_setaffinity() to manage kworkers, doing so is of little utility since the threads are often short-lived and, in any case, often perform a wide variety of work. The paradigm with workqueues is instead to associate an affinity setting with the task itself, not with the kworker thread that happens to execute it. “Unbound” is the name for workqueues which are not per-CPU. These workqueues consume a lot of CPU time on many systems and tend to present the greatest management challenge for latency control. Those unbound workqueues which appear in /sys/devices/virtual/workqueue are configurable from userspace. The parameters affinity_scope, affinity_strict and cpumask together determine on which cores the kworker which executes the work function will run. Many unbound workqueues are not configurable via sysfs. Making their properties visible there requires setting the additional WQ_SYSFS flag in the kernel source.
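The sysfs layout described above can be inspected with a few lines of Python. This is a sketch under two assumptions: that the mask attribute is a file named cpumask holding a comma-separated hexadecimal mask, and that only unbound workqueues expose it. The helper names parse_cpumask and unbound_workqueues are hypothetical.

```python
from pathlib import Path

WQ_SYSFS = Path("/sys/devices/virtual/workqueue")


def parse_cpumask(text):
    """Convert a hex CPU mask such as "00000000,0000000f" into a set of CPUs."""
    value = int(text.strip().replace(",", ""), 16)
    return {cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1}


def unbound_workqueues(base=WQ_SYSFS):
    """Map each workqueue directory that has a cpumask file to its settings."""
    result = {}
    for wq in sorted(p for p in Path(base).iterdir() if p.is_dir()):
        mask_file = wq / "cpumask"
        if not mask_file.is_file():
            continue  # assumed not to be a configurable unbound workqueue
        scope_file = wq / "affinity_scope"
        scope = scope_file.read_text().strip() if scope_file.is_file() else None
        result[wq.name] = {
            "cpus": parse_cpumask(mask_file.read_text()),
            "affinity_scope": scope,
        }
    return result


if __name__ == "__main__":
    if WQ_SYSFS.is_dir():
        for name, info in unbound_workqueues().items():
            print(name, sorted(info["cpus"]), info["affinity_scope"])
```

Changing a setting means writing a new value to the corresponding file as root. The sketch only reads, so it can be run safely on a production system.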

Since kernel 6.5, the tools/workqueue/wq_monitor.py Python script is available in-tree, and since 6.6, wq_dump.py has joined it. These Python scripts require the drgn debugger, which is packaged by major Linux distributions. Another recent addition of particular interest for the realtime project is wqlat.py, which is part of the bcc/tools suite (see https://github.com/iovisor/bcc/blob/master/tools/wqlat.py). Both sets of tools may require special kernel configuration settings.

IRQ affinity

Hardware interrupts can interrupt kernel and user space computations at any given time, except when the kernel disables interrupt processing to protect resources. When a hardware interrupt is handled, the CPU switches into a separate context, executes the handler code, then switches back to the interrupted context and resumes execution.

Depending on the interrupt hardware, interrupts can be routed to any CPU or delivery can be rotated between CPUs. Most interrupt controllers allow restricting the set of CPUs to which a particular interrupt can be delivered by setting the IRQ affinity.

When a CPU receives an interrupt, it switches to interrupt context and the current task has to wait until the IRQ is handled. Restricting the handling of a given IRQ to a set of CPUs is called IRQ affinity. This setting affects the hardware routing of the interrupt to the CPUs.

IRQ affinity can be managed via the following mechanisms:

procfs: procfs

Kernel command line parameter: Default IRQ affinity

Tools: irqbalance

taskset
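For the procfs route, a minimal sketch could write a CPU list to /proc/irq/<n>/smp_affinity_list. It assumes root privileges and an interrupt whose affinity the hardware allows to be changed; otherwise the write fails with an OSError. The helper names format_cpulist and set_irq_affinity, and the IRQ number in the usage comment, are hypothetical.

```python
from pathlib import Path


def format_cpulist(cpus):
    """Render CPU numbers as a compact list, e.g. {0, 1, 2, 5} -> "0-2,5"."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run of consecutive CPUs
        else:
            ranges.append([cpu, cpu])
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def set_irq_affinity(irq, cpus, proc_irq="/proc/irq"):
    """Write the CPU list for `irq` and return what is reported back."""
    path = Path(proc_irq) / str(irq) / "smp_affinity_list"
    path.write_text(format_cpulist(cpus))
    return path.read_text().strip()


# Usage (as root, with a hypothetical IRQ number):
#     set_irq_affinity(42, {2, 3})
```

Reading the file back is deliberate: the value that ends up in effect is what matters, so the sketch returns the reported list instead of echoing the requested one.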

Realtime application example
