==== CPU affinity and kworkers ====
Kworker threads and the workqueue tasks which they perform are a special case.  While it is possible to rely on //taskset// and //sched_setaffinity()// to manage kworkers, doing so is of little utility, since the threads are often short-lived and, at any rate, often perform a wide variety of work.  The paradigm with workqueues is instead to associate an affinity setting with the task itself.  "Unbound" is the name for workqueues which are not per-CPU.  These workqueues consume a lot of CPU time on many systems and tend to present the greatest management challenge for latency control.  Those unbound workqueues which appear in ///sys/devices/virtual/workqueue// are configurable from userspace.  The parameters //affinity_scope//, //affinity_strict// and //cpumask// together determine on which cores the kworker which executes the work function will run.  Many unbound workqueues are not configurable via sysfs: making their properties visible there requires an additional //WQ_SYSFS// flag in the kernel source.
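For example, a sysfs-visible workqueue's attributes may be read and written from the shell.  This is a minimal sketch: //writeback// is merely one workqueue which is commonly visible, and the //affinity_scope// and //affinity_strict// attributes are present only on kernels which support them (6.5 and later).

<code bash>
# List the unbound workqueues which are visible in sysfs
# (only those created with the WQ_SYSFS flag appear here):
ls /sys/devices/virtual/workqueue/

# Confine the writeback workqueue's kworkers to cores 0 and 1
# (the value is a hexadecimal CPU bitmask, here 0x3):
echo 3 > /sys/devices/virtual/workqueue/writeback/cpumask

# On 6.5+ kernels, inspect the affinity scope and make it strict,
# so that the kworkers never run outside the configured cores:
cat /sys/devices/virtual/workqueue/writeback/affinity_scope
echo 1 > /sys/devices/virtual/workqueue/writeback/affinity_strict
</code>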
  
Since kernel 6.5, the //tools/workqueue/wq_monitor.py// Python script is available in-tree, and since 6.6, //wq_dump.py// has joined it.  These Python scripts require the //drgn// debugger, which is packaged by major Linux distributions.  Another recent addition of particular potential interest for the realtime project is [[https://github.com/iovisor/bcc/blob/master/tools/wqlat.py|wqlat.py]], which is part of the //bcc/tools// suite.  Both sets of tools may require special kernel configuration settings.
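As a usage sketch, run from the top of a kernel source tree (the exact options vary across kernel versions, so consult each script's //-h// output):

<code bash>
# Dump workqueue configuration, including affinity scopes (6.6+);
# needs the drgn debugger and root privilege:
sudo tools/workqueue/wq_dump.py

# Report statistics for a workqueue, here the generic "events"
# workqueue, such as total and CPU-intensive executions (6.5+):
sudo tools/workqueue/wq_monitor.py events
</code>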
  
==== IRQ affinity ====
==== Housekeeping cores ====
  
A common paradigm with realtime systems is to pin latency-insensitive kernel and userspace tasks on a designated "housekeeping" core.  For example, //taskset// can pin kernel threads like //kswapd// and //kauditd//.  Applications whose network traffic latency is not critical may wish to pin network IRQs there as well.  Userspace threads which are sometimes CPU-intensive, like systemd and rsyslog, may also be pinned on the housekeeping core.  Pinning userspace threads will not have the desired effect if much of their work is performed by [[https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/cpu-partitioning/start#cpu_affinity_and_kworkers|unbound workqueues]], which may migrate to any core.
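A minimal sketch, assuming core 0 is the housekeeping core; the PIDs are found at run time, and the IRQ number (42 here) is hypothetical:

<code bash>
# Pin the kswapd0 kernel thread to housekeeping core 0 (mask 0x1):
taskset -p 0x1 "$(pgrep -x kswapd0)"

# Pin an occasionally CPU-intensive daemon such as rsyslogd there too:
taskset -p 0x1 "$(pgrep -x rsyslogd)"

# Steer a network device's interrupt to core 0; smp_affinity
# takes a hexadecimal CPU bitmask:
echo 1 > /proc/irq/42/smp_affinity
</code>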
  
==== Softirqs and kthreads ====
  
Softirqs are a deferred-interrupt mechanism which is often challenging to manage on realtime systems.  Softirqs may run in atomic context immediately following the hard IRQ which "raises" them, or they may be executed in process context by per-CPU kernel threads called //ksoftirqd/n//, where //n// is the core number.  There are 10 kinds of softirqs, which perform diverse tasks for the networking, block, scheduling, timer and [[https://wiki.linuxfoundation.org/realtime/documentation/technical_details/rcu|RCU]] subsystems, as well as executing callbacks for a large number of device drivers via the tasklet mechanism.  Only one softirq of any kind may be active at any given time on a given core.  Thus, if ksoftirqd is preempted by a hard IRQ, the softirq which that IRQ raises cannot immediately follow it, but must instead wait for ksoftirqd to finish.  This unfortunate situation has been called "the new Big Kernel Lock" by realtime Linux maintainers.
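The per-core counts of each softirq type are visible in ///proc/softirqs//, which is a quick way to see which softirqs dominate on which cores:

<code bash>
# Show per-CPU counts for the 10 softirq types (HI, TIMER, NET_TX,
# NET_RX, BLOCK, IRQ_POLL, TASKLET, SCHED, HRTIMER, RCU):
cat /proc/softirqs

# Watch the NET_RX row to spot cores loaded with network receive work:
watch -n 1 'grep NET_RX /proc/softirqs'
</code>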
  
Kernel configuration allows system managers to move the NET_RX and RCU callbacks out of softirqs and into their own kthreads.  Since kernel 5.12, moving NET_RX processing into its own kthread is possible by echo-ing '1' into the //threaded// sysfs attribute associated with a network device.  The process table will afterwards include a new kthread called //napi/xxx//, where xxx is the interface name.  (Read more about the [[https://wiki.linuxfoundation.org/networking/napi?s[]=napi|NAPI]] mechanism in the networking wiki.)  Userspace may employ //taskset// to pin this kthread on any core.  Moving the softirq into its own kthread incurs a context-switch penalty, but even so may be worthwhile on systems where bursts of network traffic unacceptably delay applications.  [[https://wiki.linuxfoundation.org/realtime/documentation/technical_details/rcu?s[]=rcu#rcu_callback_offloading|RCU Callback Offloading]] produces a new set of kthreads, and can be accomplished via a combination of compile-time configuration and boot-time command-line parameters.
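A sketch of both mechanisms, assuming a hypothetical interface named //eth0// and a kernel built with //CONFIG_RCU_NOCB_CPU//:

<code bash>
# Move eth0's NAPI processing from the NET_RX softirq into its
# own kthread (kernel 5.12+):
echo 1 > /sys/class/net/eth0/threaded

# The new kthread's name includes the interface name; pin it to core 0:
taskset -p 0x1 "$(pgrep -f napi/eth0 | head -n 1)"

# RCU callback offloading is requested at boot time instead, e.g.
# with rcu_nocbs=1-3 on the kernel command line; the offloaded
# callbacks are then invoked by rcuo kthreads, which may themselves
# be pinned with taskset.
</code>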
  