This shows you the differences between two versions of the page.
realtime:documentation:howto:applications:application_base [2017/02/08 13:40] anna-maria created; moved into new namespace |
realtime:documentation:howto:applications:application_base [2024/05/31 15:54] (current) alison [Scheduling and priority] |
||
---|---|---|---|
Line 14: | Line 14: | ||
==== Scheduling and priority ==== | ==== Scheduling and priority ==== | ||
- | The [[realtime:documentation:technical_basics:sched_policy_prio|scheduling policy]] as well as the priority | + | The [[realtime:documentation:technical_basics:sched_policy_prio:start|scheduling policy]] as well as the priority |
must be set by the application explicitly. There are two possibilities | must be set by the application explicitly. There are two possibilities | ||
for this: | for this: | ||
Line 30: | Line 30: | ||
the pthread attributes and not to use the inherited scheduling of the | the pthread attributes and not to use the inherited scheduling of the | ||
thread which created the real-time thread. | thread which created the real-time thread. | ||
+ | </wrap> | ||
+ | - **Problems with pthread condition variables** \\ <wrap> | ||
+ | Multithreaded applications which rely on glibc's libpthread are prone to unexpected latencies, since its condition variable implementation does not honor priority inheritance ([[https://sourceware.org/bugzilla/show_bug.cgi?id=11588|bugzilla]]). Unfortunately, glibc's DNS resolver and asynchronous I/O implementations depend in turn on these condition variables. | ||
+ | [[https://github.com/dvhart/librtpi|librtpi]] is an alternative LGPL-licensed pthread implementation which supports priority inheritance, and whose API is as close to glibc's as possible. The alternative [[https://www.musl-libc.org/|MUSL libc]] has a pthread condition variable implementation similar to glibc's. | ||
</wrap> | </wrap> | ||
==== Memory locking ==== | ==== Memory locking ==== | ||
- | In real-time applications it is important to avoid non-deterministic | + | See [[realtime:documentation:howto:applications:memory#Memory Locking | here]] |
- | behavior. If the memory that is needed by the real-time application | + | |
- | is not locked in the RAM, this memory could be paged out. If the | + | |
- | memory is not paged in when the application tries to access the | + | |
- | memory, a page fault occurs causing non-deterministic high latency. | + | |
- | For this reason memory should be locked in real-time applications. | + | |
- | The memory lock persists until the process owning it terminates or | + | |
- | explicitly unlocks it by calling ''munlock()'' or ''munlockall()''. | + | |
- | Be aware that page faults due to paged out memory occur in systems | + | |
- | with swap as well as in systems without swap. In addition, the binary | + | |
- | of the executed application itself could be paged out. | + | |
- | The following call of ''mlockall()'' locks all current pages mapped | + | ==== Stack for RT thread ==== |
- | into the address space of the process as well as all pages that will | + | |
- | be mapped in the future. | + | |
- | mlockall(MCL_CURRENT|MCL_FUTURE); | + | See [[realtime:documentation:howto:applications:memory#Stack Memory for RT threads | here]] |
- | ==== Stack prefaulting ==== | + | ==== Capabilities: running the app with RT priority as a non-root user ==== |
- | Since page faults cause non-deterministic behavior, the stack should | + | Several of the APIs used by RT apps, like ''mlockall()'' and the calls that apply the policy set via ''pthread_attr_setschedpolicy()'', by default require elevated privileges (typically root) to succeed. Thus, RT apps, which need to set an RT scheduling policy and priority, are often run via ''sudo''. |
- | be prefaulted before the real-time critical section starts. | + | |
- | In case several real-time threads are used, it should be done for each | + | |
- | thread individually. In the following example, a memory block of a | + | |
- | certain size is allocated. All of its pages are touched to get them | + | |
- | mapped into RAM ensuring that no page faults occur later. | + | |
- | void *buffer; | + | There is a far better approach: ''sudo'' gives the process full root privileges, which makes it an attractive target for attackers. |
+ | Instead, you should leverage the powerful POSIX **Capabilities** model. This way, the process (and its threads) gets //only// the capabilities it requires and nothing more. This follows the infosec best practice known as the //principle of least privilege//. | ||
- | buffer = mmap(NULL, MSIZE, PROT_READ | PROT_WRITE, | + | Apps start out with no capabilities by default; also note that capabilities are a per-thread attribute (essentially bitmasks within the task structure, which is per-thread). Among the various capability bits, **''CAP_SYS_NICE''** is the appropriate one to use here, as this snippet from the ''capabilities(7)'' man page shows: |
- | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); | + | |
- | memset(&buffer, 0, MSIZE); | + | |
- | The prefaulted stack can be assigned to a thread. (''&attr'' is a | + | ... |
- | ''pthread_attr_t'' pointer; the pthread attribute needs to have been | + | **CAP_SYS_NICE** |
- | previously initialized): | + | * Lower the process nice value (nice(2), setpriority(2)) and change the nice value for arbitrary processes; |
+ | * **set real-time scheduling policies for calling process, and set scheduling policies and priorities for arbitrary processes (sched_setscheduler(2), sched_setparam(2), sched_setattr(2));** | ||
+ | * set CPU affinity for arbitrary processes (sched_setaffinity(2)); | ||
+ | * set I/O scheduling class and priority for arbitrary processes (ioprio_set(2)); | ||
+ | * apply migrate_pages(2) to arbitrary processes and allow processes to be migrated to arbitrary nodes; | ||
+ | * apply move_pages(2) to arbitrary processes; | ||
+ | * use the MPOL_MF_MOVE_ALL flag with mbind(2) and move_pages(2). ... | ||
- | pthread_attr_setstack(&attr, buffer, PTHREAD_STACK_MIN); | + | |
+ | **//OK, great, but how exactly is this capability bit to be set on the app?//** | ||
+ | - One approach is to do so programmatically, via the ''capget()/capset()'' system calls. (Note that it's generally easier to use the libcap library wrappers, ''cap_[g|s]et_proc(3)'': [[https://man7.org/linux/man-pages/man3/cap_get_proc.3.html]]. This man page even provides a small example of doing so.) | ||
+ | - Another easy way is to leverage systemd and run your app as a service; in the service unit, specify the capability (see the man page on systemd.exec(5): [[https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Capabilities]]). | ||
+ | - Perhaps the easiest way: via the ''setcap(8)'' utility (its man page: [[https://man7.org/linux/man-pages/man8/setcap.8.html]]). The ''setcap'' and ''getcap'' utilities are typically part of the libcap package. For example: | ||
+ | ''sudo setcap CAP_SYS_NICE+eip <your-app-binary-executable>'' | ||
+ | |||
+ | You could put this line in the app Makefile (or equivalent). | ||
+ | (The ''getcap(8)'' utility can be used to verify that your binary now has the ''CAP_SYS_NICE'' bit set.) | ||
+ | |||
+ | And you're all set to run it as non-root now, a much more secure approach. | ||
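To check from inside the app whether the required privilege is actually present, one option is simply to attempt the switch to a real-time policy and inspect the error. The following is a minimal sketch under that assumption; the helper name ''try_rt_fifo'' is hypothetical and not part of the original page. Note that the call can also succeed without ''CAP_SYS_NICE'' if the user's ''RLIMIT_RTPRIO'' resource limit is non-zero.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper (illustration only): try to switch the calling thread
 * to SCHED_FIFO at the lowest real-time priority.
 * Returns 0 on success, otherwise the errno value of the failed call;
 * EPERM means the caller lacks the privilege to use an RT policy. */
int try_rt_fifo(void)
{
        struct sched_param param;

        memset(&param, 0, sizeof(param));
        param.sched_priority = sched_get_priority_min(SCHED_FIFO);

        /* pid 0 refers to the calling thread */
        if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
                return errno;

        return 0;
}
```

Run as an ordinary user, the helper is expected to return ''EPERM'' for a binary without the capability, and 0 once ''setcap'' has been applied to it as shown above.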
===== Example ===== | ===== Example ===== | ||
Line 80: | Line 83: | ||
* using a single pthread as RT thread | * using a single pthread as RT thread | ||
*/ | */ | ||
+ | |||
+ | #include <limits.h> | ||
+ | #include <pthread.h> | ||
+ | #include <sched.h> | ||
+ | #include <stdio.h> | ||
#include <stdlib.h> | #include <stdlib.h> | ||
- | #include <stdio.h> | ||
- | #include <time.h> | ||
- | #include <sched.h> | ||
#include <sys/mman.h> | #include <sys/mman.h> | ||
- | #include <string.h> | ||
- | #include <pthread.h> | ||
- | #include <limits.h> | ||
void *thread_func(void *data) | void *thread_func(void *data) | ||
Line 95: | Line 96: | ||
return NULL; | return NULL; | ||
} | } | ||
+ | |||
int main(int argc, char* argv[]) | int main(int argc, char* argv[]) | ||
{ | { | ||
struct sched_param param; | struct sched_param param; | ||
- | void *stack_buf; | + | pthread_attr_t attr; |
pthread_t thread; | pthread_t thread; | ||
- | pthread_attr_t attr; | ||
int ret; | int ret; | ||
+ | |||
/* Lock memory */ | /* Lock memory */ | ||
if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) { | if(mlockall(MCL_CURRENT|MCL_FUTURE) == -1) { | ||
Line 109: | Line 109: | ||
exit(-2); | exit(-2); | ||
} | } | ||
- | |||
- | /* Pre-fault stack for the thread */ | ||
- | stack_buf = mmap(NULL, PTHREAD_STACK_MIN, PROT_READ | PROT_WRITE, | ||
- | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); | ||
- | if (stack_buf == MAP_FAILED) { | ||
- | printf("mmap failed: %m\n"); | ||
- | exit(-1); | ||
- | } | ||
- | memset(stack_buf, 0, PTHREAD_STACK_MIN); | ||
/* Initialize pthread attributes (default values) */ | /* Initialize pthread attributes (default values) */ | ||
Line 126: | Line 117: | ||
} | } | ||
- | /* Set pthread stack to already pre-faulted stack */ | + | /* Set a specific stack size */ |
- | ret = pthread_attr_setstack(&attr, stack_buf, PTHREAD_STACK_MIN); | + | ret = pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN); |
if (ret) { | if (ret) { | ||
- | printf("pthread setstack failed\n"); | + | printf("pthread setstacksize failed\n"); |
- | goto out; | + | goto out; |
} | } | ||
Line 151: | Line 142: | ||
goto out; | goto out; | ||
} | } | ||
+ | |||
/* Create a pthread with specified attributes */ | /* Create a pthread with specified attributes */ | ||
ret = pthread_create(&thread, &attr, thread_func, NULL); | ret = pthread_create(&thread, &attr, thread_func, NULL); | ||
Line 158: | Line 149: | ||
goto out; | goto out; | ||
} | } | ||
+ | |||
/* Join the thread and wait until it is done */ | /* Join the thread and wait until it is done */ | ||
ret = pthread_join(thread, NULL); | ret = pthread_join(thread, NULL); | ||
if (ret) | if (ret) | ||
printf("join pthread failed: %m\n"); | printf("join pthread failed: %m\n"); | ||
+ | |||
out: | out: | ||
- | munmap(stack_buf, PTHREAD_STACK_MIN); | ||
return ret; | return ret; | ||
} | } |