One way to identify where SMIs happen is to look for inexplicably long gaps in a kernel function trace, as these usually indicate that an abnormally long amount of time was spent in system management mode. Cyclictest's break trace and function tracing options can be used to produce a trace which can then be analyzed to distinguish the latencies that could be caused by SMIs from the other latencies. This trace analyzing technique is interesting because it can be used to identify firmware or hardware related latencies on any architecture that supports tracing, not just SMIs on x86 processors.
This page explains how to correctly produce a function trace using Cyclictest and how to identify SMI related latencies in the trace. Additional details that clarify some of the topics mentioned more briefly on this page can be found in the general documentation about debugging using Cyclictest here.
Analyzing a function trace that contains a problematic latency as well as the kernel code that is executing when it happens is an important step in latency debugging. Latencies caused by the kernel are much more common than latencies caused by firmware or hardware, so the possibility of a kernel related latency should be eliminated before any further investigation. On the other hand, if a latency is indeed SMI related then knowing the function calls that are made around the SMI can help determine what is triggering it.
Before tracing the kernel to identify latencies, it is important to make sure that the CPUs never go idle. When a CPU enters an idle state there can be long periods of time where the processor effectively does nothing. These idle periods show up in function traces as additional latencies (unusually large gaps between consecutive tracepoints). Latencies caused by idle time do not add any information that helps with the debugging process because they come from a known source, and filtering them out during the trace analysis makes that step take longer than necessary. It is easier to avoid having them in the trace in the first place by preventing the CPUs from going idle. Some simple ways of keeping the CPUs busy are described here.
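One simple approach, shown here only as an illustration, is to pin a busy loop to each CPU so that no core ever enters an idle state; because the loops run at the default non-real-time priority, they are preempted by the Cyclictest measurement threads. The CPU list must be adapted to the system under test, and booting with the idle=poll kernel parameter is a common alternative.

for cpu in 0 1 2 3; do
    taskset -c $cpu sh -c 'while :; do :; done' &
done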
Once the CPUs are guaranteed never to idle, a trace can be obtained using the break trace and function tracing options. The break trace option (--breaktrace) is intended to be used with one of the Cyclictest options that enables tracing with a specific Ftrace tracer. It stops tracing (breaks the trace) and ends Cyclictest when it detects a latency longer than the limit specified with the option. For this SMI detection technique, the function tracing option (--ftrace) is used so that the trace contains the details of what the kernel is doing when the latency occurs.
Ftrace might not behave the same way after Cyclictest's tracing options have been used: Cyclictest changes some Ftrace settings to get the desired trace format, and the settings that were in place before the run are not restored when Cyclictest exits.
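If the previous Ftrace configuration matters, it should be noted before the run and restored manually afterwards. As a minimal sketch, assuming debugfs is mounted at the usual /sys/kernel/debug location, the tracer can be reset to its default state with:

# echo 0 > /sys/kernel/debug/tracing/tracing_on
# echo nop > /sys/kernel/debug/tracing/current_tracer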
Below is an example of a Cyclictest command that includes the break trace and function tracing options. The '#' before the command is to indicate that the command must be run as root, with sudo, or as a member of the realtime group.
# cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 --breaktrace=150 --ftrace
The combination of options in this example is not appropriate for every test situation.
More details about how to choose the right options for measuring a specific latency on a given system can be found in the Cyclictest documentation here.
Remember to adjust for the tracing overhead when specifying the break trace limit.
Function tracing slows down the execution of the program so the latency will most likely be larger after enabling tracing. It is important to adjust for this additional overhead because otherwise the tracing may not break on the correct latency and the trace may not contain the latency that was supposed to be observed. More details about how to adjust the break trace limit can be found in the documentation about debugging using Cyclictest here.
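One way to estimate the overhead, using placeholder numbers rather than recommendations, is to run the same Cyclictest command first without any tracing options and note the maximum latency, then run it again with --ftrace but without --breaktrace and compare. If, for example, the maximum latency grows from roughly 50 us to roughly 120 us under tracing, a break trace limit of 150 us would be a reasonable starting point:

# cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 --duration=10m
# cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 --duration=10m --ftrace
# cyclictest --mlockall --smp --priority=80 --interval=200 --distance=0 --breaktrace=150 --ftrace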
Once a trace that includes the correct latency is obtained, it needs to be analyzed critically in order to identify the latencies that could be caused by SMIs. First, the latencies in the trace must be located. Because the trace contains every function call made by the kernel, the timestamps for consecutive tracepoints should only be a few microseconds apart, so any gap between function calls that is longer than that could in principle indicate a problem. That being said, it is generally best to start by finding and analyzing the longest latencies in the trace.
The longer latencies in the trace should have certain special characters next to their timestamps. For example, a latency of more than 50 us is indicated with a '+' and a latency of more than 100 us is indicated with a '!'. In the following trace excerpt there is an exclamation mark next to the timestamp of the second function call because the second and the third native_io_delay calls are more than 100 us apart.
bash-960 0d...111 38067191us : native_io_delay <-pcspkr_event
bash-960 0d...111 38067200us!: native_io_delay <-pcspkr_event
bash-960 0d...111 38067346us : native_io_delay <-pcspkr_event
The second call was made at 38067200 us and the third call was at 38067346 us, so there was an unusually long 146 us delay between the two calls. The Ftrace documentation can be consulted to see the characters associated with even longer latencies.
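Assuming the trace has been copied from the usual /sys/kernel/debug/tracing/trace location after Cyclictest stops, the marked lines can be found quickly with grep. For example, the following lists every gap longer than 100 us together with its line number in the saved trace:

# cp /sys/kernel/debug/tracing/trace trace.txt
# grep -n 'us!' trace.txt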
The next step in the trace analysis is to separate the unsuspicious latencies from the latencies that could be caused by SMIs. To do this, eliminate all the latencies that have an identifiable cause. This can be done, for example, by analyzing the situation in which they occur or by looking at the source code of the function that was executing when the latency happened. Some possible explanations for large latencies in the trace are that the CPU is idle, that uninstrumented userspace code is executing, or that the kernel function being called disables interrupts and then does something that takes a long time. If the latency cannot be explained by the situation or by the code and the OS appears to lose the CPU for no reason, then the latency is probably caused by an SMI.
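On many Intel x86 processors there is also a cross-check that is independent of the trace: the MSR_SMI_COUNT register (address 0x34) is incremented every time the processor handles an SMI. Assuming the msr kernel module and the rdmsr utility from msr-tools are available, reading the register before and after the latency occurs shows whether any SMIs were handled in that interval; note that this register is not implemented on all hardware. The following prints the SMI count for CPU 0, in hexadecimal by default:

# modprobe msr
# rdmsr -p 0 0x34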