When a real-time system is said to have latencies, or to have a latency that needs to be debugged, the term latency means “a situation in which a task or an operation took too long to finish”. In a real-time context, the time that a system takes to perform a particular task (i.e. the latency of the task) matters because each task must be able to finish before its specified deadline.
If an RT task does not complete within the specified time limit, it is as bad as if the task had not completed at all. This is why developers spend a great deal of time making sure that, even in the worst possible cases, a task can still complete before its deadline. This means that when evaluating the performance of a real-time system, the most important latency value to measure is the maximum latency.
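As a rough illustration (not taken from the original text), the following C sketch shows one way to measure the maximum wakeup latency of a periodic task, in the spirit of tools such as cyclictest: the task asks to be woken at an absolute time, checks how late the wakeup actually was, and keeps track of the worst case. The period, loop count, and the omission of real-time setup (e.g. a SCHED_FIFO priority and memory locking) are simplifications chosen for illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define PERIOD_NS    1000000L   /* wake up every 1 ms (arbitrary example value) */
#define LOOPS        10000      /* number of measurement cycles (arbitrary) */

static void timespec_add_ns(struct timespec *t, long ns)
{
        t->tv_nsec += ns;
        while (t->tv_nsec >= NSEC_PER_SEC) {
                t->tv_nsec -= NSEC_PER_SEC;
                t->tv_sec++;
        }
}

static int64_t timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
        return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
                + (a->tv_nsec - b->tv_nsec);
}

int main(void)
{
        struct timespec next, now;
        int64_t latency, max_latency = 0;

        clock_gettime(CLOCK_MONOTONIC, &next);

        for (int i = 0; i < LOOPS; i++) {
                /* Ask to be woken up at an absolute point in time... */
                timespec_add_ns(&next, PERIOD_NS);
                clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);

                /* ...then check how late the wakeup actually was. */
                clock_gettime(CLOCK_MONOTONIC, &now);
                latency = timespec_diff_ns(&now, &next);
                if (latency > max_latency)
                        max_latency = latency;
        }

        printf("maximum observed wakeup latency: %lld ns\n",
               (long long)max_latency);
        return 0;
}
```

The single number reported at the end is the value that matters for the discussion above: not the average delay, but the worst delay observed, since that is what must stay below the deadline.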
In computing, the term latency is generally defined as “the time taken to perform a task”. As explained above, latency often means something slightly different when discussing an RT system. Having this second definition can be confusing, because the general definition is also used when discussing real-time topics. So, it is important to remember that latency can mean slightly different things depending on how it is used.
More information: