Introduction to Real-Time Operating Systems Part 2
Simon Southwell
Semi-retired logic, software and systems designer. Technical writer, mentor, educator and presenter.
Introduction
In the previous article, an overview of real-time operating systems was given: how they fit on top of a processor-based system, and the sort of common features they have. The article then looked at one of these features, namely ‘tasks’, in more detail, discussing what they are, how they are scheduled, and how they are prioritised so that a scheduler can make decisions about when to swap from running one task to running another. By default, this is done using ‘time-slicing’ via a real-time clock, but we also saw other reasons that can make a task ‘block’, such as calling API functions to delay for some number of ticks, or waiting on data becoming available, at which point the scheduler will run another active, unblocked task. This was all discussed using the examples of a RISC-V processor-based system and the FreeRTOS real-time operating system, and this article continues in the same vein.
Now that we are able to run separate tasks, each of which can control one (or more) of the functions in our SoC, we have a powerful solution for an embedded system. If each of the functions truly ran independently of the others, then this would be the only functionality we’d need. Each task would read and write the memory mapped registers of the logic it was controlling, via driver software, using simple pointer access:
*ptrCtrlReg = newCtrlRegVal; // Write to control register
newStatus   = *pStatusReg;   // Read status register
In a real-world system, however, it is unlikely to be the case that all the tasks act alone. Therefore, they need to communicate with each other in various ways, from quite low level, to exchanging quite sophisticated data structures. At an even lower level, tasks may need to synchronise with each other without exchanging any further data. In this article I want to look at these mechanisms, and what FreeRTOS features are provided by way of examples.
Inter-Task Communication
In a multi-tasking system, in order to create a system as a whole, the running tasks may fall into different categories, and have priorities to match. For example, a task may be a “data collecting” task with high priority, to quickly fetch data from a peripheral, say, when it becomes available, and send this to another, lower priority, “data processing” task to use that data in some form, perhaps delayed until a certain amount has been gathered. Therefore, the tasks need to be able to exchange data. In the FreeRTOS kernel, the features made available fall into the following categories:

- Queues
- Stream buffers
- Message buffers
When and where the different types are used will be discussed in the rest of this section.
Queues
Queues are the basic form of inter-task data communication and are exactly what you’d expect them to be. They are first-in, first-out (FIFO) objects, with new data being sent to the back of the queue by a sender, and a receiver receiving the data in the order in which it was sent. In FreeRTOS, though, data can also be sent to the front of the queue—useful if, say, some high priority data needs to bypass lower priority data sitting in the queue. The sending task itself would have to keep track of what is already in the queue to make this assessment and choose to put an item at the front.
The queue size is finite, with a fixed number of items, each of a fixed size. This is a defining characteristic of a queue: all items in the queue are the same size, defined when the queue is created. A sending task will therefore block if there is no space in the queue, and thus be descheduled until space becomes available. Similarly, a task trying to receive an item from an empty queue will be blocked until one becomes available.
FreeRTOS provides a rich set of API calls for queues, with some of the highlights listed below:

- xQueueCreate / xQueueCreateStatic
- xQueueSendToBack / xQueueSendToFront (plus FromISR variants)
- xQueueReceive (plus FromISR variant)
- uxQueueMessagesWaiting / uxQueueSpacesAvailable
Obviously, a queue needs to be created before we can use it, and this is done with xQueueCreate. The arguments for this define the number of items the queue can hold, plus the size of each item (in bytes). It returns a handle to the queue for use in identifying it when sending or receiving data etc.
Note that there is an alternative version, xQueueCreateStatic. The former (xQueueCreate) requires that dynamic allocation is configured for FreeRTOS (we will look at this when discussing configuration in the next article). If it is not, then xQueueCreateStatic must be used, with pointers provided to a couple of buffers of appropriate size—one to store all the items, and one to store the queue’s data structure. For the most part, dynamic allocation is enabled, and so buffering is handled by the OS. Other objects have similar variants in their creation API functions, but I will assume that dynamic allocation is enabled, as it is the simplest to use.
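As an illustration, a minimal sketch of static queue creation might look as follows, where the names and sizes are my own illustrative choices, and configSUPPORT_STATIC_ALLOCATION must be set in the FreeRTOS configuration:

#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h"

#define QUEUE_LEN 4
#define ITEM_SIZE sizeof(uint32_t)

// One buffer for the items themselves, one for the queue's data structure
static uint8_t       ucQueueStorage[QUEUE_LEN * ITEM_SIZE];
static StaticQueue_t xQueueBuffer;
static QueueHandle_t xStaticQueue;

void vInitStaticQueue(void)
{
    // No dynamic allocation is required for this call
    xStaticQueue = xQueueCreateStatic(QUEUE_LEN, ITEM_SIZE, ucQueueStorage, &xQueueBuffer);
}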
Once created, data can be sent, as we saw, either to the back (the normal situation) or to the front of the queue. The send API functions take a handle for the queue to send to, a pointer to a buffer containing the data to be sent, and a number of ticks to wait. This last argument defines the maximum time the task should be blocked before returning. The returned value indicates either that the data was sent successfully (pdTRUE), or that the call timed out with a ‘queue full’ status (errQUEUE_FULL). If FreeRTOS is configured in a certain way, then the call can be set to block indefinitely.
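With dynamic allocation enabled, a minimal sketch of creating a queue and sending to it might look like the following, where the item structure, queue length and timeout are my own illustrative choices:

#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h"

// Hypothetical fixed-size item type for illustration
typedef struct {
    uint32_t srcId;    // identity of the sending task
    uint32_t payload;  // the data itself
} item_t;

static QueueHandle_t xQueue;

void vInitQueue(void)
{
    // A queue of 4 items, each sizeof(item_t) bytes
    xQueue = xQueueCreate(4, sizeof(item_t));
}

void vSenderTask(void *pvParameters)
{
    item_t item = { .srcId = 1, .payload = 0xcafe };

    for (;;)
    {
        // Block for up to 100 ticks if the queue is full
        if (xQueueSendToBack(xQueue, &item, 100) != pdTRUE)
        {
            // Timed out: xQueueSendToBack returned errQUEUE_FULL
        }
    }
}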
You’ll notice that there are variants of these API calls with a suffix of FromISR, so a quick note on this is in order. Interrupt service routines, as we saw in the first article, have the highest priority, trumping all the task priorities. Because of this, ISRs should not block, potentially starving progress on the tasks. Many (but not all) of the API calls have variants for use from within ISRs. Taking xQueueSendToBackFromISR as an example, it behaves similarly to the non-ISR version, but takes no timeout argument. Instead, it will return immediately if there is no space, with errQUEUE_FULL. To aid in scheduling, it does have another argument (which can be NULL) that is a pointer used to return status as to whether sending the item unblocked a higher priority task, in which case the ISR should send a request to do a context switch (co-operatively yielding) before exiting.
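As a sketch of the ISR pattern just described, reusing the illustrative xQueue and item_t from the previous example, and assuming a port that provides the portYIELD_FROM_ISR macro:

void vPeripheralISR(void)
{
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    item_t     item = { .srcId = 0, .payload = 0 };

    // No timeout argument: returns errQUEUE_FULL immediately if no space
    if (xQueueSendToBackFromISR(xQueue, &item, &xHigherPriorityTaskWoken) == pdTRUE)
    {
        // Request a context switch on exit if a higher priority task was unblocked
        portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
    }
}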
The two receive API calls behave in a similar way, with identical arguments. Received queue items are copied, rather than passed by reference, and so a buffer pointer is still needed to receive the data. The basic operations on the queues are summarised by the diagram below:
Here are two cases of sending data to the back of a queue and to the front. The example shows a case where two items already exist in a queue of length 4, which thus has two used and two empty slots. The size of the items is unspecified (so let’s call it n) and will be the same for each item. In the first (upper) case, a third item (item2) is sent to the back of the queue and is placed logically behind item1. In the second (lower) case, a third item is sent to the front of the queue, where it sits as the first item of the queue, and the old items are logically moved one place back. In both cases, a single space remains after the operation. The receiving task, if it subsequently calls the xQueueReceive API function, will get item0 in the first case, and item2 in the second case.
Another way to handle blocking, or not, is to inspect the contents of the queue before sending or attempting to receive data. The uxQueueMessagesWaiting function (and its ISR equivalent), for instance, returns the number of items in the queue, so a receiver can decide whether to call the receive API function or not. (These are, confusingly, called messages, but are different from the ‘messages’ that we will look at shortly. This is why I have used the name ‘item’.) Another API call returns the number of spaces available in the queue, so a sender can decide whether to make a call to the send API function, or not. Whether a task uses these, or simply allows itself to block, will depend on whether it can usefully do something else if it would block on a send or receive call.
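For example, a receiver that polls rather than blocks might be sketched as follows, again using the illustrative queue and item type from earlier:

void vPollingReceiverTask(void *pvParameters)
{
    item_t rxItem;

    for (;;)
    {
        // Only attempt a receive if at least one item is waiting
        if (uxQueueMessagesWaiting(xQueue) > 0)
        {
            xQueueReceive(xQueue, &rxItem, 0); // zero timeout: never blocks
        }
        else
        {
            // Free to do some other useful work instead of blocking
        }
    }
}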
As stated earlier, queues have fixed-size items. These could be just fixed ‘blocks’ of raw data, but can also be complex structures, so long as those structures are of fixed size—fundamentally the same thing, with a structure just an interpretation overlaid on the data. A queue, then, is used in just these circumstances: fixed-size sets of data sent from one task to another. One could, of course, send smaller amounts of data, perhaps with the first bytes indicating the size of the payload, padding the rest of the item with some null value, such as 0. But this would be inefficient, and a better type of communication object would be more suitable, as we shall see.
There are other queue API functions not detailed here, with ‘peek’ facilities to fetch an item from the queue without deleting it, or the ability to reset the queue, deleting any items that were in it, as well as some other functionality. (See the FreeRTOS API documentation for queues.)
Queues can safely be written to from multiple tasks, allowing data transfer from multiple sources. The OS makes this operation safe when writing from two tasks, and will block one task if another task’s send operation is already in progress. If the task sources of the write operations are truly asynchronous, the order in which items are written may not be deterministic, being determined by external factors such as the relative task priorities, the timing of any external events that instigate sending data to the queue, and so on. If the receiving task needs to know the origin of the data it is receiving, then the item data structures might contain a field with a task ID, or handle, that can be used to differentiate the data—but this is implementation specific. The diagram below summarises sending data from multiple sources:
One last topic on queues I want to briefly discuss is queue sets. These allow blocking on reading from multiple queues. A queue set can be created, and multiple queues added to the set (they must be empty at this point). A receiving task can then block on the set—that is, it will wait until at least one item is available in any of the queues associated with the set, and a call to select from the set returns the handle of a member queue that has data. Why is this useful? Well, it is useful, for example, when control information is separated from the data itself, perhaps being received over different channels in the hardware, where the data may even arrive before its corresponding control information. It is then useful to have a separate queue for each type. The receiver may need both in order to start processing the information, so these two queues can be added to a queue set, and the task can block on the set, servicing each queue as items become available, until it has the pair it needs. The queue set API calls are documented in detail in the FreeRTOS API documentation.
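A minimal sketch of the control-plus-data scenario just described might look like this, where the queue names and lengths are illustrative, and configUSE_QUEUE_SETS must be set in the configuration:

#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h"

static QueueHandle_t    xCtrlQueue;
static QueueHandle_t    xDataQueue;
static QueueSetHandle_t xQueueSet;

void vInitQueueSet(void)
{
    xCtrlQueue = xQueueCreate(4, sizeof(uint32_t));
    xDataQueue = xQueueCreate(4, sizeof(uint32_t));

    // The set must be able to hold events for all items of all member queues
    xQueueSet = xQueueCreateSet(4 + 4);

    // Member queues must be empty when added to the set
    xQueueAddToSet(xCtrlQueue, xQueueSet);
    xQueueAddToSet(xDataQueue, xQueueSet);
}

void vSetReceiverTask(void *pvParameters)
{
    uint32_t value;

    for (;;)
    {
        // Block until one of the member queues contains an item
        QueueSetMemberHandle_t xActive = xQueueSelectFromSet(xQueueSet, portMAX_DELAY);

        if (xActive == xCtrlQueue)
        {
            xQueueReceive(xCtrlQueue, &value, 0); // control info available
        }
        else if (xActive == xDataQueue)
        {
            xQueueReceive(xDataQueue, &value, 0); // data available
        }
    }
}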
Messages and Stream Buffers
An alternative means of sending information between tasks is the use of message and stream buffers. These are very closely related, so I want to discuss them together.
The most obvious difference of messages and streams over queues is that they are not fixed in size. The difference between messages and streams themselves is that messages are passed as units of data, much like a queue item, but each message can be a different size, whereas streams are a continuous sequence of bytes that can be sent and received. In fact, according to the FreeRTOS documentation, messages are built on top of streams. Another difference is that the OS assumes there is a single source (task or ISR) of data being sent to a single destination—in other words, a point-to-point connection. Multiple sources can be set up, but the user code is then required to make sure only one task/ISR is involved at any one time. API highlights are listed below, where TYPE is either Stream or Message:

- xTYPEBufferCreate / xTYPEBufferCreateStatic
- xTYPEBufferSend (plus FromISR variant)
- xTYPEBufferReceive (plus FromISR variant)
- xTYPEBufferReset
As for queues, stream and message buffers must be created; the creation functions have slightly different arguments, but both return a handle for the buffer object. For both streams and messages, the first argument is a buffer size in bytes—the total number of bytes the buffer can hold. For messages, this includes a length value added to each message, which might be 4 bytes on a 32-bit machine. So if, for example, two messages are sent, one with 16 bytes and another with 24 bytes, and assuming they have not yet been read, they will occupy 20 + 28 = 48 bytes in the message buffer. An additional parameter is needed for stream buffer creation to specify the trigger level in bytes. This dictates the number of bytes that must be in a stream buffer before a blocked receiving task is unblocked (unless it times out first).
There are variants of the buffer creation functions with the suffix WithCallback. These allow callback functions to be specified for when a send has completed (when the trigger level is reached for streams, or when the message has been fully written to the buffer) and when data is received (either bytes read for streams, or a message read for messages). This is useful when communicating between tasks on different processor cores in a multicore environment. For those who may be unfamiliar with the concept of callback functions, I will discuss these when we look at multi-core systems (in a later article), where they become useful.
Sending is similar to that for queues, with arguments of a handle, a buffer pointer, a size in bytes (a new argument compared to queues) and a number of ticks to wait before timing out. Receive calls have the same arguments as send calls. The number of bytes written or read is returned by the API functions. Again, as for queues, there are ISR versions.
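A minimal sketch of a message buffer carrying variable sized messages between two tasks might look like the following, with illustrative names, sizes and timeouts:

#include "FreeRTOS.h"
#include "message_buffer.h"

// 128 bytes in total, which must include each message's length field
static MessageBufferHandle_t xMsgBuf;

void vInitMessageBuffer(void)
{
    xMsgBuf = xMessageBufferCreate(128);
}

void vProducerTask(void *pvParameters)
{
    const char msg[] = "a variable length message";

    for (;;)
    {
        // Blocks for up to 100 ticks if there is insufficient space
        xMessageBufferSend(xMsgBuf, msg, sizeof(msg), 100);
    }
}

void vConsumerTask(void *pvParameters)
{
    char rxBuf[64];

    for (;;)
    {
        // Returns the length of the received message, or 0 on timeout
        size_t len = xMessageBufferReceive(xMsgBuf, rxBuf, sizeof(rxBuf), 100);
        (void)len; // process the message here
    }
}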
The buffers can be inspected for the number of bytes they contain and the space remaining, as well as whether they are full or empty. A buffer can also be reset back to its initial empty state, or deleted completely.
Inter-task Synchronisation
In the last section, we looked at how to send data between tasks. Another basic feature required is the ability to synchronise tasks with one another. We do this with semaphores and mutexes. The two are closely related, and we will look at them together in this section.
Semaphores
It might be helpful to have a look at what a semaphore actually is. The original meaning, of course, was the use of flags to enable non-verbal communication over a distance, particularly between ships, spelling out letters with two flags held in different positions relative to the sender’s body. (This is not strictly relevant, but I am originally from Portsmouth, the home of the Royal Navy.) In the context of an RTOS, a semaphore is a means of blocking and unblocking tasks based on the value of the semaphore. In that sense, semaphores are like flags indicating whether a task can advance or not. They come in two flavours:

- Binary semaphores
- Counting semaphores
The binary semaphore is, as the name indicates, an on/off, true/false, 1/0 kind of arrangement, and is typically used to synchronise tasks, allowing one task to be halted until another task has reached some point, such as having completed constructing some data, or some event having happened.

A counting semaphore builds on this with a count value, which can be incremented and decremented, and (usually) will block a task when it is zero. Counting semaphores are used to synchronise access to a shared resource that has multiple, but finite, capacity. For example, suppose we have a DMA block with two channels that are shared between multiple tasks (more than 2). It is fine if one channel is in use and a task requires some DMA operation, as it can use the free channel, but if a third task requires DMA services it must block until a channel becomes free. If a counting semaphore is created with an initial value of 2 and is decremented whenever a task requests use of a DMA channel, then the first two tasks will continue, decrementing the semaphore to 0 and making the third task block when it makes a request. If the semaphore is incremented when a DMA operation completes, then the third task will be unblocked when the semaphore is incremented to 1, though it will immediately decrement it back to 0 as it takes the channel, blocking any other task that comes along to use the DMA engine, and so on. In many respects, a binary semaphore works like a counting semaphore with an initial value of 1.
A key feature of all semaphores is that changing their state (whether 1/0 or incrementing/decrementing) is ‘atomic’—that is, it happens as a single uninterruptable operation. This is critical for semaphores to work, and hardware often provides facilities to help, as it is so important. For example, RISC-V provides the “A” atomic extension to allow atomic read-modify-writes between harts (hardware threads) sharing the same memory, and the AXI bus provides optional “exclusive” bus transaction signalling to aid in the implementation of semaphores. These kinds of features may not be necessary to implement semaphores in a given system, depending on its architecture, so long as the semaphore state can be updated atomically by some means. This is a lot easier in single-core solutions.
FreeRTOS provides facilities to create and use semaphores. Before we look at these, just a quick word on nomenclature. In most of my dealings with semaphores the terms ‘post’ and ‘wait’ have been used for the operation of incrementing a semaphore, and for decrementing a semaphore or blocking if it’s at 0. In FreeRTOS, the terms ‘give’ and ‘take’ are used but mean exactly the same thing.
Some API highlights are listed below for the FreeRTOS semaphore functionality:

- xSemaphoreCreateBinary / xSemaphoreCreateBinaryStatic
- xSemaphoreCreateCounting / xSemaphoreCreateCountingStatic
- vSemaphoreDelete
- xSemaphoreTake (plus FromISR variant)
- xSemaphoreGive (plus FromISR variant)
- uxSemaphoreGetCount
Creating a binary semaphore is a simple matter of calling the API function xSemaphoreCreateBinary, which takes no arguments and returns the handle. For counting semaphores, a maximum count and an initial count are passed in as arguments. Both functions have static equivalents, which require an additional buffer pointer argument. A semaphore can be deleted with vSemaphoreDelete, passing in the handle of the surplus-to-requirements semaphore.
The ‘take’ functions have arguments for the semaphore handle and a tick timeout count, and return pdTRUE if the semaphore was obtained, or pdFALSE if the call timed out. Like the other API functions, the ISR versions replace the timeout value with a pointer used to return whether a higher priority task was unblocked by obtaining the semaphore. The ‘give’ equivalent API functions simply take the semaphore handle as an argument, with the ISR equivalent having the same additional argument as the take ISR function.
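Returning to the earlier DMA channel scenario, a minimal sketch with a counting semaphore might look like this, where the DMA transfer function is a hypothetical placeholder:

#include "FreeRTOS.h"
#include "semphr.h"

extern void doDmaTransfer(void); // hypothetical driver operation

static SemaphoreHandle_t xDmaChanSem;

void vInitDmaSem(void)
{
    // Maximum count of 2 and initial count of 2: both channels start free
    xDmaChanSem = xSemaphoreCreateCounting(2, 2);
}

void vTaskUsingDma(void *pvParameters)
{
    for (;;)
    {
        // 'Take': block until one of the two channels is free
        if (xSemaphoreTake(xDmaChanSem, portMAX_DELAY) == pdTRUE)
        {
            doDmaTransfer();

            // 'Give': release the channel, unblocking any waiting task
            xSemaphoreGive(xDmaChanSem);
        }
    }
}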
Finally, uxSemaphoreGetCount returns the current count of a semaphore. In the case of binary semaphores this is 1 if the semaphore is available, or 0 if not.
Mutexes
A mutex (from ‘mutually exclusive’) is a special kind of semaphore used to mark a ‘critical’ section of code. By this I mean a piece of code that is shared by more than one task, but that must not be interrupted once entered. This might be because it takes multiple instructions to perform an operation that would fail if a higher priority task preempted the task running the critical section. For example, suppose a peripheral can be set up to send some data, but this takes multiple steps to configure, send and acknowledge, and so must always start from some idle state, returning to idle when the operation is completed. If a task were part way through this operation when preempted by a higher priority task, the new task might configure the peripheral incorrectly, since its state had already been advanced by the preempted task. Even if it reset the device to idle and completed its own operation, when the preempted task resumed, the state of the device would be wrong for the point at which it was suspended, and the operation would almost certainly fail.
So why don’t we just use a semaphore? Well, there is a problem to solve called ‘priority inversion’. In the scenario above, when the higher priority task reaches the critical section and tries to take the semaphore already taken by the lower priority task, it will block until the lower priority task gives it back—this is the behaviour that we want. The lower priority task will effectively block a higher priority task, but only in the shared critical section. Thus, critical sections need to be kept as short as possible. However, if a third task, with some middling priority between the lower and higher priority tasks, becomes active whilst the lower priority task is in its critical section, it will preempt the lower priority task. If this middling task never goes through the critical section, it will never block on it, and could run for a long time, blocking the higher priority task for an artificially long time—this is the behaviour that we don’t want.
The solution is to use mutexes for critical sections, which add ‘priority inheritance’ over ordinary semaphores. When a higher priority task attempts to take a mutex already held by a lower priority task, the holding task’s priority is temporarily raised to that of the waiting task, returning to normal when the mutex is given back. This ensures that critical sections are exited with all haste, and avoids other tasks, not associated with the critical section, starving higher priority tasks blocked on the mutex. It is not a bulletproof solution, and if large amounts of code are encompassed by mutexes, higher priority tasks can still be starved beyond the design intention. Careful coding is required.
Highlights of the API calls relating to FreeRTOS mutexes are listed below:

- xSemaphoreCreateMutex / xSemaphoreCreateMutexStatic
- xSemaphoreCreateRecursiveMutex / xSemaphoreCreateRecursiveMutexStatic
- xSemaphoreTakeRecursive / xSemaphoreGiveRecursive
- xSemaphoreGetMutexHolder
The first two listed API functions take exactly the same arguments as their semaphore counterparts. In addition, there is a pair of mutex creation functions with Recursive in their name. These allow a task that has already successfully taken the mutex to retake it any number of times. Only when it has given the mutex back the same number of times it was taken is it freed for other tasks. I have been racking my brains to think of a use case for recursive mutexes, but couldn’t come up with one, and I have never used them myself (let me know if you have any use case examples). The recursive mutexes have their own give and take functions (and these API functions mustn’t be mixed with the wrong type of mutex), but work the same way as the non-recursive functions. For non-recursive mutexes, the semaphore give and take API functions, mentioned before, are shared with mutexes. Lastly, the xSemaphoreGetMutexHolder function returns the handle of the task that currently holds a mutex (with the mutex handle passed in as the argument).
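Finally, a minimal sketch of guarding a critical section with a mutex, where the multi-step peripheral operation is a hypothetical placeholder:

#include "FreeRTOS.h"
#include "semphr.h"

extern void configureAndSendPeripheral(void); // hypothetical multi-step operation

static SemaphoreHandle_t xPeriphMutex;

void vInitPeriphMutex(void)
{
    xPeriphMutex = xSemaphoreCreateMutex(); // created in the 'given' (free) state
}

void vTaskSharingPeripheral(void *pvParameters)
{
    for (;;)
    {
        // Take the mutex before entering the critical section
        if (xSemaphoreTake(xPeriphMutex, portMAX_DELAY) == pdTRUE)
        {
            configureAndSendPeripheral(); // keep this section as short as possible

            xSemaphoreGive(xPeriphMutex); // exit the critical section
        }
    }
}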
Conclusions
The first article introduced the concept of an RTOS and then focused on the creation and manipulation of free-running, independent tasks. In this article, the concepts of inter-task communication and synchronisation were introduced.
We looked at queues for passing fixed-size items between tasks, and at stream and message buffers for sending byte streams and variable-sized messages. We then looked at binary semaphores to synchronise tasks, and counting semaphores to protect shared resources with multiple, but finite, capacity. Mutexes were described as special forms of semaphore for encapsulating ‘critical’ sections of code that mustn’t be interrupted. These are needed over ordinary semaphores to solve the issue of ‘priority inversion’ by using ‘priority inheritance’.
So, we leave this article with the ability of our tasks to exchange information and to synchronise with each other under various conditions. In the next article I want to finish up the main features of an RTOS, looking at the remaining functionality of the FreeRTOS kernel, then look at running on multiple processor cores, and finish up with a discussion of how FreeRTOS is configured, with reference to the porting of the OS to the RISC-V instruction set simulator.
For those that can’t wait until the next article to get the ISS running FreeRTOS, you can check out the riscV repository from GitHub; the freertos directory has a makefile to build a very simple demonstration program. A README.md file gives more details on the required prerequisites etc. You will, of course, also have to build the RISC-V ISS, in the iss directory, which can be done with Visual Studio on Windows, or using the provided makefile.lnx on Linux.