[RTOS Support] possible bug in software timers

Quality RTOS & Embedded Software

About Contact Support FAQ Download

Menu

Quick Start

Supported MCUs

PDF Books

Trace Tools

Ecosystem

Quick Start Guide

Support Forum

⇓ Download Source ⇓

FreeRTOS+ Ecosystem

FreeRTOS+TCP:
Thread safe TCP/IP stack

SafeRTOS:
TUV certified RTOS

OpenRTOS:
Commercial Licensed RTOS

Fail Safe File System:
Ensures data integrity

FreeRTOS BSPs:
3rd party driver packages

Trace & Visualisation:
Tracealyzer for FreeRTOS

CLI:
Command line interface

WolfSSL SSL / TLS:
Networking security protocols

RTOS Training:
Delivered online or on-site

IO:
read(), write(), ioctl() interface

FreeRTOS+ Lab Projects

FreeRTOS+POSIX:
POSIX threading API

FreeRTOS+FAT:
Thread aware file system

FreeRTOS Support Archive

The FreeRTOS support forum is used to obtain active support directly from Real Time Engineers Ltd. In return for using our top quality software and services for free, we request you play fair and do your bit to help others too! Sign up to receive notifications of new support topics then help where you can.

This is a read only archive of threads posted to the FreeRTOS support forum. The archive is updated every week, so will not always contain the very latest posts. Use these archive pages to search previous posts. Use the Live FreeRTOS Forum link to reply to a post, or start a new support thread.

[FreeRTOS Home] [Live FreeRTOS Forum] [FAQ] [Archive Top] [November 2012 Threads]

possible bug in software timers

Posted by Greg Renda on November 16, 2012

I'm using version FreeRTOS 7.0.0 but the code in question seems to be the
same in 7.3.0. The problem is in prvProcessReceivedCommands().
Specifically this at the beginning of the function:


    xTimeNow = prvSampleTimeNow( &xTimerListsWereSwitched );

    while( xQueueReceive( xTimerQueue, &xMessage, tmrNO_DELAY ) != pdFAIL )
    {

The issue I'm seeing is that messages are being retrieved from xTimerQueue
with commandTimes > xTimeNow. This case is interpreted by code in
prvInsertTimerInActiveList() as indictating that the tick count has
overflowed. In which case the timer is considered expired, the callback is
called and the timer is restarted if it's an autoReload timer. That's all
well and good but, in my case, the tick count has not overflowed. What has
happened is that xTimerQueue was being filled by another task at the same
priority as prvTimerTask(). The other task was able to add a timer start
command with commandTime > xTimeNow. The timer was immediately considered
expired and, as this was an autoReload timer, re-queued, causing the while
loop to never exit since xTimeNow is never recalculated. This situation
does not occur if prvTimerTask is a higher priority than the task starting
the timers. Another solution would be to move the call to
prvSampleTimeNow() inside the while loop. The solution I went with was to
change the while into an if.

So, have I interpreted all of this correctly? If so, aside from changing
priority levels, which is the preferred fix? Thanks.

RE: possible bug in software timers

Posted by MEdwards on November 16, 2012

I'm sure Richard will commont on this, but I don't understand the scenario you are describing.

“The other task was able to add a timer start
command with commandTime > xTimeNow. The timer was immediately considered
expired and, as this was an autoReload timer, re-queued, causing the while
loop to never exit since xTimeNow is never recalculated.”

If that is the case, why is the timer being immediately considered expired?

If commandTime == xTimeNow or if commandTime == (xTimeNow +1) then you can imaging the timer task would rightly attempt to continuously process the timer because you have told it it is expiring continuously, but I don't think that is what you are reporting.

RE: possible bug in software timers

Posted by Greg Renda on November 16, 2012

In the tmrCOMMAND_START case prvInsertTimerInActiveList() is called and returns true when ( xTimeNow < xCommandTime ) && ( xNextExpiryTime >= xCommandTime ). This causes the callback to be called and the timer to be re-queued with the current xTaskGetTickCount() value which is now pretty much guaranteed to be > xTimeNow, and around you go.

The problem is that within the while loop xTimeNow is frozen at the value it got before the loop was entered. Other tasks at the same or higher priority as the timer task can add timer starts to the queue with commandTimes == xTaskGetTickCount which is moving along while xTimeNow is frozen as long as there are messages in the queue.

RE: possible bug in software timers

Posted by Richard on November 16, 2012

I know there is a note in the description of configTIMER_TASK_PRIORITY saying you have to be careful of the priority assigned to the task relative to other tasks, but need to understand exactly what you are saying before I can comment properly.

An example with sample values might help, for example, this is how I am interpreting description at the moment:

1)
You say

““messages are being retrieved from xTimerQueue with commandTimes > xTimeNow” ”

and relate this to the start timer commend. So a task calls xTimerStart() on a timer that is defined to expire at a time greater than the time at which the function is called. So with sample numbers lets say the time when the message is posted (the CommandTime) is 10, and timer period is 5, so the expiry time is 15.

2) The command is tmrCOMMAND_START, so prvProcessReceivedCommands() calls prvInsertTimerInActiveList()

3) You say

““This case is interpreted by code in prvInsertTimerInActiveList() as indictating that the tick count has overflowed””

Which is the case in prvInsertTimerInActiveList() that uses the test…

““( xTimeNow < xCommandTime ) && ( xNextExpiryTime >= xCommandTime )””

…as per your later post.

Using sample numbers again, to show the correct case - to make this test past, lets assume that xTimeNow (the time when prvProcessReceivedCommands() was entered) is 4 so an overflow has genuinely occurred [obviously the sample numbers are nonsense unless a 4-bit variable was being used].

4) You say

“in my case, the tick count has not overflowed”

Can you provide sample numbers, in the way I have above, that shows the case where the test…

“( xTimeNow < xCommandTime ) && ( xNextExpiryTime >= xCommandTime )”

…evaluates to TRUE when the tick count has not overflowed, which I think you are saying is happening to cause the anomalous behaviour you report.

These cases were statically and dynamically tested rigorously, but it is obviously possible that with certain temporal dynamics at differing task priorities, a case slipped through untested.

I would be extremely careful about turning the while() into an if() – it might be ok but I can’t be sure. I would at least have to run the code through the test cases to see if they picked up anything.

Regards.

RE: possible bug in software timers

Posted by Greg Renda on November 16, 2012

Okay, let me try it this way. Consider two tasks, a main task creating and
starting timers and the timer task, servicing the timer requests, both
running at the same priority level.

Main task looks like:

t1=xTimerCreate() period=200
xTimerStart(t1) commandTime=5 expiry=205

--- main task blocks, timer task runs ---

t2=xTimerCreate() period=200
xTimerStart(t2) commandTime=15 expiry=215

timer task looks like:

xTimeNow = prvSampleTimeNow() returns xTaskGetTickCount = 10

while (xQueueReceive(...))
{
First time through the loop
t1 is retrieved
commandTime = 5
xTimeNow = 10
expiry = 205
everything is fine xTimeNow > commandTime

--- timer task blocks, main task runs ---

Second time though the loop
t2 is retrieved
commandTime = 15
xTimeNow = 10 frozen since we're still in the loop
expiry = 215

whoops, xTimeNow < commandTime and expiry > commandTime
timer tick considered to have overflowed. But it hasn't.
Then t2 is re-queued from within this loop with a commandTime of
commandTime + timerPeriod, guaranteeing that we'll stay in the
loop since we've added to the queue and guaranteeing that
we'll see xTimeNow < commandTime again since xTimeNow will still be
frozen.
}

Thanks for looking into this with me.

RE: possible bug in software timers

Posted by Richard on November 16, 2012

Yes, thank you, now I understand clearly what you are reporting.

I will run through the very example you have in a simulator, where I can punch in the times manually to fit the case to see if there is an easy solution.

Assuming this is a genuine issue, which at the moment it looks like it is (but I need to exporte more with a clear head, I’ve been travelling for a few days) then you are right that raising the timer task priority is a work around – but a proper fix is needed.

Looking at the code I realise why:

xTimeNow = prvSampleTimeNow( &xTimerListsWereSwitched );

is called outside of the loop, not inside which might be more obvious. prvSampleTimeNow() will potentially switch the timer lists from the list currently being processed to the list that contains tasks whose timeout is after the tick count has overflowed. Calling xTimeNow = prvSampleTimeNow() on each loop, I think at least, runs the risk of the lists being switched when the list being switched away from is not empty. If that happens a timer that has expired, and should be processed before the lists are switched, will remain unprocessed until the tick count value has looped all the way back around – which could be weeks.

There may therefore be one possible solution whereby any timers that remain in the list have assumed to expired, and are processed – but that needs more thought.

Thank you for taking the time to report, and for your patients in explaining three times before I understood.

Regards.

RE: possible bug in software timers

Posted by Greg Renda on November 16, 2012

Glad I could finally make the problem clear for you. As to possible solutions, it seems to me that turning the while into an if is functionally okay but slightly inefficient if there are multiple items in the queue. I didn't look too closely at moving prvSampleTimeNow() into the loop but I think that's okay, too, since if the lists are switched, the timers on the current list are all processed by prvSwitchTimerLists() so there should be no orphaned timers. But it's certainly possible there are other side effects I haven't considered. I leave it in your capable hands.

RE: possible bug in software timers

Posted by Richard on November 18, 2012

Moving the call to prvSampleTimeNow() inside the while() look does indeed make a test case fail, as I recalled it did. Investigating this however shows that this is only because of an assumption the test case pre-condition set up makes.

prvSampleTimeNow() contains a static variable, and the test case calls prvProcessReceivedCommands() to set the static variable to the correct value for the test case pre-condition. Moving prvSampleTimeNow() inside the while loop prevents it being called when there are no items in the timer queue, and so prevents the static variable being set to the pre-condition value. Changing the test code to call prvSampleTimeNow() directly in the pre-condition set up, rather than indirectly through prvProcessReceivedCommands() allows the test case to pass. The code I have now is therefore as below:

static voidprvProcessReceivedCommands( void )
{
xTIMER_MESSAGE xMessage;
xTIMER *pxTimer;
portBASE_TYPE xTimerListsWereSwitched, xResult;
portTickType xTimeNow;

while( xQueueReceive( xTimerQueue, &xMessage, tmrNO_DELAY ) != pdFAIL )
{
pxTimer = xMessage.pxTimer;

if( listIS_CONTAINED_WITHIN( NULL, &( pxTimer->xTimerListItem ) ) == pdFALSE )
{
/* The timer is in a list, remove it. */
uxListRemove( &( pxTimer->xTimerListItem ) );
}

traceTIMER_COMMAND_RECEIVED( pxTimer, xMessage.xMessageID, xMessage.xMessageValue );

/* In this case the xTimerListsWereSwitched parameter is not used, but 
it must be present in the function call.  prvSampleTimeNow() must be 
called after the message is received from xTimerQueue so there is no 
possibility of a higher priority task adding a message to the message
queue with a time that is ahead of the timer daemon task (because it
pre-empted the timer daemon task after the xTimeNow value was set). */
xTimeNow = prvSampleTimeNow( &xTimerListsWereSwitched );

switch( xMessage.xMessageID )
{