pxCurrentTCB corrupted

Posted by Steven J. Ackerman on December 15, 2012
I'm working with FreeRTOS v7.1 and the RX62N port.

My application runs for several hours, or up to a day, and then executes the BRK exception handler. I have verified that there are no BRK instructions left in my application.

Upon investigating after an invalid BRK exception, I found that the pxCurrentTCB variable was somehow being set to 2 instead of the address of a valid TCB. When this bogus pointer value was used to switch contexts, the USP would point to an area of memory that was all zeros, which, when executed, is the BRK instruction. All of the task stacks and the interrupt stack appear to have plenty of room, and I have stack checking enabled.

Using my Renesas E1 emulator I set an event breakpoint for a write to the pxCurrentTCB address with a value of 2. The event captured this bogus write at the end of vTaskSwitchContext(), called from prvYieldHandler() within vSoftwareInterruptISR(), at the MVTIPL #1 instruction.

253 void vSoftwareInterruptISR( void )
254 {
255 prvYieldHandler();
FFFE89B0 7FA8 _vSoftwa SETPSW I
FFFE89B2 7EAF PUSH.L R15
FFFE89B4 FD6A2F MVFC USP,R15
FFFE89B7 60CF SUB #0CH,R15
FFFE89B9 FD68F2 MVTC R15,USP
FFFE89BC E00F MOV.L [R0],[R15]
FFFE89BE E50F0101 MOV.L 04H[R0],04H[R15]
FFFE89C2 E50F0202 MOV.L 08H[R0],08H[R15]
FFFE89C6 62C0 ADD #0CH,R0
FFFE89C8 7FA9 SETPSW U
FFFE89CA 6E1E PUSHM R1-R14
FFFE89CC FD6A3F MVFC FPSW,R15
FFFE89CF 7EAF PUSH.L R15
FFFE89D1 FD1F0F MVFACHI R15
FFFE89D4 7EAF PUSH.L R15
FFFE89D6 FD1F2F MVFACMI R15
FFFE89D9 6D0F SHLL #16,R15
FFFE89DB 7EAF PUSH.L R15
FFFE89DD FBF2740E0100 MOV.L #00010E74H,R15
FFFE89E3 ECFF MOV.L [R15],R15
FFFE89E5 E3F0 MOV.L R0,[R15]
FFFE89E7 757004 MVTIPL #4H
FFFE89EA 05BCC7FF BSR.A _vTaskSwitchContext
FFFE89EE 757001 MVTIPL #1H <-- this is where the pxCurrentTCB=2 event occurred.
FFFE89F1 FBF2740E0100 MOV.L #00010E74H,R15
FFFE89F7 ECFF MOV.L [R15],R15
FFFE89F9 ECF0 MOV.L [R15],R0
FFFE89FB 7EBF POP R15
FFFE89FD FD171F MVTACLO R15
FFFE8A00 7EBF POP R15
FFFE8A02 FD170F MVTACHI R15
FFFE8A05 7EBF POP R15
FFFE8A07 FD68F3 MVTC R15,FPSW
FFFE8A0A 6F1F POPM R1-R15
FFFE8A0C 7F95 RTE
FFFE8A0E 03 NOP
FFFE8A0F 03 NOP
FFFE8A10 02 RTS
256 }


It looks like perhaps listGET_OWNER_OF_NEXT_ENTRY(pxTCB, &(pxReadyTasksLists[uxTopReadyPriority])) is returning an incorrect value? The value of uxTopReadyPriority is 2 and the contents of pxReadyTasksLists[] are:

ADDRESS LABEL +0 ASCII
0001046C _pxReadyTasksLi 00000001 ....
00010470 000079A4 .y..
00010474 FFFFFFFF ....
00010478 000079A4 .y..
0001047C 000079A4 .y..
00010480 00000003 ....
00010484 0000B2FC ....
00010488 FFFFFFFF ....
0001048C 0000980C ....
00010490 0000A55C \...
00010494 00000001 ....
00010498 0001051C ....
0001049C FFFFFFFF ....
000104A0 0000ACF4 ....
000104A4 0000ACF4 ....
000104A8 00000000 ....
000104AC 000104B0 ....
000104B0 FFFFFFFF ....
000104B4 000104B0 ....
000104B8 000104B0 ....
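
One way to read the dump above is as four five-word list records. The sketch below assumes the v7.x xList layout (uxNumberOfItems, pxIndex, then an xListEnd holding xItemValue, pxNext and pxPrevious) and 32-bit fields throughout; the type and function names are made up for illustration and are not FreeRTOS API. It checks whether each list's pxIndex points at something that can be seen to belong to that list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Five 32-bit words per list, mirroring the ASSUMED v7.x xList layout. */
typedef struct {
    uint32_t uxNumberOfItems; /* number of items currently in the list   */
    uint32_t pxIndex;         /* address of the item the walker is on    */
    uint32_t xEndItemValue;   /* xListEnd.xItemValue, normally 0xFFFFFFFF */
    uint32_t pxEndNext;       /* xListEnd.pxNext: first item in the list */
    uint32_t pxEndPrevious;   /* xListEnd.pxPrevious: last item          */
} dumped_list_t;

enum { INDEX_UNKNOWN = -1, INDEX_BAD = 0, INDEX_OK = 1 };

#define DUMP_BASE 0x0001046Cu /* address of the first record in the dump */

/* The four records exactly as they appear in the dump. */
static const dumped_list_t dump[4] = {
    { 0x00000001u, 0x000079A4u, 0xFFFFFFFFu, 0x000079A4u, 0x000079A4u },
    { 0x00000003u, 0x0000B2FCu, 0xFFFFFFFFu, 0x0000980Cu, 0x0000A55Cu },
    { 0x00000001u, 0x0001051Cu, 0xFFFFFFFFu, 0x0000ACF4u, 0x0000ACF4u },
    { 0x00000000u, 0x000104B0u, 0xFFFFFFFFu, 0x000104B0u, 0x000104B0u },
};

/* Address of record i, given 20 bytes per record. */
uint32_t dumped_list_address(unsigned i)
{
    return DUMP_BASE + (uint32_t)(i * sizeof(dumped_list_t));
}

/*
 * pxIndex should point at the list's own end marker (8 bytes into the
 * record) or at one of its items. Only the first and last items are
 * visible in the dump, so lists with more than two items are reported
 * as INDEX_UNKNOWN unless pxIndex happens to match one of those.
 */
int check_list_index(const dumped_list_t *l, uint32_t list_addr)
{
    assert(l != NULL);
    uint32_t end_addr = list_addr + 8u;

    if (l->pxIndex == end_addr ||
        l->pxIndex == l->pxEndNext ||
        l->pxIndex == l->pxEndPrevious) {
        return INDEX_OK;
    }
    return (l->uxNumberOfItems <= 2u) ? INDEX_BAD : INDEX_UNKNOWN;
}
```

Under those assumptions, record [2] (the one selected by uxTopReadyPriority = 2) holds one item at 0x0000ACF4 and has its end marker at 0x0001049C, yet its pxIndex is 0x0001051C, which is neither. That would fit a list being modified while the kernel was in the middle of walking it, though the dump alone cannot confirm this.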

Any ideas on how to track this down? I've been trying to debug this for several weeks.

Steven J. Ackerman, Consultant
ACS, Sarasota, Florida
http://www.acscontrol.com

RE: pxCurrentTCB corrupted

Posted by Richard on December 15, 2012
Here are some general comments:

+ You mention that the stacks seem to have enough space. Do you also have the stack overflow protection switched on? Be aware that the stack overflow checking only checks the task stack, not the interrupt stack.

+ Are you 100% sure that all interrupts that make use of FreeRTOS API functions have a priority at or below configMAX_SYSCALL_INTERRUPT_PRIORITY? Lots of people get this point wrong on Cortex-M3 chips, not surprisingly as the settings are complex on those; it is much simpler on the RX, though. This type of corruption is a symptom of an incorrect value.

+ Have you checked against all the notes on the following page: http://www.freertos.org/FAQHelp.html ?

+ In your code snippet you note that pxCurrentTCB gets corrupted after the call to vTaskSwitchContext() [is my interpretation of your code correct?]. That is unlikely because nothing is being written to memory at that point, unless the corruption is occurring from a nested interrupt. What is much more likely is that the corruption is happening inside the vTaskSwitchContext() function, because that is where pxCurrentTCB is written. If you could catch it in there that may provide more useful information. What would be even more useful would be to catch the corruption in the list you reference when it happens, rather than when the corrupt data is used.

+ Are you able to selectively remove pieces of functionality from your code in an attempt to isolate the cause? Sometimes that can help, but sometimes changing the execution pattern just moves the problem somewhere else.
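
As a rough illustration of catching a bad pointer when it is selected rather than when it is used, a plausibility check along these lines could be run on pxCurrentTCB inside the context switch, for example from the traceTASK_SWITCHED_IN() macro if the build defines it. The function name and the RAM bounds are assumptions for the sketch, not part of FreeRTOS; the real bounds would come from the linker map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if addr could be the address of a
 * TCB, i.e. it is 4-byte aligned and lies inside [ram_start, ram_end).
 * It cannot prove the pointer is valid, only reject obvious garbage.
 */
bool tcb_pointer_plausible(uintptr_t addr, uintptr_t ram_start, uintptr_t ram_end)
{
    assert(ram_start < ram_end);

    if ((addr & 0x3u) != 0u) {
        return false; /* misaligned, e.g. the value 2 seen in this thread */
    }
    return (addr >= ram_start) && (addr < ram_end);
}
```

With placeholder bounds of 0x00000000 to 0x00018000, the value 2 is rejected on alignment alone, while an address such as 0x000079A4 from the list dump passes. Halting in the debugger on a failed check would stop execution at the list access rather than at the later BRK.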

Regards.

RE: pxCurrentTCB corrupted

Posted by Steven J. Ackerman on December 15, 2012
Richard-

First, thank you for your reply.

I have stack overflow protection switched on. I can see the A5(s) in each task's stack - there is plenty of room in each.

I initialize the interrupt stack in my reset handler - before any C section initialization and main() is called. I've looked at the interrupt stack and there is plenty of room there as well.
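
Since the kernel's overflow checking covers only the task stacks, the interrupt stack has to be measured by hand. A minimal sketch of that measurement, assuming the reset handler pre-fills the interrupt stack with the same 0xA5 byte seen in the task stacks and that the stack grows downward (so untouched bytes sit at the low end); the function name is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_BYTE 0xA5u /* assumed fill pattern, as seen in the task stacks */

/*
 * Counts how many bytes at the low end of a stack still hold the fill
 * pattern. stack_low is the lowest address of the stack region.
 * A result near zero means the stack has been used almost to its limit.
 */
size_t stack_unused_bytes(const uint8_t *stack_low, size_t stack_size)
{
    assert(stack_low != NULL);

    size_t unused = 0;
    while (unused < stack_size && stack_low[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

Calling this periodically on the interrupt stack region and logging the minimum gives a high-water mark comparable to what can be read off the task stacks by eye.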

I will verify that I'm not calling any FreeRTOS API functions from any interrupt handlers above configMAX_SYSCALL_INTERRUPT_PRIORITY. This is a possibility as I do have some interrupts that run at higher priority and there may be an execution path.

I have read, and just re-read, the FAQHelp page.

The pxCurrentTCB appears to get written to 2 at the line I indicated - at least that is where the emulator stopped execution for the data access write=2 @ pxCurrentTCB event. I'm not sure if the access is pipelined or what delay the emulator introduces - I agree that the MVTIPL #1 instruction should not be the instruction that is corrupting the pxCurrentTCB - it is probably in the list handling at the end of vTaskSwitchContext(). I will try to establish a combined event of execution address and data write access to see if I can better establish the place where the write of 2 is occurring.

Desperate to solve this as it is keeping me from selling this product.

Regards,

Steven J. Ackerman, Consultant
ACS, Sarasota, Florida
http://www.acscontrol.com

RE: pxCurrentTCB corrupted

Posted by Richard Damon on December 15, 2012
The stopping point may well be delayed due to pipelining in the processor, so if one of the last things done in vTaskSwitchContext is writing the new value, then returning, the return likely starts before the write starts, and thus is allowed to complete (landing you where you did) before the breakpoint hits.

One thing that I can think of that can cause this sort of thing is if you have a tick hook or idle hook routine that blocks. This can cause the system to not have ANY task to run and then you get into a crash like this.

RE: pxCurrentTCB corrupted

Posted by Steven J. Ackerman on December 15, 2012
Richard-

Just so I'm clear on this: would this approach to handling an interrupt from a source that is higher priority than configMAX_SYSCALL_INTERRUPT_PRIORITY be OK? The RTC interrupt has a priority of 5 and occurs once/second. I call xTaskResumeFromISR() from the interrupt handler to wake up the Timer1S task, which reads the current time and queues a message into an event handler queue:

/////////////////////////////////////////////////////////////////////////////
// local variables
static uint32_t time_counter32bit;
static EVENT_MSG msg;
static tm time;
static void (*RTC_Callback)(void);

/////////////////////////////////////////////////////////////////////////////
// rtc interrupt handler
#pragma interrupt Interrupt_RTC(vect=VECT_RTC_PRD)
void Interrupt_RTC(void)
{
    /* Call the user function iff defined */
    if (RTC_Callback != NULL)
    {
        RTC_Callback();
    }
}

/////////////////////////////////////////////////////////////////////////////
// initialize the RTC
void RTC_Init(void (* callback)(void))
{
    rtcStop();

    /* periodic interrupt 1Hz */
    RTC.RCR1.BIT.PES = 0x6u;

    /* reset RTC */
    RTC.RCR2.BIT.RESET = 1;

    /* setup the RTC periodic interrupt callback */
    RTC_Callback = callback;

    /* Disable interrupt requests */
    ICU.IER[IER_RTC_ALM].BIT.IEN_RTC_ALM = 0;
    ICU.IER[IER_RTC_PRD].BIT.IEN_RTC_PRD = 0;
    ICU.IER[IER_RTC_CUP].BIT.IEN_RTC_CUP = 0;

    /* Enable RTC periodic interrupt requests */
    RTC.RCR1.BIT.PIE = 1;

    /* Enable RTC carry interrupt requests (so we can read it in RTC_GetTime()) */

    /* initialize periodic interrupt priority */
    ICU.IPR[IPR_RTC_PRD].BIT.IPR = RTC_INTERRUPT_PRIORITY;

    /* Enable periodic interrupt requests */
    ICU.IER[IER_RTC_PRD].BIT.IEN_RTC_PRD = 1;

    rtcStart();
}

/* called from RTC interrupt */
void rtc_interrupt_callback(void)
{
    (void)xTaskResumeFromISR(Timer1SHandle);
}

/* task created by TimerRTC_init */
static void Timer1S(void *pvParameters)
{
    EVENT_MSG msg;

    msg.source = MS_DISPLAY;
    msg.event = ME_TIME_UPDATE;

    RTC_Init(rtc_interrupt_callback);

    for (;;)
    {
        // awoken by xTaskResumeFromISR() in rtc_interrupt_callback() above
        vTaskSuspend(NULL);

        // a 32 bit counter that is updated every tick (1 second)
        time_counter32bit++;

        RTC_GetTime(&time);

        timeUpdated = true;

        /* notify event manager */
        msg.param.value = time_counter32bit;
        (void)xQueueSend(EventQueue, (void *)&msg, 0UL); /* RTOS_USAGE */
    }
}

static void Timer1S(void *pvParameters); // The task of Event Manager
xTaskHandle Timer1SHandle; /* RTOS_USAGE */

void TimerRTC_init(void)
{
    /* create the timer read task */
    (void)xTaskCreate(Timer1S, "t:Time", configMINIMAL_STACK_SIZE, NULL, mainDEMOTASKS_PRIORITY + 1, &Timer1SHandle); /* RTOS_USAGE */

    time.tm_sec = 00;
    time.tm_min = 45;
    time.tm_hour = 16;
    time.tm_am_pm = T_PM;
    time.tm_wday = TUE;
    time.tm_mday = 26;
    time.tm_mon = JUN;
    time.tm_year = 2012;

    RTC_SetTime(&time);

    timeUpdated = false;
}


Steven J. Ackerman, Consultant
ACS, Sarasota, Florida
http://www.acscontrol.com

RE: pxCurrentTCB corrupted

Posted by Richard on December 15, 2012
Firstly, I hate the function xTaskResumeFromISR(). I added it because people requested it, but it can be dangerous if interrupts come in faster than the task that gets resumed can execute from the resume point back to the suspend point. Using a semaphore give/take latches events so they are not missed, whereas task suspend/resume does not latch events. I don't think this is the issue in this case, though: if the interrupt is only coming in once a second it should not be a problem.
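
The difference in latching can be shown with a small host-side model. This is plain C that imitates the behaviour, not FreeRTOS code, and every name in it is made up for the sketch; in a real application the semaphore side would correspond to xSemaphoreGiveFromISR() in the interrupt and xSemaphoreTake() in the task. Each function fires a burst of interrupts before the task gets to run, then lets the task run as often as it likes and counts how many events it handled.

```c
#include <assert.h>
#include <stdbool.h>

/* --- Model of a task that waits with suspend and is woken by resume --- */
typedef struct {
    bool suspended;   /* true while the task is parked in its suspend call */
    unsigned handled; /* events the task has processed                     */
} suspend_task_t;

/* Resuming a task that is already ready changes nothing: no latching. */
static void model_resume_from_isr(suspend_task_t *t)
{
    t->suspended = false;
}

/* One pass of the task loop: handle an event, then suspend again. */
static void model_suspend_task_pass(suspend_task_t *t)
{
    if (!t->suspended) {
        t->handled++;
        t->suspended = true;
    }
}

unsigned events_handled_with_suspend(unsigned burst)
{
    suspend_task_t t = { true, 0u };
    for (unsigned i = 0; i < burst; i++) { model_resume_from_isr(&t); }
    for (unsigned i = 0; i < burst; i++) { model_suspend_task_pass(&t); }
    return t.handled;
}

/* --- Model of a task that waits on a counting semaphore --- */
unsigned events_handled_with_semaphore(unsigned burst, unsigned max_count)
{
    assert(max_count > 0u);
    unsigned count = 0u;
    unsigned handled = 0u;

    /* Each "give" is remembered until the count limit is reached. */
    for (unsigned i = 0; i < burst; i++) {
        if (count < max_count) { count++; }
    }
    /* Each successful "take" consumes one remembered event. */
    for (unsigned i = 0; i < burst; i++) {
        if (count > 0u) { count--; handled++; }
    }
    return handled;
}
```

In this model a burst of three interrupts results in one handled event with suspend/resume and three with a counting semaphore whose limit is at least three. A binary semaphore (limit of one) remembers only a single pending event, so it closes the race around the suspend point but does not count bursts.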

“The RTC interrupt has a priority of 5 and occurs once/second.”


So the interrupt is calling an API function. You didn't mention what configMAX_SYSCALL_INTERRUPT_PRIORITY is set to, though. If it is 5 or above you should be OK on that point.

From the code it looks like you are using the Renesas compiler. Which version?

Regards.

RE: pxCurrentTCB corrupted

Posted by Steven J. Ackerman on December 15, 2012
Richard-

#define configMAX_SYSCALL_INTERRUPT_PRIORITY 4 // 1 (lowest) - 15 (highest)

so I guess that this is the problem... fantastic! I guess that I got confused by the _FromISR() being allowed from an interrupt and forgot to check the interrupt priority. The low rate of occurrence is probably due to the fact that this interrupt and API call only occur once/second.

I'm going to review all of these priorities, make changes and run another test.
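
A mismatch like this one (an interrupt at priority 5 calling a _FromISR() function with configMAX_SYSCALL_INTERRUPT_PRIORITY at 4) is the kind of thing a build-time guard could catch. The sketch below assumes the RX convention from the config comment, where a numerically higher value is a higher priority in the range up to 15. The helper name is hypothetical, and the value given to RTC_INTERRUPT_PRIORITY here is only an example of a setting that passes; the thread does not say which value was finally used.

```c
#include <assert.h>
#include <stdbool.h>

#define configMAX_SYSCALL_INTERRUPT_PRIORITY 4 /* value quoted in this thread */
#define RTC_INTERRUPT_PRIORITY               4 /* example passing value; 5 would trip the #error */

/* Build-time guard: refuse to compile if the ISR priority is too high. */
#if RTC_INTERRUPT_PRIORITY > configMAX_SYSCALL_INTERRUPT_PRIORITY
#error "RTC ISR calls FreeRTOS API but runs above configMAX_SYSCALL_INTERRUPT_PRIORITY"
#endif

/*
 * Run-time form of the same rule, for priorities that are only known
 * once the interrupt controller has been programmed.
 */
bool isr_may_call_freertos_api(unsigned ipl, unsigned max_syscall_ipl)
{
    assert(ipl <= 15u && max_syscall_ipl <= 15u);
    return ipl <= max_syscall_ipl;
}
```

Placing one such guard next to each priority definition whose handler uses a _FromISR() call keeps the check close to the setting it protects.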

I'm running:
C/C++ compiler package for RX family V.1.02 Release 01 (Update Utility)(3-27-2012 09:14:28)

Regards,

RE: pxCurrentTCB corrupted

Posted by Steven J. Ackerman on December 17, 2012
Richard-

This has indeed fixed my problem. My apologies for not catching this whilst perusing the manuals and re-reading the FAQ. My kudos for a great project and your continued excellent support.

Also, thanks for the heads-up on the semaphore vs task suspend/resume.

I hope that you enjoy a great holiday season !

Steven J. Ackerman, Consultant
ACS, Sarasota, Florida
http://www.acscontrol.com





Copyright (C) Amazon Web Services, Inc. or its affiliates. All rights reserved.
