Debugging memory leaks & buffer overflows in FreeRTOS

Chris Hockuba
6 min read · Dec 29, 2018

RTOS (Real Time Operating System)

Every embedded engineer eventually faces a project where using an RTOS is simply more convenient than working on bare metal. Since IoT applications have become increasingly popular, this article focuses on them. A typical IoT application has one task collecting data from sensors, the same or another task processing that data, and finally a task periodically sending it over Ethernet, Wi-Fi, GSM or another means of communication. A real-time operating system can be a great help here. FreeRTOS is one of the most popular solutions since it is well known, well documented and often distributed with middleware from silicon vendors.

With an RTOS, one needs to anticipate the worst-case stack requirement of every task. This can be a problem in a system where only a limited amount of memory is available. Sometimes an operation is memory-hungry but runs only once. A static buffer would work but wastes precious RAM, since 99% of the time the buffer is unused. Dynamic memory allocation is the obvious solution here, and in this particular scenario one is unlikely to forget to free the memory region. There are, however, many cases where such a mistake is easy to make, especially in a system that depends heavily on making network requests and parsing responses. In such systems, buffer overflows also become a common issue.

One still needs to keep in mind that dynamic memory allocation should generally be avoided on embedded systems. Some of the reasons are memory leaks, memory fragmentation, slow allocation and difficulty of debugging.

In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in such a way that memory which is no longer needed is not released. A memory leak may also happen when an object is stored in memory but cannot be accessed by the running code.

source: https://en.wikipedia.org/wiki/Memory_leak

Problem

A complex system makes it difficult to track all allocated areas and pinpoint problematic threads. To check whether there are memory leaks, one can look at two variables that FreeRTOS conveniently maintains: xFreeBytesRemaining and xMinimumEverFreeBytesRemaining (read through xPortGetFreeHeapSize() and xPortGetMinimumEverFreeHeapSize() respectively). The first tells how many bytes are currently available in the RTOS heap. The second gives the all-time low and is the better indicator of a leak: if it does not settle around a certain value and instead keeps falling towards 0 during periodic operations, memory is very likely leaking somewhere. Buffer overflows are a bit harder to identify. In some cases they are obvious and easy to debug, but in others a buffer overflow leads to a devastating result that only manifests after some time rather than straight away.
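The "does the minimum settle?" check can be automated. Below is a minimal sketch, not part of the original implementation: HeapTrend_t and heapTrendUpdate are hypothetical names, and the threshold is an arbitrary choice. The helper is plain C so it can be tried on a PC; on target, the reading would come from xPortGetMinimumEverFreeHeapSize().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: remembers the previous all-time-low reading and
 * counts how many times in a row a new reading was lower. */
typedef struct {
    size_t lastMinimum;   /* previous reading */
    unsigned dropsInARow; /* consecutive readings below the previous one */
    bool primed;          /* false until the first reading is stored */
} HeapTrend_t;

/* Feed one reading. Returns true once the minimum has dropped `threshold`
 * times in a row, which suggests the heap is leaking. */
static bool heapTrendUpdate(HeapTrend_t *trend, size_t minimumEverFree,
                            unsigned threshold)
{
    assert(trend != NULL);

    if (!trend->primed) {
        trend->primed = true;             /* first sample: nothing to compare */
    } else if (minimumEverFree < trend->lastMinimum) {
        trend->dropsInARow++;             /* the low-water mark moved down again */
    } else {
        trend->dropsInARow = 0;           /* the value settled: reset the count */
    }
    trend->lastMinimum = minimumEverFree;
    return trend->dropsInARow >= threshold;
}
```

A low-priority task could call heapTrendUpdate(&trend, xPortGetMinimumEverFreeHeapSize(), 5) after each periodic operation and log a warning when it returns true. A few drops during start-up are normal, so the threshold should be larger than the number of one-off allocations expected.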

Solution

There is an easy way of tracking allocated areas and pinpointing “orphaned” buffers in an application. To free a buffer, one has to know its location in memory. Therefore, for as long as a buffer is in use, its address has to be stored somewhere in memory as a pointer; an allocated buffer whose address appears nowhere can no longer be freed. Buffer overflows, on the other hand, are easier to detect with canary values placed at the beginning and at the end of each allocated memory region. This is the best solution in our case as it has the least influence on performance.

FreeRTOS offers several memory allocation schemes; this example is based on the heap_4 scheme.

This scheme uses a first fit algorithm and, unlike scheme 2, it does combine adjacent free memory blocks into a single large block (it does include a coalescence algorithm).

This implementation:

Can be used even when the application repeatedly deletes tasks, queues, semaphores, mutexes, etc..

Is much less likely than the heap_2 implementation to result in a heap space that is badly fragmented into multiple small blocks — even when the memory being allocated and freed is of random size.

Is not deterministic, but is much more efficient than most standard C library malloc implementations.

source: https://www.freertos.org/a00111.html

When a block of memory is allocated, the allocated region starts with a BlockLink_t structure, followed by the buffer handed to the caller. In heap_4 this structure holds a pointer to the next free block and the size of the block.

Original BlockLink_t structure within allocated memory region.

One may modify the structure to record which thread allocated a buffer. This will help later when scanning memory. Next, one may add a start and an end canary, which will be used to check buffer integrity. Lastly, it is necessary to keep track of allocated memory regions, which can be challenging when there are many of them. One has to create an array of pointers to the allocated BlockLink_t structures, for example BlockLink_t* allocList[256] = {0};. The array stores pointers to all allocated memory regions until they are freed. Two functions are needed: one that adds a new pointer to the array and one that removes an old one.
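As a rough illustration of the modified header, here is a minimal sketch. The first two fields are the ones found in the stock heap_4.c BlockLink_t; the field names pxOwner and ulStartCanary, the helper prvTagBlock and the canary value are assumptions made for this example, not the names used in the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define START_CANARY 0xDEADBEEFu /* hypothetical marker value */

typedef struct A_BLOCK_LINK {
    struct A_BLOCK_LINK *pxNextFreeBlock; /* stock field: next free block */
    size_t xBlockSize;                    /* stock field: size of this block */
    void *pxOwner;                        /* added: task that allocated the buffer */
    uint32_t ulStartCanary;               /* added: guards the front of the buffer */
} BlockLink_t;

/* Fill in the added fields once a block has been handed out.
 * `owner` stands in for a task handle in this PC-side sketch. */
static void prvTagBlock(BlockLink_t *block, void *owner)
{
    assert(block != NULL);
    block->pxOwner = owner;
    block->ulStartCanary = START_CANARY;
}
```

On target, the owner would most likely be the value returned by xTaskGetCurrentTaskHandle(). Since heap_4.c derives its header size from sizeof(BlockLink_t), growing the structure also grows the per-allocation overhead, which is worth keeping in mind on small heaps.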

The first function (vPortAddToList) simply stores a pointer in the first free slot of the alloc list array. The second one (vPortRmFromList) first cycles through the array to find the pointer that should be removed, removes it, and finally shifts the remaining entries so that there are no gaps between them.
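The two list helpers described above could look roughly like this. It is a sketch under a few assumptions: entries are stored as void * so the example is self-contained (the article stores BlockLink_t *), the list is shortened to 8 slots, and a full list silently leaves the allocation untracked.

```c
#include <assert.h>
#include <stddef.h>

#define ALLOC_LIST_LEN 8 /* small for the example; the article uses 256 */

static void *allocList[ALLOC_LIST_LEN] = {0};

/* Store ptr in the first free slot of the list. */
static void vPortAddToList(void *ptr)
{
    assert(ptr != NULL);
    for (size_t i = 0; i < ALLOC_LIST_LEN; i++) {
        if (allocList[i] == NULL) {
            allocList[i] = ptr;
            return;
        }
    }
    /* List full: the allocation goes untracked in this sketch. */
}

/* Remove ptr and shift the following entries down so no gap remains. */
static void vPortRmFromList(void *ptr)
{
    assert(ptr != NULL);
    for (size_t i = 0; i < ALLOC_LIST_LEN; i++) {
        if (allocList[i] == NULL) {
            return; /* the list is compact, so NULL marks its end: not tracked */
        }
        if (allocList[i] == ptr) {
            for (size_t j = i; j + 1 < ALLOC_LIST_LEN; j++) {
                allocList[j] = allocList[j + 1];
            }
            allocList[ALLOC_LIST_LEN - 1] = NULL;
            return;
        }
    }
}
```

Keeping the list compact makes the later scan cheap, because it can stop at the first NULL entry. In heap_4.c, pvPortMalloc and vPortFree already suspend the scheduler around their critical work, so calling these helpers from inside that section would avoid the need for extra locking.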

With a list of allocated buffers, one needs a simple routine that cycles through the list and checks whether each saved pointer is still referenced somewhere in memory. Since the BlockLink_t structure stores the owner of a buffer, one can scan the stack region of the owning thread for a reference to the allocated pointer. In some cases the reference is stored in a global structure or variable, which is why it is recommended to scan that region as well. While this method is not foolproof, it helped me find a couple of leaks that I would probably never have found otherwise.

The routine cycles through all allocated areas and checks whether each one is referenced somewhere in memory. In most cases a pointer is referenced inside a task, so the first scan covers only the stack of the owning thread. Only if no reference is found there does the routine go on to scan the memory regions listed in the tMemoryRegions structure. In this example two regions are defined, starting at 0x10000000 and 0x20000000; they are simply RAM1 and RAM2 of the MCU used. While scanning the RAM regions, the function ignores the stack of the currently executing task: that task holds the address being searched for on its own stack, so without this exclusion the function would always return true. Finally, each offending buffer is reported to stdout with printf: its memory address, the size of the allocated area and the task that allocated it. Ideally the function runs in a low-priority thread that executes from time to time.
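The core of such a scan can be sketched in a few lines. Everything here is illustrative: MemRegion_t, regionHolds and isReferenced are hypothetical names, regions are described by a start pointer and a word count, and the range to skip (the stack of the scanning task) is passed in as plain addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uintptr_t *start; /* first word of the region */
    size_t words;           /* region length in pointer-sized words */
} MemRegion_t;

/* True when some word in the region equals needle. Words whose address
 * lies in [skipStart, skipEnd) are ignored. */
static bool regionHolds(const MemRegion_t *region, uintptr_t needle,
                        uintptr_t skipStart, uintptr_t skipEnd)
{
    assert(region != NULL);
    for (size_t i = 0; i < region->words; i++) {
        uintptr_t addr = (uintptr_t)&region->start[i];
        if (addr >= skipStart && addr < skipEnd) {
            continue; /* inside the excluded range, e.g. the scanner's stack */
        }
        if (region->start[i] == needle) {
            return true;
        }
    }
    return false;
}

/* Check the owner's stack first, then fall back to the wider regions. */
static bool isReferenced(uintptr_t needle, const MemRegion_t *ownerStack,
                         const MemRegion_t *regions, size_t regionCount,
                         uintptr_t skipStart, uintptr_t skipEnd)
{
    if (regionHolds(ownerStack, needle, skipStart, skipEnd)) {
        return true;
    }
    for (size_t i = 0; i < regionCount; i++) {
        if (regionHolds(&regions[i], needle, skipStart, skipEnd)) {
            return true;
        }
    }
    return false;
}
```

The needle should be the address handed to the caller, not the address of the header: application code only ever holds the former, and searching for it also keeps the tracking array itself from counting as a reference. A word-aligned scan like this one misses pointers stored at unaligned offsets (for example in packed structures) and can be fooled by stale values left on a stack, which is one reason the method is not foolproof.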

Allocated memory region after changes.

Lastly, a function is needed that checks the integrity of allocated memory regions for potential buffer overflows. It is quite simple: for each memory region, one first checks the start canary stored in the BlockLink_t structure, then calculates the end of the allocated area and checks that the last 4 bytes match the end canary. This makes it easy to identify buffers that overflow their allocated region. It is also necessary to change the pvPortMalloc function so that it allocates 4 more bytes for the end canary and writes the canary into that space.
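The canary mechanism itself can be tried out on a PC before touching the heap code. The sketch below wraps the standard malloc instead of modifying heap_4.c; GuardHeader_t, guardedAlloc, guardedCheck, guardedFree and both canary values are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define START_CANARY 0xDEADBEEFu /* hypothetical marker values */
#define END_CANARY   0xCAFEF00Du

typedef struct {
    size_t userSize;      /* bytes requested by the caller */
    uint32_t startCanary; /* must still read START_CANARY when checked */
} GuardHeader_t;

/* Allocate header + buffer + 4 extra bytes for the end canary. */
static void *guardedAlloc(size_t size)
{
    uint8_t *raw = malloc(sizeof(GuardHeader_t) + size + sizeof(uint32_t));
    if (raw == NULL) {
        return NULL;
    }
    GuardHeader_t *hdr = (GuardHeader_t *)raw;
    hdr->userSize = size;
    hdr->startCanary = START_CANARY;

    const uint32_t end = END_CANARY;
    /* memcpy because the end canary may sit at an unaligned address. */
    memcpy(raw + sizeof(GuardHeader_t) + size, &end, sizeof end);
    return raw + sizeof(GuardHeader_t);
}

/* True when both canaries around the buffer are intact. */
static bool guardedCheck(const void *user)
{
    assert(user != NULL);
    const uint8_t *bytes = (const uint8_t *)user;
    const GuardHeader_t *hdr =
        (const GuardHeader_t *)(bytes - sizeof(GuardHeader_t));
    if (hdr->startCanary != START_CANARY) {
        return false; /* header damaged: do not trust userSize */
    }
    uint32_t end;
    memcpy(&end, bytes + hdr->userSize, sizeof end);
    return end == END_CANARY;
}

static void guardedFree(void *user)
{
    assert(user != NULL);
    free((uint8_t *)user - sizeof(GuardHeader_t));
}
```

Writing one byte past the end of a buffer returned by guardedAlloc lands in the end canary, so the next guardedCheck on that buffer returns false. In the FreeRTOS version the same steps would live inside pvPortMalloc, with the periodic check walking the list of tracked allocations.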

The snippet below is an example implementation containing all the necessary changes (this is not the whole heap_4.c file).

Alternatively, here is a diff along with an optional header file:

Chris Hockuba

Embedded Software Engineer, Entrepreneur, FPV Drone enthusiast and Hardware hobbyist.