15 How it works

The mpatrol library was originally written with the intention of plugging it into an existing compiler so that the compiler could plant calls to it in the code it generated when a specific debugging option was used. These extra calls would obviously slow the code down, but along with the stack checking options that would be provided, this would give the user an enhanced run-time debugging environment. Unfortunately, this integration never happened, but the way that mpatrol works is still significantly different from other malloc tracing libraries.

In order to quickly determine exactly which memory allocation a heap address belonged to, it was necessary to be able to search the heap efficiently. The traditional approach of searching along a linked list was infeasible, so an implementation based on red-black trees was used, in which every known memory allocation in the heap is given an entry in the tree, keyed by its start address. Another major design decision was to use red-black trees to implement the best fit allocation algorithm as well. Although first fit was considered, I decided that best fit would allow the library to have more control over the heap: every free memory block in the heap is given an entry in the free tree, keyed by its size. There was a bit of work involved in getting the splitting and merging of free blocks to work efficiently, but it seems to work well now.
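The two kinds of search described above can be illustrated with a small sketch. It is not taken from the mpatrol sources: to keep it short it swaps the red-black trees for sorted arrays and binary search, which give the same logarithmic lookup but none of the cheap insertion and deletion that a balanced tree provides. The names struct alloc_rec, find_alloc() and best_fit() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one known allocation; not mpatrol's real structure. */
struct alloc_rec {
    uintptr_t start;  /* start address of the block, used as the search key */
    size_t size;      /* size of the block in bytes */
};

/* Find the allocation that contains addr in an array sorted by start address.
 * Returns NULL if addr does not lie inside any known allocation. */
const struct alloc_rec *find_alloc(const struct alloc_rec *recs, size_t count,
                                   uintptr_t addr)
{
    size_t lo = 0, hi = count;
    const struct alloc_rec *r;

    assert(recs != NULL || count == 0);
    /* Locate the first record whose start address is greater than addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (recs[mid].start <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;
    /* The only candidate is the last record starting at or before addr. */
    r = &recs[lo - 1];
    return (addr - r->start < r->size) ? r : NULL;
}

/* Best fit over free block sizes sorted in ascending order: returns the index
 * of the smallest block that can hold request bytes, or count if none can. */
size_t best_fit(const size_t *sizes, size_t count, size_t request)
{
    size_t lo = 0, hi = count;

    assert(sizes != NULL || count == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sizes[mid] < request)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```

With a tree in place of the arrays, the first search becomes a walk towards the greatest key not exceeding the address, and the second a walk towards the smallest key not less than the requested size.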

My original implementation had all of the information about each memory block stored just before the block itself. I eventually dropped that behaviour in favour of storing all of the library's internal information in a separate part of the heap. I did that for two reasons. The first was the problems caused by memory allocations with different alignment requirements. The second was that the library's internal structures could then be write-protected on systems with virtual memory, to prevent user code from interfering with the operation of the library.
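As a rough illustration of the second reason, the sketch below keeps bookkeeping data in its own page-aligned region and toggles write access around updates. It assumes a POSIX system with mmap() and mprotect(); the helper names meta_map() and meta_set_writable() are hypothetical and are not part of the mpatrol interface, and the library's own mechanism may well differ.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(MAP_ANONYMOUS) && defined(MAP_ANON)
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Map a fresh, page-aligned region for internal data, separate from any
 * user allocation. *len is rounded up to a whole number of pages.
 * Returns NULL on failure. (Hypothetical helper.) */
void *meta_map(size_t *len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    assert(len != NULL && *len > 0);
    if (page <= 0)
        return NULL;
    *len = (*len + (size_t) page - 1) / (size_t) page * (size_t) page;
    p = mmap(NULL, *len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Make the region read-only (writable == 0) or read-write (writable != 0).
 * Returns 0 on success and -1 on failure, as mprotect() does. */
int meta_set_writable(void *p, size_t len, int writable)
{
    return mprotect(p, len, writable ? PROT_READ | PROT_WRITE : PROT_READ);
}
```

A library built this way would enable writing on entry to each of its functions and disable it again before returning, so that a stray write from user code faults instead of silently corrupting the internal structures.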

Because the library attempts to record as much information as possible about every memory allocation, there will inevitably be a much larger memory requirement when running a program linked with the library. Typically it will be two or three times larger, although this depends on the number of memory allocations made and the number of symbols read. The latter will also affect how quickly the program starts, since the first call to allocate memory will result in the initialisation of the library and the loading of symbols from the executable file and any shared libraries.

Due to its design, it is also possible to allocate memory from the heap using the mpatrol library functions whilst already within an mpatrol library function. This does not normally occur, but on some platforms calling printf() from within the library may result in printf() calling malloc() to allocate a buffer for itself, which ends up as a recursive call. This is dealt with simply by not displaying the allocation in the log file, although all other details of the allocation are still recorded. It can sometimes result in hidden memory usage which occurs behind the scenes and alters the peak memory usage in the summary. This is particularly evident when the library uses an object file access library to read program symbols at the time of library initialisation.
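One way to picture this behaviour is with a nesting counter, as in the sketch below. This is an assumption about how such a guard could be written, not a description of mpatrol's code: traced_alloc(), log_alloc() and the three counters are hypothetical, and log_alloc() merely imitates an output routine that allocates a buffer the first time it is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

int depth;        /* how many library calls are currently active */
size_t recorded;  /* allocations whose details were recorded */
size_t logged;    /* allocations that were shown in the log */

void *traced_alloc(size_t size);

/* Stand-in for a logging routine whose output function allocates a buffer
 * on first use, as printf() may do on some platforms. */
static void log_alloc(const void *ptr, size_t size)
{
    static void *buffer;

    (void) ptr;
    (void) size;
    if (buffer == NULL)
        buffer = traced_alloc(64);  /* recursive entry into the library */
    logged++;
}

/* Allocate memory, always recording the allocation but only logging it
 * when this is the outermost call into the library. */
void *traced_alloc(size_t size)
{
    void *ptr;

    depth++;
    ptr = malloc(size);
    if (ptr != NULL) {
        recorded++;             /* details are kept for every allocation */
        if (depth == 1)
            log_alloc(ptr, size);  /* nested allocations stay out of the log */
    }
    depth--;
    return ptr;
}
```

After the first call to traced_alloc() in this sketch, two allocations have been recorded but only one has been logged, which mirrors the hidden memory usage mentioned above.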

Memory allocation profiling support was added in mpatrol release 1.2.0. Every allocation and deallocation is recorded, with the call stack information being used to differentiate all of the call sites within the program. Unlike other profilers that come with UNIX systems, even the symbolic information about the program being run is written to the profiling output file, since it makes no sense for mprof to re-read the symbol table from the executable file when it has already been read and processed by the mpatrol library. This has the added bonus of allowing the user to save profiling output files for later use even when the executable files which produced them have changed or no longer exist. It also means that symbol names can be obtained for functions in shared libraries.

Memory allocation tracing support was added in mpatrol release 1.3.2 to produce concise information for every memory allocation event. This information could also be produced in a verbose form in the log file, but logging every memory allocation event in a large program would result in a massive log file that would be hard to parse. In order to keep the size of the tracing output file down, almost all of the data in the file is encoded as LEB128 numbers. The idea for this comes from the DWARF 2 debugging format.
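For readers unfamiliar with LEB128, the following sketch shows the unsigned variant as defined by DWARF: each byte carries seven bits of the value, least significant group first, and the top bit of a byte is set when more bytes follow. The function names are hypothetical and the code is not taken from the mpatrol sources; it does not cover the signed variant or the layout of the tracing output file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value as unsigned LEB128 into out and return the number of bytes
 * written. out must have room for 10 bytes, enough for any 64-bit value. */
size_t uleb128_encode(uint64_t value, unsigned char *out)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        unsigned char byte = (unsigned char) (value & 0x7f);
        value >>= 7;
        if (value != 0)
            byte |= 0x80;  /* continuation bit: more bytes follow */
        out[n++] = byte;
    } while (value != 0);
    return n;
}

/* Decode an unsigned LEB128 number from in, store it in *value and return
 * the number of bytes consumed. Bits beyond the 64th are discarded. */
size_t uleb128_decode(const unsigned char *in, uint64_t *value)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    unsigned char byte;

    assert(in != NULL && value != NULL);
    do {
        byte = in[n++];
        if (shift < 64)
            result |= (uint64_t) (byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *value = result;
    return n;
}
```

Small numbers, which dominate a trace of allocation sizes and event counts, fit in a single byte, while the value 624485 used as an example in the DWARF specification takes the three bytes 0xE5 0x8E 0x26.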

Support for the alloca() family of functions was added for mpatrol release 1.3.0 and uses the heap instead of the stack in order to trace and debug these functions. If full call stack tracebacks are supported on a particular system then mpatrol will compare the current call stack with the call stack of the function that called alloca() in order to determine if a memory allocation made by alloca() is out of scope. This is generally a safe way to determine when such allocations should be freed, but if full call stack tracebacks are not supported then mpatrol will compare the addresses of specific local variables in the call stack in order to determine if the allocation should be freed. This is an inferior method since it depends on the same function call sequence being used each time an mpatrol function is called. Therefore, a safety boundary was added that will prevent mpatrol from freeing such allocations unless they are a really clear-cut case (i.e. the stack frames differ by a minimum number of bytes). As a result, this second method will not usually free such allocations until a much later point.

The library is written in a modular fashion so as to make it easy to add new functionality. New modules have already been added, such as the stack, symbol, profile and trace modules. Extra information about each memory allocation can be added to the allocation information module in src/info.h and src/info.c without having to change much code in any other files.

The tools directory in the mpatrol distribution comes with a collection of functions that are built on top of the mpatrol library using its interface functions. This provides a way to extend the mpatrol library for specific applications without requiring that all applications use the extensions. It also provides a way to add new interfaces to the library, perhaps for compatibility with other malloc debugging libraries.

Platform-dependent code has been isolated to specific modules, and feature macros are entirely defined and controlled from config.h and target.h. The source code has been written so as to make it as easy as possible to compile the library on new platforms at the first attempt, although any additional features that the platform supports will then have to be explicitly enabled in the code.

Of the UNIX platforms that the mpatrol library runs on, Solaris and Linux proved to be the easiest to port to, with well documented and easily accessible programming interfaces to operating system features. Unfortunately, the non-UNIX ports proved a lot harder to write and do not contain as many of the useful features that the UNIX ports have. In some cases this is not because those platforms could never support the features, but because a huge amount of work would be involved.