On systems with virtual memory, such as UNIX and Windows, user programs are run
as processes which have their own address space and resources. If a
process needs to create sub-processes to perform other tasks it must call
spawn() to create new processes, but these new processes
do not share the same address space or resources as the parent process. If
processes need to share memory they must either use a message passing interface
or explicitly mark a range of memory as shareable.
Traditionally, this was not too much of a handicap as parallel processing was an expensive luxury and could only be made use of by the kernel of such systems. However, with the birth of fast processors and parallel programming, programs could be made to run more efficiently and faster on multi-processor systems by having more than one thread of control. This was achieved by allowing processes to have more than one program counter through which the processor could execute instructions, and if one thread of control stalled for a particular reason then another could continue without stalling the entire process.
Such multithreaded programs allow parallel programming and implicit shared memory between threads since all threads in a process share the same address space and resources. This is similar to operating systems that have no virtual memory, such as AmigaOS and Netware1, except that once a process terminates, all threads terminate as well and all of its resources are still reclaimed.
Multithreaded programming generally needs no compiler support, but does require some primitive operations to be supported by the operating system for a threads library to call. The functions that are available in the threads library provide the means for a process to create and destroy threads. There are currently several popular threads libraries available, although the POSIX threads standard remains the definitive implementation.
It is always important to remember when programming a multithreaded application that because all threads in a process share the same address space, measures must be taken to prevent threads reading and writing global data in a haphazard fashion. This can either be done by locking with semaphores and mutexes, or can be performed by using stack variables instead of global variables since every thread has its own local stack. Care must be taken to write re-entrant functions — i.e. a function will give exactly the same result with one thread as it will with multiple threads running it at the same time.
The mpatrol library can be built as a thread-safe library with support for
multi-threaded programs. When this library is linked with your program, only
one thread at a time can allocate, reallocate or deallocate dynamic memory, or
perform a memory operation via
memset(), etc. This does
not take full advantage of the potential concurrency in the library, but at
least it will allow the debugging of multi-threaded programs.
The process of making the mpatrol library thread-safe was made more complicated
by the fact that the mutexes protecting the library's data structures had to be
recursive, since some of the functions that the library will call may call
free() or any other functions redefined by the
library. If this was to happen with non-recursive mutexes then the recursive
call would result in the thread attempting to lock a mutex that it already
owned. However, implementing recursive mutexes was only half the problem.
The other problem with writing a thread-safe malloc library is that it must be
initialised before the program becomes multi-threaded. If the library is
initialised when there are multiple threads running then one thread may be
attempting to initialise the mutexes whilst another thread may be attempting
to lock an uninitialised mutex. Ideally, the best place to initialise the
library would be at the start of
main() but there is currently no way
to do this other than getting users to explicitly plant calls to initialise the
library in their code. This is not a very satisfactory solution if all we want
to do is link in the replacement malloc library without any need for
Fortunately, there are some ways to plant initialisation calls before
main() is called, but they all have some drawbacks. The first way is to
use a static file-scope constructor in C++, which will then initialise the
mutexes and the library data structures before the code in
executed. However, on many systems this will require the final link to be
performed by the C++ compiler that built the library. That may not be desirable
or even possible in many cases. Unfortunately, this drawback appears in the
second method, which involves using the GNU C compiler to compile the library.
This compiler has an extension which allows functions to be specified as
constructors which will be called before
main(), but means that any
program which is linked with the resulting library must be linked with the GNU C
compiler driver. However, many systems are now GNU-based which would mean that
this would happen anyway.
The final way of initialising the mutexes and the library data structures is to
plant a call to the initialisation routines from a special section which the
system will call before
main() is called. This section is called the
`.init' section on ELF-based platforms, but may exist in another form on
other platforms too. This has the advantage that the system linker can be used
to link the final program, but a possible disadvantage is that the library may
be initialised too early, possibly before the environment or file streams have
been set up. You may find that if one of the above methods does not work for
you then perhaps another one will.
 Where the kernel is effectively a single process running all user programs as threads.