

Appendix H Porting

This section describes how to port the mpatrol library to new systems. It is not a complete set of guidelines as nothing can cover every eventuality, but it should list most of the important issues and where to make the necessary changes. Once you've made the changes (and are happy with them) then send them to me and I can incorporate them into the next mpatrol release. I'd also like to hear from anybody who has got mpatrol working on a different version of an operating system listed in the supported systems section (see Supported systems) even if no changes were required, since that information can be useful for new users wondering if mpatrol can be used on their system.

  1. Make any required changes in src/target.h in order to identify the new system.

    The TARGET macro is used to identify distinct families of operating systems whereas the SYSTEM macro is used to identify the operating system variant if TARGET=TARGET_UNIX. You should try to identify the predefined preprocessor macros that the system C compiler defines for the operating system type and the operating system variant, otherwise you will have to specify the TARGET and SYSTEM macros explicitly in the Makefile when building the mpatrol library. Note that for non-UNIX operating systems, SYSTEM=SYSTEM_ANY is implied.

    The ARCH macro is used to identify the processor architecture and the ENVIRON macro is used to identify the processor word size. Again, you should try to identify the predefined preprocessor macros that the system C compiler defines for the processor architecture and processor word size, otherwise you may also have to specify the ARCH and ENVIRON macros explicitly in the Makefile when building the mpatrol library. The default setting for the processor word size is ENVIRON=ENVIRON_32.

    You can normally find the preprocessor macros that are predefined by the system C compiler by using the -#, -v or -verbose options when compiling a source file. The compiler should then show the command line used to invoke the preprocessor, including all of the macros that are defined in addition to those specified on the compiler command line. With the GNU compiler, the -dM and -E options used together print the predefined macros directly. It should then be easy for you to spot the ones you need.

    The FORMAT macro is used to identify the object file format and the DYNLINK macro is used to identify the dynamic linker type. You may be able to use the existing values for these without having to define new ones, but in any case you should attempt to set defaults for these macros depending on the values of the four preceding macros. A setting of FORMAT=FORMAT_NONE indicates that reading symbols from any object files is not supported and a setting of DYNLINK=DYNLINK_NONE indicates that reading symbols from shared libraries is not supported.

    If the object file format of the new system is not currently supported, perhaps it is supported by the GNU BFD library. This can be used as a catch-all solution to provide symbol reading support for the mpatrol library with object file formats that are obscure or are just hard to implement readers for. You'd be surprised at how many object file formats are supported by that library and if the new format is supported then try defining FORMAT=FORMAT_BFD for the new system.

    In all six of the above target macros, care should be taken not to define a new macro that is effectively the same as an existing one, unless there are significant differences. For example, the dynamic linker used on BSD systems is slightly different from the dynamic linker used on SunOS, but they both use DYNLINK=DYNLINK_BSD because the underlying dynamic linker uses the same data structures — they are just named differently on the two systems.

    Note that there are also corresponding *_STR macros for all six of the above target macros. These are used when displaying the target environment information in the mpatrol log file so they should be as accurate as possible so as to avoid misleading users.

    Finally, you should determine if it is necessary to define any special macros in order to obtain all of the required definitions from the system header files. Many compilers default to providing an ANSI C or C++ environment without any extensions, but as the mpatrol library uses additional features that are not provided by these standards, it may be necessary to define additional macros that allow the compiler to see the definitions of these features. For example, the _POSIX_SOURCE macro is defined here for all UNIX platforms so that mpatrol can make use of the POSIX extensions. Note that src/target.h is the only mpatrol library source file that refers to the predefined preprocessor macros defined by the system C compiler on a particular system (apart from a few necessary exceptions) and the rest of the source code refers to the six aforementioned macros for conditional compilation.
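
    As a rough illustration of the detection logic described in this step, the following sketch picks defaults from macros that common compilers predefine, but only when the Makefile has not already set them. All of the SKETCH_* names and their numeric values are invented for this example and are not the identifiers used in src/target.h; pointer width is used here as a stand-in for the processor word size.

```c
/* Hypothetical identifiers; the real ones live in src/target.h. */
#define SKETCH_TARGET_ANY     0
#define SKETCH_TARGET_UNIX    1
#define SKETCH_TARGET_WINDOWS 2

#define SKETCH_ENVIRON_32 0
#define SKETCH_ENVIRON_64 1

/* Only choose a default if the Makefile has not already chosen one. */
#ifndef SKETCH_TARGET
#if defined(__unix__) || defined(__unix) || defined(__linux__) || \
    (defined(__APPLE__) && defined(__MACH__))
#define SKETCH_TARGET SKETCH_TARGET_UNIX
#elif defined(_WIN32)
#define SKETCH_TARGET SKETCH_TARGET_WINDOWS
#else
#define SKETCH_TARGET SKETCH_TARGET_ANY
#endif
#endif

/* Assumption: a 64-bit data model implies a 64-bit environment. */
#ifndef SKETCH_ENVIRON
#if defined(__LP64__) || defined(_LP64) || defined(_WIN64)
#define SKETCH_ENVIRON SKETCH_ENVIRON_64
#else
#define SKETCH_ENVIRON SKETCH_ENVIRON_32
#endif
#endif

/* Matching strings in the spirit of the *_STR macros. */
#if SKETCH_TARGET == SKETCH_TARGET_UNIX
#define SKETCH_TARGET_STR "UNIX"
#elif SKETCH_TARGET == SKETCH_TARGET_WINDOWS
#define SKETCH_TARGET_STR "Windows"
#else
#define SKETCH_TARGET_STR "Any"
#endif

#if SKETCH_ENVIRON == SKETCH_ENVIRON_64
#define SKETCH_ENVIRON_STR "64-bit"
#else
#define SKETCH_ENVIRON_STR "32-bit"
#endif

/* Accessors so that the rest of the code never tests compiler macros. */
const char *sketch_target_str(void)
{
    return SKETCH_TARGET_STR;
}

const char *sketch_environ_str(void)
{
    return SKETCH_ENVIRON_STR;
}
```

    The point of the arrangement is that only this one block mentions compiler-specific macros, so the rest of the code can test SKETCH_TARGET and SKETCH_ENVIRON alone.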

  2. Make any required changes in src/memory.c in order to support the new system.

    The mpatrol library, like the system malloc library it is replacing, must have some way of allocating memory from the system heap for a process. For UNIX systems, this is done by calling sbrk() and/or mmap() but this is likely to be completely different for other operating systems. The mpatrol library must also have some way of returning the allocated heap memory back to the operating system, although on systems with virtual memory this is not really an issue (see MP_DELETEHEAP in src/config.h). If there is currently no support in the mpatrol library for allocating and returning system heap memory for the new system then you must modify __mp_memalloc() and __mp_memfree() to add the support. You should define MP_MMAP_SUPPORT in src/config.h if the operating system is UNIX and the system variant supports the mmap() system call.
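
    To give an idea of the kind of code involved, here is a minimal sketch of allocating and returning heap memory with mmap() and munmap() on a UNIX system. The sketch_* names are hypothetical and this is not the code in __mp_memalloc() or __mp_memfree(); it also assumes that the system provides /dev/zero and ignores arithmetic overflow for very large requests.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round a request up to a whole number of pages. */
static size_t sketch_roundup(size_t size)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);

    return ((size + page - 1) / page) * page;
}

/* Obtain zero-filled memory from the system, or NULL on failure. */
void *sketch_memalloc(size_t size)
{
    void *block;
    int fd;

    if (size == 0)
        return NULL;
    /* A private mapping of /dev/zero behaves like fresh anonymous memory. */
    fd = open("/dev/zero", O_RDWR);
    if (fd == -1)
        return NULL;
    block = mmap(NULL, sketch_roundup(size), PROT_READ | PROT_WRITE,
                 MAP_PRIVATE, fd, 0);
    /* The mapping stays valid after the descriptor is closed. */
    close(fd);
    return (block == MAP_FAILED) ? NULL : block;
}

/* Return memory to the system; size must match the original request. */
int sketch_memfree(void *block, size_t size)
{
    return munmap(block, sketch_roundup(size));
}
```

    A port to a non-UNIX system would replace the bodies of these two functions with whatever the operating system provides for the same purpose.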

    Note that some (mainly embedded) systems may have no system heap available for a program to use. If that is the case then the mpatrol library can be built to allocate memory from a static array whose size is fixed at compile-time. The MP_ARRAY_SUPPORT macro should be defined in src/config.h and the MP_ARRAY_SIZE macro should be set to the maximum number of bytes that the simulated heap should be able to hold. Keep in mind that all of the internal mpatrol library data structures will also be allocated from this array so it is important to make it large enough.
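
    The idea of a simulated heap can be sketched as a simple allocator that hands out successive pieces of a fixed array. The names below (SKETCH_ARRAY_SIZE, sketch_array_alloc() and so on) are invented for illustration and the 4096 byte size is arbitrary; the real behaviour is governed by MP_ARRAY_SUPPORT and MP_ARRAY_SIZE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MP_ARRAY_SIZE. */
#define SKETCH_ARRAY_SIZE 4096

static unsigned char sketch_heap[SKETCH_ARRAY_SIZE];
static size_t sketch_used;

/* Carve size bytes out of the static array, aligned to a power of two.
   Returns NULL when the alignment is invalid or the array is exhausted. */
void *sketch_array_alloc(size_t size, size_t align)
{
    uintptr_t base = (uintptr_t) sketch_heap;
    uintptr_t start;
    size_t offset;

    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;
    /* Align the address itself, since the array may start anywhere. */
    start = (base + sketch_used + align - 1) & ~(uintptr_t) (align - 1);
    offset = (size_t) (start - base);
    if (offset > SKETCH_ARRAY_SIZE || size > SKETCH_ARRAY_SIZE - offset)
        return NULL;
    sketch_used = offset + size;
    return sketch_heap + offset;
}
```

    Because nothing is ever returned to the array in this sketch, it also shows why the array must be sized generously when internal data structures share it.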

    Operating systems with virtual memory allow mpatrol to protect certain regions of heap memory to ensure that they are not overwritten. The MP_PROTECT_SUPPORT macro in src/config.h controls whether the operating system supports this, and the __mp_memprotect() and __mp_memquery() functions should be updated to support the new system. You should also define MP_MINCORE_SUPPORT in src/config.h if the operating system is UNIX and the system variant supports the mincore() system call. The MP_WATCH_SUPPORT macro controls the support of software watchpoints in a similar way and the __mp_memwatch() function should be updated if they are supported.
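
    On UNIX systems memory protection is normally done with mprotect(), which only works on whole pages. The following sketch widens an arbitrary block to page boundaries before changing its protection. The sketch_* names and the three access levels are assumptions made for this example and do not reflect the actual interface of __mp_memprotect().

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical access levels for the example. */
enum sketch_access { SKETCH_NOACCESS, SKETCH_READONLY, SKETCH_READWRITE };

/* Change the protection of every page touched by [block, block + size).
   Returns 0 on success and -1 on failure, like mprotect() itself. */
int sketch_memprotect(void *block, size_t size, enum sketch_access mode)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t) block & ~(page - 1);
    uintptr_t end = ((uintptr_t) block + size + page - 1) & ~(page - 1);
    int prot = PROT_NONE;

    if (mode == SKETCH_READONLY)
        prot = PROT_READ;
    else if (mode == SKETCH_READWRITE)
        prot = PROT_READ | PROT_WRITE;
    return mprotect((void *) start, (size_t) (end - start), prot);
}
```

    Note that widening the range also affects any unrelated data that shares the first or last page, which is why protected regions are best allocated in whole pages.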

    If the new system is a UNIX system and it supports the /proc filesystem then you may wish to define MP_PROCFS_SUPPORT in src/config.h. However, this is only necessary if there is a way to detect the filename the current process was invoked with (MP_PROCFS_CMDNAME) or a way to obtain the filehandle of the executable file for the current process (MP_PROCFS_EXENAME). It may also be necessary if MP_WATCH_SUPPORT is defined and the only way to set the watchpoints is via a file in the /proc filesystem (MP_PROCFS_CTLNAME).

    Finally, you should add support for determining the system page size in pagesize() and the process identifier for the current process in __mp_processid() if the system is not already supported[1]. You will also have to add a way to determine the filename that the current process was invoked with in progname(), otherwise the PROGFILE option will always have to be used in order to read symbols from the executable file. This can be done in a multitude of ways, including examining global variables, making function calls to query the system or traversing the call stack.

  3. Make any required changes in src/stack.c in order to support stack traversal in the new processor architecture.

    If the new processor architecture is CISC (complex instruction set computer) then the chances are that you can easily find the frame pointer and return address of the current stack frame by simply looking at a constant offset from the parameter to the __mp_getframe() function. The call chain can then be obtained by following the frame pointer at each stage. This can sometimes be disrupted by optimisations that do not preserve the frame pointer but this is usually confined to leaf routines and is not normally an issue. The Intel x86 and Motorola 680x0 processor families are good examples to look at when implementing stack traversal for a CISC processor.

    On the other hand, things might not be so easy if the new processor architecture is RISC (reduced instruction set computer). Such processors do not always have fixed format stack frames[2] and so other means might have to be used. The Alpha and MIPS processor families are examples of these and code reading normally has to be used in order to find the call instruction from the calling routine. This then has to be done for every function in the call stack. An example of such code can be found for the generic MIPS implementation. Any assembler code that needs to be written to support the stack traversal implementation should be written in src/machine.c.

    If the GNU compiler is being used then it might be possible to use its __builtin_frame_address() and __builtin_return_address() builtin functions in order to provide stack traversal. These can only be used if they return `NULL' when the bottom of the call stack is reached, but on many architectures the GNU compiler does not implement this correctly and so this method of stack traversal cannot be used. Even if it can, it still imposes an upper limit on the size of the stack that can be traversed. If this is not an issue then it can be enabled with the MP_BUILTINSTACK_SUPPORT macro in src/config.h and the maximum size of the call stack that can be traversed can be set by changing the MP_MAXSTACK macro in the same file. The MP_FULLSTACK macro in src/config.h should be set for stack traversal implementations that have no limit to the maximum size of the call stack that can be traversed. Obviously that is not the case for MP_BUILTINSTACK_SUPPORT. A similar method can be used to traverse the stack using the backtrace() function from glibc with the MP_GLIBCBACKTRACE_SUPPORT preprocessor macro.
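
    The upper limit mentioned above is easy to see with backtrace(), which fills a caller-supplied array of fixed size. This is only a sketch: it assumes a C library that provides <execinfo.h> (as glibc does), and SKETCH_MAXSTACK and sketch_getstack() are made-up names standing in for whatever the library actually uses.

```c
#include <execinfo.h>

/* Hypothetical limit, playing the role of MP_MAXSTACK. */
#define SKETCH_MAXSTACK 16

/* Store up to SKETCH_MAXSTACK code addresses from the current call
   stack, innermost frame first, and return how many were stored.
   Frames beyond the limit are silently dropped. */
int sketch_getstack(void *addrs[SKETCH_MAXSTACK])
{
    return backtrace(addrs, SKETCH_MAXSTACK);
}
```

    Any call stack deeper than the array is truncated, which is the behaviour that MP_FULLSTACK implementations do not have.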

    Some operating systems have library functions that provide stack traversal facilities and so you may wish to make use of them by defining MP_LIBRARYSTACK_SUPPORT in src/config.h and implementing the code to call them in src/stack.c. Examples of systems that can make use of this capability are IRIX and Tru64, although they have a drawback in that they recursively call malloc() and so run more slowly than they otherwise would. Alternatively, if libunwind has support for the processor architecture then you can try defining MP_LIBUNWIND_SUPPORT in src/config.h to see if that works.

    If any functions from an external system library were used to help implement stack traversal for the new processor architecture then you may also have to modify the MP_SYSTEM_LIBS definitions in src/config.h, the __mp_lib* definitions in src/inter.c and the AC_CHECK_LIB() calls in extra/mpatrol.m4.

  4. Make any required changes in src/symbol.c in order to support any new object file formats and dynamic linkers.

    The best place to find information on the object file format and dynamic linker interface supported by a new system is the on-line manual pages and header files on that system. If that fails then try the hardcopy technical reference manuals that came with the system or the internet in order to find the information you need. There may also be standards that define the object file format and dynamic linker interface across several systems.

    If you defined a new FORMAT macro in src/target.h then you must add the code to support it in src/symbol.c. You will typically have to add new addsymbol() and addsymbols() functions that are specific to the new object file format and then add support for that format in __mp_addsymbols() and __mp_findsymbol(). If it is possible to easily read a line number table from the object file format then you may also want to extend the __mp_findsource() function to handle the new format as well in order to support the USEDEBUG option.
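
    Whatever the object file format, the symbols that are read end up being searched by address. A minimal sketch of such a lookup is shown below, assuming the symbols have been collected into an array sorted by start address. The structure layout and the sketch_findsymbol() name are hypothetical and are not how __mp_findsymbol() stores or searches its data.

```c
#include <stddef.h>

/* Hypothetical symbol record for the example. */
struct sketch_symbol
{
    const char *name;    /* symbol name */
    unsigned long addr;  /* start address */
    unsigned long size;  /* size in bytes */
};

/* Binary search for the symbol whose address range contains addr.
   The table must be sorted by ascending addr without overlaps.
   Returns NULL if no symbol covers the address. */
const struct sketch_symbol *
sketch_findsymbol(const struct sketch_symbol *table, size_t count,
                  unsigned long addr)
{
    size_t lo = 0;
    size_t hi = count;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (addr < table[mid].addr)
            hi = mid;
        else if (addr - table[mid].addr >= table[mid].size)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

    A new addsymbols() function for an unsupported format would then only have to fill in records of this kind, whatever the on-disk layout looks like.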

    If you defined a new DYNLINK macro in src/target.h then you must also add the code to support it in src/symbol.c. You will normally only have to extend the __mp_addextsymbols() function to support the new dynamic linker but there may be some extra work required to translate the base addresses of any symbols read from shared libraries into real addresses.

    In both cases, try to base the new code on the structure of the existing code since it has been proven to work well and there is no point in reinventing the wheel[3]. You might decide to make changes to an existing implementation instead; this was done with the COFF and XCOFF formats, for example.

    If any functions from an external object file access library were used to help read symbols from the new object file format then you may also have to modify the MP_SYMBOL_LIBS definitions in src/config.h, the __mp_lib* definitions in src/inter.c and the AC_CHECK_LIB() calls in extra/mpatrol.m4.

  5. Make any required changes in src/signals.c in order to obtain the address of an illegal memory access in the new system.

    If the system supports the SA_SIGINFO flag when setting up a signal handler with sigaction() then it supports architecture-independent determination of the address of an illegal memory access and the MP_SIGINFO_SUPPORT macro should be set in src/config.h.

    If this is not the case then an architecture-dependent method must be employed in order to obtain this information. On UNIX systems, signal handlers can have additional arguments that may be used to probe for the address of a segmentation violation or bus error. On Windows systems, an exception record can be obtained whenever an access violation occurs. In either case, the saved register containing the relevant address must be determined. If this is not done then the mpatrol library will compile correctly, but the addresses of illegal memory accesses can never be determined.

  6. Make any required changes in src/mutex.c in order to support threads in the new system.

    The mpatrol library must be able to lock its data structures in a multithreaded environment, otherwise two threads could, for example, allocate memory at the same time and corrupt the heap. On operating systems that have virtual memory, processes have their own address space and can have more than one thread of execution running at one time. On other operating systems, there is only one process (the operating system) and the threads are the user processes that all share the same address space. For that reason, you may wish to use semaphores on such systems since they have no support for threads in a conventional sense.

    For systems that do support threads, mutexes should be used to lock the mpatrol library data structures. On UNIX platforms, POSIX threads are used but this could easily be extended to other threads implementations. On Windows platforms, Win32 API threads are used. For other systems, POSIX threads are preferred but it should not be too hard to add support for others. There should also be a way to return the current thread identifier.
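
    With POSIX threads, the two facilities needed here (a lock around shared data and a thread identifier) might be sketched as follows. The names are hypothetical, the counter merely stands in for the library data structures, and the cast in sketch_threadid() assumes that pthread_t is an integer or pointer type, which POSIX does not guarantee.

```c
#include <pthread.h>

static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long sketch_allocs;

/* Update shared state while holding the lock and return the new value. */
unsigned long sketch_count_alloc(void)
{
    unsigned long n;

    pthread_mutex_lock(&sketch_lock);
    n = ++sketch_allocs;
    pthread_mutex_unlock(&sketch_lock);
    return n;
}

/* Return an identifier for the calling thread.
   Assumption: pthread_t can be converted to unsigned long. */
unsigned long sketch_threadid(void)
{
    return (unsigned long) pthread_self();
}
```

    A port to another threads implementation would keep the same two entry points and change only the calls inside them.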

    You should also determine if it is necessary to define any special macros in order to obtain all of the required threadsafe definitions from the system header files. Many compilers require an option to be specified on the command line in order to compile threadsafe code, but some still only require a preprocessor macro to be defined during compilation. For example, the _REENTRANT macro is defined for Solaris systems so that mpatrol can make use of the threadsafe definitions. Any such macros should be defined in src/config.h when MP_THREADS_SUPPORT is defined.

    The multithreaded version of the mpatrol library must be initialised before a process becomes multithreaded and so there must be a way to do this on a new system.

    The MP_INIT_SUPPORT macro should be defined in src/config.h if the new system supports `.init' and `.fini' sections that get executed before and after main() respectively. Both the contents of the `.init' section (which should call __mp_initmutexes() and __mp_init()) and the `.fini' section (which should call __mp_fini()) should be written in src/machine.c in assembler code.

    There are also other methods to initialise and terminate the mpatrol library in src/inter.c so you may need to use one of them (or add a new method of your own) for the new system. Note that if MP_USE_ATEXIT is defined in src/config.h then these methods of terminating the mpatrol library when a process ends are replaced by registering the __mp_fini() function with atexit().

    There may be problems if the mpatrol library is built to override malloc() and related functions if the system C library calls them before the mpatrol library can be initialised. There is a function in src/inter.c on UNIX and Windows platforms called crt_initialised() which checks to see if it is safe to initialise the mpatrol library, and if not the relevant functions will use sbrk() to allocate the memory. You may have to modify crt_initialised() to support the new system if there are initialisation problems.

    If there are no special methods to initialise the multithreaded version of the mpatrol library on a new system then it will simply be initialised at the first call to one of its functions, hopefully before the process has become multithreaded.
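
    The fallback of initialising at the first call, combined with atexit() for termination, can be sketched as below. Everything named sketch_* is invented for the example and malloc() merely stands in for the real allocation routine; none of this is the code in src/inter.c.

```c
#include <stdlib.h>

static int sketch_initialised;
static int sketch_init_calls;

/* Termination routine, registered once with atexit(). */
static void sketch_fini(void)
{
    sketch_initialised = 0;
}

/* Initialise on demand.  The check is not protected by a lock, so this
   is only safe if the first call happens before any threads start. */
static void sketch_init(void)
{
    if (sketch_initialised)
        return;
    sketch_initialised = 1;
    sketch_init_calls++;
    atexit(sketch_fini);
}

/* Every public entry point begins by making sure that the library is
   initialised. */
void *sketch_alloc(size_t size)
{
    sketch_init();
    return malloc(size);
}

/* Report how many times initialisation has actually run. */
int sketch_init_count(void)
{
    return sketch_init_calls;
}
```

    The unprotected check in sketch_init() is exactly why this method relies on the first call being made while the process is still single-threaded.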

    If there is support for reading symbols from object files on the new system then you should compile and run the following test with the mpatrol library to check to see if there is support for calling functions by their start address. This is not always true on certain systems and will most likely result in the test crashing if that is the case. If the test works then the MP_INITFUNC_SUPPORT macro should be set in src/config.h.

              #include <stdio.h>
              #include <stdlib.h>
              #include "mpatrol.h"
              
              
              void __mp_init_test(void)
              {
                  puts("__mp_init_* functions work");
              }
              
              
              void __mp_fini_test(void)
              {
                  puts("__mp_fini_* functions work");
              }
              
              
              int main(void)
              {
                  malloc(1);
                  puts("there should be a line of output above and below");
                  return EXIT_SUCCESS;
              }
         

    If any functions from an external threads library were used to lock the data structures of the multithreaded version of the mpatrol library then you may also have to modify the MP_THREADS_LIBS definitions in src/config.h, the __mp_lib* definitions in src/inter.c and the AC_CHECK_LIB() calls in extra/mpatrol.m4.

  7. Make any required changes to src/diag.c in order to support the new system.

    If the directory separation characters used by filesystem pathnames on the new system are different to those already supported then you must modify processfile(), __mp_logfile(), __mp_proffile() and __mp_tracefile() in order to support them. The mpatrol library needs to know how to extract and join the directory and filename components in a pathname in order to support the special characters that may appear in the filenames specified in the LOGFILE, PROFFILE and TRACEFILE options.
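
    Splitting a pathname into its directory and filename components might look something like the following sketch, where the set of separator characters is passed in so that it can differ between systems. The function names are hypothetical and the separator sets shown in the usage note are only examples.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer to the filename component of path, that is, the
   text following the last character that appears in seps. */
const char *sketch_basename(const char *path, const char *seps)
{
    const char *base = path;
    const char *p;

    for (p = path; *p != '\0'; p++)
        if (strchr(seps, *p) != NULL)
            base = p + 1;
    return base;
}

/* Return the length of the directory component, including its
   trailing separator, or 0 if there is none. */
size_t sketch_dirlen(const char *path, const char *seps)
{
    return (size_t) (sketch_basename(path, seps) - path);
}
```

    For instance, sketch_basename("/tmp/mpatrol.log", "/") yields "mpatrol.log", while a system with drive or volume names might pass a separator set such as "\\/:" instead.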

  8. Make any required changes to src/version.c in order to support the new system.

    Different operating systems have different ways of embedding version information into libraries. For example, on AmigaOS the version command looks for the `$VER:' string in a binary file and displays any information following it. If the new system uses a special format for embedding version information then an alternative definition for __mp_version should be added to src/version.c. It might also be useful to make any necessary changes to the mupdate shell script in the bin directory in order to support the new format, although that is not strictly required as it is only used when building automated mpatrol releases.

    The RCS revision string of each mpatrol source file can also be embedded into the mpatrol library and its tools. The way this is done is controlled by the MP_IDENT_SUPPORT macro in src/config.h. If it is set then the system supports placing these strings in a special section in the object file via the #ident directive, otherwise the strings will be placed in a data section in the object file.

  9. Make any required changes in src/mpatrol.c in order to support executing external commands.

    The mpatrol command should be modified to support the execution of external commands on a new operating system. The exec() family of functions are used on UNIX platforms, while the spawn() family of functions are used on Windows platforms. The ANSI C system() function is currently used on all other platforms, but that runs the command indirectly via the system command line interpreter (shell) which is not usually very efficient. You may also have to add the ability to find any commands using a search path.
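
    On UNIX the direct approach amounts to fork(), one of the exec() functions and waitpid(), roughly as in the sketch below. The sketch_run() name and the use of exit status 127 for a command that cannot be executed are choices made for this example rather than the behaviour of the mpatrol command. It uses execvp() so that the command is found via the PATH environment variable.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments, without going through a shell.
   Returns the exit status of the command, 127 if it could not be
   executed, or -1 if the child could not be created or did not exit
   normally. */
int sketch_run(char *const argv[])
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0)
    {
        /* Child: execvp() only returns if the command cannot be run. */
        execvp(argv[0], argv);
        _exit(127);
    }
    /* Parent: wait for the child to finish. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    A system without fork() would need an equivalent that creates the new process and waits for it in a single step, which is what the spawn() family does.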

    If the new operating system can support the --dynamic option of the mpatrol command then the MP_PRELOAD_SUPPORT macro should be defined in src/config.h. The name of the environment variable that must be used to specify the list of shared libraries to preload should be given in MP_PRELOAD_NAME and the library separator string for the list should be given in MP_PRELOAD_SEP. The MP_LIBNAME macro may also need to be modified if the naming convention of shared libraries is different on the new system. Note that the __mp_editfile() function in src/diag.c may also need to be modified to prevent editor processes from being affected by the --dynamic option.
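
    Building the value of the preload environment variable is a matter of placing the mpatrol library in front of anything already listed there, joined by the separator. A sketch is given below; sketch_preload() is a made-up helper and the library name used in the usage note is illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Write "lib", or "lib<sep>current" if current is non-empty, into buf.
   Returns 0 on success or -1 if buf is too small. */
int sketch_preload(char *buf, size_t size, const char *lib,
                   const char *current, const char *sep)
{
    int append = (current != NULL && current[0] != '\0');
    size_t need = strlen(lib) + 1;

    if (append)
        need += strlen(sep) + strlen(current);
    if (need > size)
        return -1;
    strcpy(buf, lib);
    if (append)
    {
        strcat(buf, sep);
        strcat(buf, current);
    }
    return 0;
}
```

    On a system whose dynamic linker reads a variable such as LD_PRELOAD, the result would then be passed to setenv() before the command is executed, with the existing value obtained from getenv().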

  10. Make any required changes in src/mptrace.c in order to support any new window systems.

    The mptrace command may be built as a text-only command line tool, or it may be built with GUI support if the MP_GUI_SUPPORT macro is defined in src/config.h. If it is built with GUI support and the --gui option is specified then it becomes an event-driven tool and the code in src/mptrace.c has been written to reflect that. The mptrace command currently only has Motif GUI support but if you wish to add support for a new window system then it shouldn't be too hard to do. Note that you will probably have to add additional libraries to the Makefile when building mptrace with MP_GUI_SUPPORT defined.

  11. Make any required changes to the shell scripts in the bin directory.

    The mpsym, mpedit and hexwords commands all require UNIX systems, or UNIX tools, to run. If the new system has the ability to run these commands then you should check that they run as expected. If not, you should make the necessary modifications to make them work, although it should be in a generic fashion as there are no checks for specific platforms or processors in these files. You may also wish to add support for other debuggers in mpsym and other editors in mpedit.

  12. Add a new subdirectory to the build directory if a new operating system is being supported.

    A new Makefile should be added in the new subdirectory along with any extra system-specific files that might be needed to build the mpatrol library on the new system. The new Makefile should be based upon one of the existing Makefiles in the other subdirectories but should obviously differ in the platform-dependent areas. You may wish to add more than one Makefile to support different types of compilers on the new operating system.

    You must also decide which object files should get built into the mpatrol library. If it is not safe to override the system malloc() routines on the new system then you should not include src/malloc.c, and the same goes for src/cplus.c and the C++ operators. If there is no sbrk() function provided on the new operating system then you should include src/sbrk.c if you need to call sbrk() in src/inter.c.

    If the new operating system uses a special archive or package format then you should add support for it by adding a new subdirectory to the pkg directory. A build script should be added to the new subdirectory that will automatically build the archive or package file from scratch. Include any additional files that you need to perform the build in the new subdirectory as well.


Footnotes

[1] You will also have to make any changes to pagesize() in src/mpalloc.c and possibly also have to define MP_MEMALIGN_SUPPORT in src/config.h if the new system supports the memalign() function.

[2] Although some do, and you can follow the instructions for CISC processors above in order to provide stack traversal support for them.

[3] You might also be interested to note that you can safely call malloc() in this code to allocate memory — just remember to clean up after yourself!