Ever since the selection of Mach2.5 as the basis of the OSF/1 operating system, OSF has intended to base its OS developments on the Mach3.0 microkernel, which provides a scalable, extensible, OS-neutral set of abstractions. The OSF Research Institute has made significant improvements and extensions to the original CMU Mach3.0 microkernel, and the result, named OSF MK, is still available for free. The latest versions of OSF/1 are based on OSF MK but are encumbered by commercial licenses. We decided to produce an unencumbered UNIX-like server on top of OSF MK, providing our members and the research community with a fully unencumbered development environment for their microkernel developments.
In this paper, we first describe the functionalities that were added to OSF MK by the OSF Research Institute and some performance improvements. We describe the architecture of the Linux server, emphasizing the areas requiring interaction between the Linux code and the microkernel. We present some performance figures for the Intel x86 platform and introduce the port of OSF MK and the Linux server on the Apple PowerMac platform.
OSF MK and the Linux server are or will shortly be available for free from the OSF Research Institute for both the Intel x86 and Apple PowerMac platforms.
In addition to the advanced features already present in the Mach microkernel, such as SMP support, network-transparent IPC and support for application-specific paging policies, the promise of a microkernel architecture was very appealing. OSF's customers have interests in many different operating systems, so we needed a way to pursue a program of operating systems research that was not unduly operating-system specific: OS-specific behavior needed to be separated from OS-independent functionality. Our hope was that a microkernel-based architecture would prove to be more portable and modular than the monolithic systems that were prevalent at the time.
We are interested in promoting the use of our systems technology within the industrial and research communities. The OSF MK kernel is freely available, but the OSF/1 server is encumbered by commercial licenses including the SVR2 Unix license. For many organizations this is not a problem because they already have a SVR2 license, but for others the licenses have proved to be serious obstacles. In an effort to remove these obstacles we decided to produce a free UNIX-like server that would suit the needs of Mach developers.
2. OSF MK, the OSF microkernel
Our original interest in Mach was due to the powerful abstractions provided by the kernel, its operating system neutrality and the promise of greater portability and modularity. To a large extent, the microkernel has lived up to our expectations. OSF and some of our customers have ported the kernel to several platforms without undue difficulty. We and our collaborators have hosted different operating system personalities on top of the kernel. And for the last few years we have had an active research program that exploits and extends the abstractions provided by the microkernel.
The kernel itself is somewhat complex, yet the task of porting it to a new hardware platform is fairly straightforward. We support our version of the microkernel on several different hardware platforms. These include the Intel x86 family, the Intel i860, the DEC Alpha, the HP PA-RISC and the Apple/IBM/Motorola PowerPC. The microkernel has a clean separation of hardware dependent and hardware independent functionality. Writing the hardware dependent code for the microkernel usually requires approximately 4 to 6 months. The difficulty in creating an adequate suite of device drivers can easily exceed the effort necessary to port the rest of the kernel. In some cases the effort can be reduced by converting pre-existing drivers. We will discuss the effort to port the kernel to the PowerMac later in the paper.
Besides different versions of Unix, MacOS and MS/DOS have also been hosted on Mach. IBM has hosted OS/2 on their own version of the Mach microkernel. Early efforts to layer OS personality servers on top of the microkernel have had disappointing performance due to the extra message-based communication between the system components. Often, as much as a 40% performance cost has been reported.
Thread migration was derived from work done on Mach4.0 at the University of Utah. It aims at reducing the cost of switching context between the sender and the receiver of an RPC.
The ``thread'' abstraction has been split into two new entities: the thread, which represents the execution state within a given task, and the ``shuttle'', the schedulable entity, which can migrate from the client to the server during an RPC.
In the current implementation, it also requires that the sender and the receiver be collocated (see below), i.e. that they are in the same address space.
Modularity is preserved because all interactions between the kernel and the server are still done through MIG (Mach Interface Generator) interfaces at the source level, but the generated code can dynamically detect if the sender and the receiver are in the same address space and avoid unnecessary data copying.
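The dynamic detection can be sketched roughly as follows (a purely illustrative model with hypothetical names; the real MIG-generated code is considerably more involved):

```c
/* Illustrative model of a short-circuiting stub (hypothetical names;
 * the actual MIG-generated code differs): the stub tests at run time
 * whether client and server share an address space and, if so, calls
 * the server routine directly instead of marshalling a message. */
#include <assert.h>
#include <string.h>

int client_is_collocated;   /* set when client and server are in the
                               same address space */

/* The server-side work routine. */
int server_op(const char *in, char *out, unsigned len)
{
    memcpy(out, in, len);           /* stand-in for real work */
    return 0;
}

/* Message path: marshal into a buffer, then deliver (extra copy). */
int rpc_via_message(const char *in, char *out, unsigned len)
{
    char msg[256];
    memcpy(msg, in, len);           /* client data -> message buffer */
    return server_op(msg, out, len);
}

/* The generated stub picks the path dynamically. */
int rpc_stub(const char *in, char *out, unsigned len)
{
    if (client_is_collocated)
        return server_op(in, out, len);   /* direct call, no copy */
    return rpc_via_message(in, out, len); /* ordinary message send */
}
```

When the flag is set, the marshalling copy disappears and the stub degenerates into little more than a procedure call.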
Collocating a server as a kernel task should not be done lightly because an incorrect or malicious collocated server could corrupt the kernel. But once a server has been debugged in user-mode, it can be loaded into the microkernel for best performance.
The collocatable servers of OSF MK are similar in spirit to the ``system actors'' found in Chorus, another microkernel based system developed originally at INRIA.
Used together, thread migration, short-circuited RPC and collocation can almost reduce an RPC to a simple procedure call. The kernel-to-server system call exception RPCs and the server-to-kernel system calls benefit from these optimizations, greatly improving overall system performance.
A Real-Time microkernel
In addition to its portability and its operating system neutrality, we were interested in the microkernel as a foundation for research into real-time and distributed computing issues. Mach was not designed as a real-time operating system; the original design focus was on scalable, multiprocessing timesharing systems, and lazy evaluation techniques were used extensively in the design of the VM and scheduling subsystems. In order for OSF MK to be a suitable foundation for real-time applications we had to make enhancements to the microkernel ranging from the prosaic, such as pre-emption, clocks and alarms, to the innovative, such as real-time RPC, the CORDS framework for network protocols and ``paths''.
Before the microkernel can be suitable for real-time applications it must provide reasonable, predictable behavior. Complex real-time operating systems, like the various real-time Unix systems and the OSF microkernel, all use pre-emption to avoid indeterminate event latencies, because there are certain features that, though desirable, are inherently unpredictable. Our pre-emption strategy exploits the fine-grained locks already in the kernel to provide Symmetric Multi-Processing (SMP) support. This naturally led to a fully preemptible system.
Mach 3.0 had simple and complex locks. Simple locks provided mutual exclusion and complex locks provided multiple-reader, single-writer semantics. In OSF MK, simple locks were enhanced to disable pre-emption while held. This resulted in a working system, but because of the original code's extensive use of simple locks the resulting system had unacceptable event latencies. To deal with this problem we added a new type of lock: the mutex lock. A mutex lock is an inexpensive, mutual-exclusion blocking lock; the difference between a simple lock and a mutex is that a kernel thread can be preempted while holding a mutex. Most of the algorithms that used simple locks were converted to the new mutex locks.
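The distinction between the two lock flavours can be modelled in a few lines (a toy model with names of our own; the real locks involve spinning, blocking queues and SMP interlocks):

```c
/* Toy model of the two lock flavours (names are ours): holding a
 * simple lock disables pre-emption, while a mutex leaves its holder
 * preemptible and makes contenders block instead of spin. */
#include <assert.h>

int preempt_disable_count;      /* > 0 means pre-emption is off */
int simple_locked, mutex_locked;

void simple_lock(void)          /* would spin until free */
{
    simple_locked = 1;
    preempt_disable_count++;    /* holder must not be preempted */
}

void simple_unlock(void)
{
    simple_locked = 0;
    preempt_disable_count--;
}

void mutex_lock(void)           /* would block (not spin) until free */
{
    mutex_locked = 1;           /* pre-emption stays enabled */
}

void mutex_unlock(void)
{
    mutex_locked = 0;
}

int preemptible(void)
{
    return preempt_disable_count == 0;
}
```

The event latency problem follows directly from the model: every simple-lock hold time is a window during which pre-emption is off.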
In the initial version of the system it was possible to have unwarranted context switching between timesharing threads due to pre-emption. This problem was corrected by a simple modification to the pre-emption code: pre-emption now occurs only if the higher-priority thread is a fixed-priority thread. With this change, the cost of enabling pre-emption in an SMP environment was negligible when measured by a standard benchmark like AIM III. In a uni-processor environment, the cost of pre-emption is identical to the cost of enabling SMP locks, i.e., approximately 10%. Since our pre-emption mechanisms are so closely integrated with the SMP locking mechanisms, this is not surprising.
Kernel pre-emption created a new problem - a type of scheduling anomaly sometimes referred to as a priority inversion. Priority inversions can occur when a high priority thread becomes dependent on or blocked by a lower priority, preempted thread. We designed a straightforward priority boosting protocol inside the kernel to deal with priority inversions. Priority boosting propagates across dependencies, not just locks. If a thread blocks and becomes dependent on another thread, then the thread controlling the dependency is boosted. If the boosted thread is blocked by another dependency then the boosting propagates down the dependency chain. A thread remains boosted until it releases its last dependency.
This algorithm is not perfect in that some threads remain boosted longer than absolutely necessary. But it is very simple and inexpensive.
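The boosting protocol sketched above can be modelled as a walk down the dependency chain (illustrative only; the field and function names are ours, not the kernel's):

```c
/* Minimal model of the boosting protocol: when a thread blocks on a
 * dependency, the effective priority of the thread controlling that
 * dependency is raised, and the boost propagates down the chain. */
#include <assert.h>
#include <stddef.h>

struct thread {
    int base_pri;               /* assigned priority (higher = more urgent) */
    int eff_pri;                /* priority after boosting                  */
    struct thread *blocked_on;  /* thread controlling our dependency        */
};

/* Walk the dependency chain, raising effective priorities. */
void boost(struct thread *t, int pri)
{
    for (; t != NULL; t = t->blocked_on)
        if (t->eff_pri < pri)
            t->eff_pri = pri;
}

void block_on(struct thread *waiter, struct thread *holder)
{
    waiter->blocked_on = holder;
    boost(holder, waiter->eff_pri);
}

/* The boost ends only when the last dependency is released. */
void release_last_dependency(struct thread *t)
{
    t->eff_pri = t->base_pri;
}
```

The imperfection mentioned above is visible in the model: a thread in the middle of a chain keeps its boost until it releases its own last dependency, even if the high-priority waiter has long since been satisfied.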
The Real-Time RPC is not layered on top of message-based IPC. Implementing RT-RPC as a new kernel service had important advantages:
- RPC-specific optimizations can be made along the entire RPC path (our RPC is twice as fast as the optimized Mach3.0 RPC).
- Real-time behaviors specific to RPC, such as alerts, orphan detection, predictable delivery and nested time constraint propagation, are possible.
- An efficient, unified programming model for invoking operations across module boundaries within a task, across the task/kernel boundary or across task boundaries is possible.
Sometimes it is important to signal or generate an exception at the head of an RPC chain rather than a thread somewhere in the middle. One reason for doing this could be the elapsing of a deadline specified by one of the threads in the chain. Alerts are the mechanism used, by either the kernel or an application, to generate a timely exception at the head of an RPC chain.
Node failures or other events such as task or thread termination can result in broken chains. Detecting and eliminating the orphaned chain fragment in a timely fashion is important in real-time systems. Responding to whatever failure or event caused the chain to break is as important as responding to any other external event. In some systems, timely response to failures is more important than the processing of ordinary events.
Characterization Tools (ETAP)
ETAP (Event Trace Analysis Package) is a tool for characterizing the performance and behavior of real-time applications as well as the system software. ETAP is straightforward in design. The kernel reserves a block of memory as a circular message buffer, whose size is configurable. The kernel has been instrumented with a variety of probes; when activated, these probes create entries in the circular buffer. Probe entries contain a type field, a time-stamp, a thread ID tuple, and probe-specific information. Probes can be used to capture a wide range of information such as context switching events, system calls, lock events, device events, etc. There are global and per-thread probes. Any subset of the probes can be dynamically activated or deactivated. Applications with probes write to the buffer; a second task reads the buffer and records it on disk, where the information can subsequently be analyzed using different report generation programs. When configured into the kernel, inactive probes incur an insignificant overhead, approximately 1 percent when running AIM III.
The thread ID tuple identifies the thread and its shuttle or RPC chain. The RPC chain identifier allows us to determine which client thread a server is acting for. This is an invaluable tool for tracking the causal dependency of events in a client/server system. It has also become a valuable tool for debugging the kernel and applications.
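The overall shape of the facility is easy to convey in code (a much-simplified model; the actual ETAP record layout and probe set differ):

```c
/* Sketch of an ETAP-style trace facility (our own simplified layout):
 * instrumented code drops fixed-size records into a circular buffer;
 * a reader task drains them later. Activation is a per-type flag. */
#include <assert.h>

#define TRACE_SLOTS 8

struct probe_rec {
    int type;                /* kind of event (context switch, syscall...) */
    unsigned long timestamp; /* when it happened                           */
    int task_id, thread_id;  /* the thread ID tuple                        */
    unsigned long data;      /* probe-specific information                 */
};

struct probe_rec ring[TRACE_SLOTS];
unsigned head, count;
int probe_enabled[16];       /* dynamic (de)activation per probe type */

void probe(int type, unsigned long ts, int task, int thread,
           unsigned long data)
{
    struct probe_rec r = { type, ts, task, thread, data };
    if (!probe_enabled[type])
        return;                       /* inactive probes cost almost nothing */
    ring[head] = r;
    head = (head + 1) % TRACE_SLOTS;  /* overwrite oldest when full */
    if (count < TRACE_SLOTS)
        count++;
}

unsigned trace_count(void) { return count; }
```

The cheap early-out for inactive probes is what keeps the overhead of a fully instrumented kernel down to the percent range quoted above.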
In addition to the work already described we have made a variety of relatively small changes and additions to the microkernel. These include:
Networking with CORDS
The support for networks in Mach3.0 was limited to the packet filter. Protocols and network-transparent IPC were expected to be implemented as user-space servers. This had a negative effect on the performance of most uses of the network. The architecture also presented significant obstacles to a correct implementation of network IPC. In OSF MK we have added an object-oriented framework for network protocols, Communication Objects for Reliable, Distributed Systems (CORDS). CORDS is derived from the x-kernel, developed at the University of Arizona.
The CORDS framework has many features to simplify the task of implementing network protocols. Complex protocols can be decomposed into a graph of ``micro-protocols''. A protocol graph can be extended across protection boundaries permitting portions of a protocol graph to exist in a task while other parts exist in the kernel.
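The push of a message down a protocol graph can be modelled as follows (a toy in the spirit of the x-kernel; the real CORDS interfaces are object-oriented and far richer):

```c
/* Toy protocol graph of ``micro-protocols'': each layer prepends its
 * header and hands the packet to the layer below. Names are ours. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct proto {
    const char *name;
    struct proto *below;        /* next micro-protocol in the graph */
};

/* Push a message down the graph, prepending each layer's header. */
void push(struct proto *p, char *pkt, size_t cap)
{
    char tmp[128];
    for (; p != NULL; p = p->below) {
        snprintf(tmp, sizeof tmp, "[%s]%s", p->name, pkt);
        strncpy(pkt, tmp, cap - 1);
        pkt[cap - 1] = '\0';
    }
}
```

Because each node only knows the node below it, the graph can be cut at any edge, which is what permits part of a graph to live in a task while the rest lives in the kernel.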
Multi-Computers and Clusters
DIPC (Distributed IPC) and XMM (eXtended Memory Management) provide transparent internode communication and shared memory on NORMA (No Remote Memory Access) architectures.
DIPC extends Mach IPC in a way that permits applications running on any node to view Mach abstractions such as tasks, threads, memory objects and ports in a transparent way. XMM supports distributed shared memory.
With all these extensions, the microkernel has grown to become unacceptably large. We want the microkernel to run on low-end machines and we also want to target embedded systems, so we embarked on a program to make most of the microkernel's features configurable. The target is a minimal microkernel that could run on a compute-only node and would only perform basic scheduling and IPC.
We are developing or planning other projects on OSF MK, which are less relevant to the Linux server project because they are not integrated in the mainline microkernel or not freely available. These projects include:
PowerMacintosh machines use a variety of PowerPC processors, together with very different machine architectures, ranging from a board architecture very close to that of the 68000-based Macs through to the latest CHRP machines. We initially targeted the PowerMac 8100/80 machines, which contain the PowerPC 601 microprocessor and a board architecture similar to that of the 68000 Macs.
To illustrate the clean separation between machine-dependent and machine-independent code, we note that in porting OSF MK to the PowerMac, the modifications to the generic parts of OSF MK consisted of two minor source-file changes and additions to preprocessor options.
The compiler tool chain chosen was GCC 2.7.1, which supports cross-compilation of PowerPC code using the ELF binary format. We also used a cross version of GDB for remote debugging over a serial line, allowing symbolic debugging of the kernel from an extremely early stage.
Once the tool chain is in place, work can begin on porting the kernel. Initially, the only machine-dependent requirements are a means of I/O, traditionally done via a serial port connected to the development machine. Below is a list of the steps taken in the kernel port:
Once the minimal kernel functionality is complete, there remains the issue of device drivers. Device drivers are more platform dependent than they are processor dependent, and on many platforms device drivers may be reused or easily adapted from those written for previous ports. This was the case for both the serial line driver and for the SCSI controller on the PowerMacs. Writing a small stub of PowerMac-dependent DMA code meant that the original drivers could be used with little modification.
In any first implementation of an operating system, the trade-offs of simplicity and debuggability are weighed against those of performance and functionality.
4. Linux Server Architecture
A Single Server
Our Linux server is a ``single server'', meaning that the entire Linux functionality resides in one single Mach task. The alternative to this design is the ``multi-server'' design, where functionality is split between smaller specialized tasks communicating through Mach RPC.
A Multi-Threaded Server
A server on top of Mach simply receives and replies to requests from user tasks or from the microkernel. It has no explicit control over scheduling or hardware interrupts, so it cannot decide what it needs or wants to do at a given time. We do not want to add code everywhere to check if there is something more important to do (like receive an incoming network packet or disk block) or to manage explicit context switches when we can rely on Mach threads and the user-mode ``cthreads'' library. This library offers various synchronization primitives (simple locks, mutexes and condition variables) and hides most of the necessary synchronization of the underlying Mach kernel threads.
The system is serialized by a global mutex. Server threads must acquire it before doing anything sensible and release it when about to block.
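The resulting locking convention looks roughly like this (a schematic model; the real server uses the cthreads mutex and condition primitives):

```c
/* Schematic model of the server's locking convention (names are
 * ours): take the global mutex before touching shared state, and
 * drop it around every blocking point. */
#include <assert.h>

int global_mutex_held;

void server_lock(void)   { global_mutex_held = 1; }
void server_unlock(void) { global_mutex_held = 0; }

int always_ready(void)   { return 1; }   /* trivial event for the model */

/* Every blocking point follows the same unlock/block/relock pattern. */
int server_blocking_wait(int (*event_ready)(void))
{
    server_unlock();            /* never sleep holding the global mutex */
    while (!event_ready())
        ;                       /* stand-in for condition_wait() */
    server_lock();              /* re-serialize before continuing */
    return 0;
}
```

This coarse-grained discipline trades concurrency inside the server for simplicity: no finer-grained locking of Linux data structures is needed.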
Although the emulation library approach provides excellent performance, it means that the server functionality is shared between the emulation library and the server itself, leading to extra complexity and consistency problems; the server cannot really treat the emulation library like the rest of the user code, especially with respect to signal handling. The emulation library is not protected from user access and is therefore a potential Trojan horse for a malicious user. It is extremely complex (and inefficient) to protect the server against malicious usage of the emulation library's privileged communication with the server. Furthermore, multi-threaded applications imply even more complexity for the emulation library, which has to be fully reentrant and has to identify the user threads.
Combined with the collocation, thread migration and short-circuited RPC microkernel improvements, this method has proved to have competitive performance.
User Memory Access
Having the Linux server running as a regular user task makes it harder for it to access the memory of its user processes. The monolithic Linux kernel just uses segment registers to get inexpensive access to the user address space, but the Linux server has to use the Mach VM interfaces. Since this is also a critical aspect of overall system performance, we cannot afford the overhead of switching to the microkernel for each access to user memory. Instead, we map the necessary user memory areas into the server's address space using Mach VM services. Once the mapping is done, the server can access the memory without any performance penalty.
The device drivers are in the microkernel, but the server has to access them and let its processes use them. Linux handles device numbers and uses its own device operation routines. Mach names its devices with regular names (``console'', ``hd0a'', ``fd0a'', ``sd0a'', etc.) and offers its own device interfaces. In the Linux server, we just added a generic emulation layer, replacing the bottom half of most Linux device drivers.
The device specific code is therefore reduced to the initialization routine, and most of these routines only differ by the Mach device name they register.
``schedule'' calls ``condition_wait'' when the current task's state is no longer TASK_RUNNABLE, and we call ``condition_signal'' whenever the task becomes runnable again.
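The pattern looks roughly as follows (a schematic model in which the cthreads condition variable is replaced by a flag so the fragment can run inline; names are ours):

```c
/* Schematic model of the scheduling emulation: each Linux task gets
 * a condition; schedule() waits on it while the task is not runnable
 * and wake_up signals it. The condition is modelled by a flag. */
#include <assert.h>

#define TASK_RUNNABLE 0
#define TASK_SLEEPING 1

struct task {
    int state;
    int cond_signalled;     /* stand-in for the per-task condition */
};

void condition_signal(struct task *t) { t->cond_signalled = 1; }

void condition_wait(struct task *t)
{
    while (!t->cond_signalled)
        ;                   /* real code blocks here, releasing the
                               global mutex while asleep */
    t->cond_signalled = 0;
}

void wake_up_task(struct task *t)
{
    t->state = TASK_RUNNABLE;
    condition_signal(t);    /* as done when the task becomes runnable */
}

void schedule(struct task *cur)
{
    while (cur->state != TASK_RUNNABLE)
        condition_wait(cur);
}
```

The re-check of the state after each wait mirrors Linux's own schedule() loop and guards against spurious wakeups.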
This problem has affected all Unix servers on top of Mach3. Some decided to solve this issue by adding an extra thread in the emulation library, to listen for messages from the server, and forcing the real user thread to check for signals when required. Despite the fact that it adds even more complexity to the emulation library, this could not be applied to the Linux server because we rejected the emulation library solution in the first place.
Our solution was to implement fake interrupts to allow the Linux server to regain control of a user process even if it does not cooperate. The server takes control of the user thread, gets its state and jumps to the system call return code where signals will be processed. Race conditions with a possibly incoming or returning system call are avoided by suspending the user thread and making sure that it is suspended in a safe place, using the ``thread_abort_safely'' Mach service. Of course, thread_abort_safely will fail if there is an exception message on its way to or from the server.
The Linux server does not receive clock interrupts and the only way for it to count time is to use the Mach alarm services, which are obviously more expensive than a simple increment every 10 milliseconds. The OSF/1 server does not manage its own idea of the time and relies on the microkernel for that. It has a time-out thread which keeps on blocking and requesting to be woken up by Mach when the next OSF/1 time-out expires. Unfortunately, the wide usage of ``jiffies'' and our desire to maximize code-reuse forced us into emulating the Linux kernel's behavior.
A jiffies thread is woken up at regular intervals by the microkernel and increments ``jiffies'' by the number of clock ticks that have passed during its sleep. The interval could be set to exactly one clock tick, in which case we would have the same clock precision as the monolithic Linux, at the expense of a context switch and some overhead every 10 ms. Although we have not measured the impact of this overhead yet, we currently use a 100ms interval.
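The catch-up arithmetic performed at each wakeup is simple (names are ours):

```c
/* Arithmetic behind the jiffies thread: woken at some coarse
 * interval, it credits however many 10 ms ticks elapsed while it
 * slept, so the counter stays correct despite the coarse wakeups. */
#include <assert.h>

#define TICK_MS 10UL            /* one Linux clock tick = 10 ms */

unsigned long jiffies;
unsigned long last_wakeup_ms;

void jiffies_thread_wakeup(unsigned long now_ms)
{
    jiffies += (now_ms - last_wakeup_ms) / TICK_MS;
    /* keep the sub-tick remainder for the next wakeup */
    last_wakeup_ms = now_ms - (now_ms - last_wakeup_ms) % TICK_MS;
}
```

Whether the thread wakes every 10 ms or every 100 ms, the counter advances by the same total; only the granularity at which ``jiffies'' is observed to change differs.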
The real time itself can be obtained more accurately from the microkernel by mapping it in the server's address space, and updating the Linux ``xtime'' global variable before sensitive uses.
The Linux page table management code could be discarded if Linux did not reference the page tables so widely. To minimize changes to the original Linux code, we chose to provide a machine-independent dumb emulation of the page table macros and routines. The Linux server does not make any real use of these page tables and they are mostly empty, but the emulation allows more Linux code to compile and run unchanged.
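The flavour of this dumb emulation is illustrated below (the macro names mirror Linux's page table interface, but the bodies are our own stand-ins):

```c
/* Stand-ins for the Linux page table macros: queries behave
 * consistently and updates are accepted, but nothing here reaches
 * real hardware, since the microkernel owns the real page tables. */
#include <assert.h>

typedef struct { unsigned long pte; } pte_t;   /* kept for type-safety */

#define pte_none(p)    ((p).pte == 0)
#define pte_present(p) ((p).pte != 0)
#define pte_clear(pp)  ((pp)->pte = 0)

void set_pte(pte_t *pp, pte_t v) { *pp = v; }
```

Code written against these macros compiles and runs unchanged; the entries it manipulates simply have no effect on the hardware mappings.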
Mapping a file is done by creating a memory object associated with the file and establishing the mapping with the ``vm_map'' Mach interface. Allocating zero-filled memory (for the ``brk'' system call for example) is done with the ``vm_allocate'' Mach interface. When removing a mapping, we use the ``vm_deallocate'' Mach interface.
This simple emulation has minimal impact on the original Linux code and covers the vast majority of the Linux VM operations. Unfortunately, we had to rewrite some Linux code to make it more Mach-friendly. For example, the ``brk'' system call shrinks a VM area by removing the old mapping and establishing a smaller one. This works in Linux because the page tables are not touched so the old memory is still there when the new mapping comes in place. Our emulation code discards the memory when removing the old mapping and cannot resuscitate it when establishing the new mapping. We just re-arranged the Linux code to avoid the ``remove and replace'' trick.
This is unnecessarily inefficient and restrictive, and we would like to get rid of this implementation in a future release. If the original Linux code did not use the mem_map array explicitly but hid it under macros or in-line routines, it would have given us freedom to implement whatever page allocation algorithm suits our architecture. This problem is an illustration of the advantages of modularity, which allows wider choices of implementation by reducing the interdependencies between system components.
The inode pager is currently a single thread running in the Linux server task. It manages the relation between a Mach memory object and a Linux inode, and replies to microkernel paging requests.
When a page-in request comes in, the inode pager reads the required data from the disk and sends it back. For a page-out request, the microkernel sends the page inside the page-out message and the inode pager writes it back to disk. Of course, the microkernel only sends back dirty pages and silently discards clean ones, so the inode pager never has to write back text pages for example.
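The request handling reduces to two operations (a toy model with an in-memory ``disk''; the real pager speaks the Mach external memory management protocol):

```c
/* Toy model of the inode pager's request loop: page-ins read from
 * backing store; page-outs, which the kernel sends only for dirty
 * pages, are written back. Clean pages never reach the pager. */
#include <assert.h>
#include <string.h>

#define PAGE   64
#define NPAGES 4

char backing_store[NPAGES][PAGE];   /* stand-in for the file on disk */

/* Page-in request analogue: fetch the page's data. */
void handle_pagein(unsigned pg, char *dst)
{
    memcpy(dst, backing_store[pg], PAGE);
}

/* Page-out analogue: the page arrived in the message; write it back. */
void handle_pageout(unsigned pg, const char *src)
{
    memcpy(backing_store[pg], src, PAGE);
}
```

Because the kernel discards clean pages silently, read-only data such as program text never causes page-out traffic to the pager.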
The inode pager is also responsible for flushing a memory object from the microkernel cache when needed, for example when a mapped binary is recompiled.
The challenging part of a dynamic buffer cache for a Mach-based OS is that the buffer cache (in the Linux server) and the VM (in the microkernel) need to interact to let the system make the best use of the available memory.
Letting the buffer cache grow is the easy part: the Linux server manages only virtual memory and can therefore provide the buffer cache with more pages than there are in the physical memory. The tricky part is to get the buffer cache to shrink when the system is short on memory and before it starts paging.
OSF extended the external memory manager to offer ``advisory page out''. That is, instead of un-mapping a page and sending it to the memory manager, the microkernel can now leave the page in place, send a ``discard request'' to the memory manager and let it take any appropriate action. The Linux server can then use the ``try_to_free_page'' routine and free a page other than the one selected by the microkernel. Of course, the microkernel cannot be made to rely on an external memory manager to eventually free a page. If the memory manager does not free a page in time, the microkernel will send the data to the default pager, a privileged and trusted memory manager.
On the Linux server side, the only major change required is to allocate the buffer cache pages from a separate memory object, the size of the physical memory, and backed by another external memory manager.
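The policy enabled by the discard request can be sketched as follows (a hypothetical simplification; the real try_to_free_page walks the buffer cache's replacement lists):

```c
/* Model of the advisory page-out dance: on a discard request the
 * server may free a *different* buffer-cache page than the one the
 * kernel nominated, preferring clean buffers that need no I/O. */
#include <assert.h>

#define NBUFS 4

int buf_in_use[NBUFS];
int buf_dirty[NBUFS];

/* Returns the index of the buffer actually freed. */
int handle_discard_request(int nominated)
{
    int i;
    for (i = 0; i < NBUFS; i++)
        if (buf_in_use[i] && !buf_dirty[i]) {
            buf_in_use[i] = 0;      /* clean: drop it, no I/O needed */
            return i;
        }
    buf_in_use[nominated] = 0;      /* no clean buffer: give up the
                                       nominated page itself */
    return nominated;
}
```

If the server takes too long to free anything, the microkernel simply falls back on the default pager, so a slow or faulty server cannot wedge the paging system.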
The major part of porting the Linux server was to adapt the necessary header files for the Mach server. This took approximately one week to do, and once this was done the server was able to start to boot on the PowerMac.
Being able to use the Linux server on PowerMac machines was not simply a question of porting the server: commands and libraries were also needed, together with a file-system from which Linux could boot. We added some code to the Mach kernel to recognize the disk label and partition tables on a Macintosh disk, and ported the mkfs tool from the Minix distribution to create an initial populated file-system. As for commands and libraries, we were able both to build them ourselves and also to recover commands and libraries from an early binary distribution of native Linux on the PowerPC. Below is a list of steps taken in porting the Linux Server to the PowerMac:
The benchmarks were run on a DEC PC450, with a 50MHz i486 and SCSI disks, running a Slackware3.0 ELF distribution. The benchmarks themselves were in a.out format and were run on a 1kB-block ext2 file-system.
We profiled the micro-kernel and Linux server during those benchmarks and performed quick optimizations in two areas:
Next, we started investigating disk I/O performance. The default file-system block size is 1 kilobyte on Linux. The Linux kernel is able to group disk requests into larger requests and does not suffer from the small block size. Neither the Linux server nor the micro-kernel currently performs such an optimization, and the penalty is made even worse by the extra overhead of the Mach device interfaces. The result is that we read only one block per disk revolution.
We plan to do more exhaustive performance measurements and analysis in the future and to extend our tests to multi-processor platforms.
What is left to port is:
We will address both the functionality and performance issues as soon as possible.
Because we have always taken care to minimize the changes to Linux code, it is fairly easy to upgrade to new Linux kernel releases. We will provide a Linux server based on the latest 1.3 kernel (or maybe 1.4) as soon as possible, but our current priority is to complete the ports to the PowerMac.
The Linux server is built using the same method and tools as the regular Linux kernel.
Being isolated from the hardware by the microkernel, the Linux server can share the machine with other operating system servers. In fact, it was developed as a regular OSF/1 process, started from a shell and debugged with a Mach-aware version of GDB (which can handle multi-threaded applications). This is a very powerful way to debug the system. There is no need to reboot the machine before each test and it provides full user mode debugging possibilities; it is possible to debug the Linux server with GDB from its very first instruction.
Although Linux does not support multi-threaded tasks (at least not in the way we would like it to), we were able to start a Linux server from another Linux server. More generally, one could run any desired set of system personalities in parallel on a single machine.
The microkernel can be debugged using the powerful (although afflicted with a weird syntax) kernel debugger on the Intel x86 platform, or using a remote GDB for the PowerMac.
They are both freely available from the OSF Open Software Mall (http://www.osf.org/mall).
The Lites developers partially ported their server to OSF MK in March 1995, but their server architecture did not allow them to take advantage of OSF MK's performance improvements. The Lites server may be ported to the latest OSF MK free release sometime in 1996.
We could have used their work as a basis for our free UNIX server, but, because we do not use emulation libraries and have rather different server architecture designs, we preferred to start from scratch. We also wanted to demonstrate that a non-BSD UNIX could be implemented on Mach. OSF/1 and BSD-4.4 have similar VM implementations, derived from Mach's VM, making it fairly straightforward to emulate with Mach interfaces.
The GNU HURD is not yet available and we wanted to offer a development environment to our members and the research community as early as possible, so we produced yet another single server.
We are now able to offer a completely free and unencumbered development environment based on the microkernel. We hope that the research community will find this environment attractive for their microkernel related projects.
Thanks to Philippe Bernadat for helping us in measuring and analyzing the system's performance. Philippe Bernadat also helped with microkernel enhancements for the Linux server support on the Intel x86.
We also want to acknowledge the work of the team who has been porting native Linux to the PowerPC and especially Joseph Brothers, Daniel Puertas and Gary Thomas.
 Paul Roy, David Black, Paulo Guedes, John Lo Verso, Durriya Netterwala, Faramarz Rabii, OSF RI, and Michael Barnett, Bradford Kemp, Michael Leibensperger, Chris Peak, Roman Zajcew, Locus Computing Corporation. ``An OSF/1 UNIX for Massively Parallel Multicomputers''. OSF RI Collected Papers Vol. 2, 1993.
 Bryan Ford, Jay Lepreau. ``Evolving Mach 3.0 to use Migrating Threads''. UUCS-93-022.
 Rozier, M. et al. ``CHORUS Distributed Operating Systems''. Computing Systems 1(4), December, 1988.
 Dan Swartzendruber. ``A preemptible MACH kernel''. OSF RI Collected Papers Vol. 3, 1994.
 Franco Travostino. ``MACH3 Locking Protocol''. OSF RI Collected Papers Vol. 2, 1993.
 Uresh Vahalia, ``UNIX Internals: The New Frontiers''. Prentice-Hall.
 Ed Burke, Michael Condict, David Mitchell, Franklin Reynolds, Peter Watkins, Bill Willcox. ``RPC Design for Real-Time MACH''. OSF RI Collected Papers Vol. 3, 1994.
 Franco Travostino and Franklin Reynolds. ``An O-O Communication Subsystem for Real-time Distributed Mach''. 1994 IEEE Proceedings of the Workshop on Object-Oriented Real-Time Dependable Systems (WORDS).
 N. C. Hutchinson and L. L. Peterson. ``The x-kernel: an Architecture for Implementing Network Protocols''. IEEE Trans. on Software Eng., vol. 17, no. 1, pp 64-76, Jan. 1991.
 Philippe Bernadat, Christian Bruel, James Loveluck, Eamonn McManus and Jose Rogado. ``A Performant Microkernel based OS for the HP PA-RISC''. OSF RI Collected Papers Vol. 4, 1995.
 David Black and Philippe Bernadat. ``Configurable Kernel Project Overview''. OSF RI Collected Papers Vol. 4, 1995.
 Bill Bryant, Steve Sears, David Black and Alan Langerman. ``An Introduction to Mach 3.0's XMM system''. OSF RI Collected Papers Vol. 2, 1994.
 Joseph Caradonna. ``The Event Trace Analysis Package Design Specifications''. OSF RI Collected Papers Vol. 4, 1995.
 Robert Haydt, Joseph Caradonna and Franklin Reynolds. ``Mach Scheduling Framework''. OSF RI Collected Papers Vol. 3, 1994.
 Keith Loepere et al. ``MK++ Kernel High Level Design''. 1995.
 Simon Patience. ``Redirecting System Calls in Mach3.0: An alternative to the emulator''. OSF RI Collected Papers Vol.1, 1993.
 David Golub, Randall Dean, Alessandro Forin and Richard Rashid. ``Unix as an Application Program''. Usenix Conference Proceedings, Summer 1990.
 Philippe Bernadat. ``Microkernel benchmarking techniques''. Slides. OSF RI Symposium `93.
The OSF RI Collected Papers are accessible on the WWW at http://www.osf.org/os/os.coll.papers/.
François Barbou des Places received the MSc. in Computer Science from the University of Paris XI, Orsay, France in 1987 and a Diplôme d'Etudes Approfondies from the University of Grenoble, France in 1989. He also graduated from the Ecole Nationale Supérieure d'Informatique et de Mathématiques Appliquées in Grenoble in 1989. He is currently a Research Engineer at the OSF Research Institute in Grenoble.
Nick Stephen received a first class honours BSc. in Computer Science from the University of Southampton, England in 1990. He is currently a Research Engineer at the OSF Research Institute in Grenoble, France. His research interests include microkernel operating systems and distributed systems.
Franklin Reynolds has been involved in the computer industry for over twenty years. For the last six years he has been employed at the Open Software Foundation's Research Institute in Cambridge, MA. His current research interests include real-time and highly available distributed systems and active networks.