Ariel Tamches and Barton P. Miller,
Using Dynamic Kernel Instrumentation for Kernel and Application Tuning,
International Journal of High Performance Computing Applications
13, 3 (Fall 1999).
Note: this paper contains several color pages. It should print acceptably on b/w printers.
We have designed a new technology, fine-grained dynamic instrumentation of commodity operating system kernels, which can insert runtime-generated code at almost any machine code instruction of an unmodified operating system kernel. This technology is well suited to kernel performance profiling, debugging, code coverage, runtime optimization, and extensibility. We have written a tool called KernInst that implements dynamic instrumentation on a stock production Solaris 2.5.1 kernel running on an UltraSPARC CPU, and we have built a kernel performance profiler on top of it. Measuring kernel performance has a two-way benefit: it can suggest optimizations both to the kernel and to applications that spend much of their time in kernel code. In this paper, we present our experiences using KernInst to identify kernel bottlenecks when running a web proxy server. By profiling kernel routines, we were able to understand performance bottlenecks inherent in the proxy's disk cache organization. We used this understanding to make two changes, one to the kernel and one to the application, that cumulatively reduce the percentage of elapsed time that the proxy spends opening disk cache files for writing from 40% to 7%.