Article
Lightweight Application Instrumentation with PCP
Wait... what? I was involved in diagnosing a production system performance problem: a web application serving thousands of interactive users was acting up. Symptoms included significant time running kernel code on behalf of the application (unexpectedly), and at those times substantial delays were observed by end users. As someone with a systems programming background, I figured I had a decent shot at figuring this one out. Naively I reached for strace(1), the system call and signal tracer, to provide insights...