In the paper titled "Evaluating the Accuracy of Java Profilers" (PLDI '10), the four authors, Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney, show how four different, popular profilers for Java programs completely disagree with each other regarding the performance profile of the programs they analyze. These profilers are xprof, hprof, jprofile, and yourkit, and I am rather certain that many other profilers suffer from the exact same problem.
The authors show that the profilers provide surprisingly inconsistent answers, and then -- in the beautifully written Section 6 of the paper -- explain exactly why this happens. Consider, for example, the following simple program (Listing 1 from the paper):
static int[] array = new int[1024];
public static void hot(int i) {
    int ii = (i + 10 * 100) % array.length;
    int jj = (ii + i / 33) % array.length;
    if (ii < 0) ii = -ii;
    if (jj < 0) jj = -jj;
    array[ii] = array[jj] + 1;
}
public static void cold() {
    for (int i = 0; i < Integer.MAX_VALUE; i++)
        hot(i);
}
Clearly, this program spends a small amount of time in method cold proper, and most of its time in hot; the method that profilers should point at is hot. Yet xprof, for example, attributes 99.8% of the runtime to method cold. Other profilers provide similar answers.
How come? As it turns out, these profilers (and presumably others) sample the program not in a uniform and randomized manner, but rather by relying on yield points in the compiled code. In this particular example, method hot has no yield points, so it is practically a blind spot to the profilers -- a stealth method, if you will! (This probably indicates that the problem is largely limited to profilers in managed environments, such as profilers for Java or C# programs.)
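To make the bias concrete, here is a toy simulation I put together; it is not code from the paper and it does not touch a real JVM. It assumes a made-up timeline in which every loop iteration spends 9 time units inside hot and 1 unit in cold's own loop overhead, and that the only yield point sits at the loop back-edge in cold. The class name, the method names, and the 9:1 split are all hypothetical. A sampler that records whatever runs at a random instant sees hot about 90% of the time, while a sampler that defers each sample to the next yield point never sees hot at all.

```java
import java.util.Random;

public class YieldPointBias {
    // Assumed cost model (hypothetical numbers, not measurements from the paper).
    static final int HOT_UNITS = 9;   // time units per iteration spent inside hot
    static final int COLD_UNITS = 1;  // time units per iteration spent in cold's loop overhead
    static final int PERIOD = HOT_UNITS + COLD_UNITS;

    /** The method executing at time t: within each iteration, hot runs first, then cold. */
    public static String executingAt(long t) {
        return (t % PERIOD) < HOT_UNITS ? "hot" : "cold";
    }

    /**
     * A yield-point sampler: the sample requested at time t is deferred until the next
     * yield point, assumed here to be the last time unit of the iteration (in cold).
     */
    public static String yieldPointSample(long t) {
        long nextYield = (t / PERIOD) * PERIOD + (PERIOD - 1);
        return executingAt(nextYield);
    }

    /** Fraction of samples attributed to hot under the chosen sampling strategy. */
    public static double hotShare(boolean yieldPointsOnly, int samples, long seed) {
        Random rng = new Random(seed);
        int hot = 0;
        for (int s = 0; s < samples; s++) {
            long t = rng.nextInt(1_000_000); // random instant at which a sample is requested
            String method = yieldPointsOnly ? yieldPointSample(t) : executingAt(t);
            if (method.equals("hot")) hot++;
        }
        return (double) hot / samples;
    }

    public static void main(String[] args) {
        System.out.printf("random sampling:      hot = %.1f%%%n",
                100 * hotShare(false, 100_000, 42L));
        System.out.printf("yield-point sampling: hot = %.1f%%%n",
                100 * hotShare(true, 100_000, 42L));
    }
}
```

The point of the toy is only the shape of the error: the sample is requested at the right moment, but it is recorded somewhere else, so all of hot's time gets billed to cold.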
Yet if they all suffer from the same problem, why do different profilers still disagree with each other (while all being wrong)? Because of the observer effect, something Heisenberg could have told you about ages ago. Basically, the profiler itself affects the JVM and the runtime environment (memory usage, paging, cache state), and each profiler does so differently, so each one perturbs the very program it attempts to measure in its own way.
In Section 7, the authors describe a proof-of-concept profiler they have developed, and show how it is significantly more accurate than the other profilers examined. Let's hope the techniques presented in this work will quickly find their way to real-world profilers.