You'd expect code compiled with the gcc option -O3 to be faster than -O2 code, right? After all, the compiler spends more time optimizing your code, and these are well-known optimizations that have been studied in depth. So, would it surprise you if the code ended up being, say, 7% slower?
You test the code a day later, and suddenly, the very same code is 10% faster with -O3 as compared to -O2. What's going on here?
Let's see. Perhaps you've added or removed some totally unrelated environment variable?
Sounds odd, right? But this is exactly the behavior documented by the authors of "Producing Wrong Data Without Doing Anything Obviously Wrong!" (ASPLOS '09). They explain that a change in the size of the shell's environment can shift the starting address of the program's stack, and with it the alignment of the program's data in memory, causing noticeable differences in performance. They also show, although it is less surprising, that changes in the link order of the binary can have a measurable effect on performance.
The paper's claim is obviously not "play around with the environment size to find the performance sweet-spot". It is that measurement bias, a phenomenon well-known in experimental sciences, also exists in computer science experiments. They show that it exists regardless of the CPU, the benchmark used, or the compiler employed. Measurement bias in computer science is real, significant, commonplace, and unpredictable.
And thus, when developers try to measure the effect of an optimization, or the effectiveness of a new algorithm (when compared to an existing one), they should never be content with measurements -- even multiple measurements -- in a single setting. Change configurations; change platforms; play around with anything you can; randomize your setup; see if the performance difference that you suggest is real, or an artifact of the coincidental configuration you were using.
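To make the "randomize your setup" advice a bit more concrete, here is a minimal sketch in Python of one way to time the same command under several environment sizes. It is not the paper's methodology, only an illustration: the variable name BENCH_PADDING, the helper names, and the placeholder workload are all made up for this example, and a serious experiment would also repeat each run and vary link order, machines, and so on.

```python
import os
import random
import statistics
import subprocess
import sys
import time


def time_command(cmd, env_padding):
    """Run cmd once with env_padding extra bytes in the environment.

    Returns the wall-clock time in seconds.
    """
    env = dict(os.environ)
    # BENCH_PADDING is a hypothetical dummy variable; only its size matters.
    env["BENCH_PADDING"] = "x" * env_padding
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def measure(cmd, paddings, seed=0):
    """Time cmd once per padding size, visiting the sizes in shuffled order."""
    order = list(paddings)
    random.Random(seed).shuffle(order)
    return {pad: time_command(cmd, pad) for pad in order}


def summarize(timings):
    """Reduce a {padding: seconds} mapping to its min, median, and max."""
    values = list(timings.values())
    return {
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Placeholder workload; substitute the -O2 and -O3 binaries to compare.
    cmd = [sys.executable, "-c", "sum(range(10**6))"]
    results = measure(cmd, range(0, 4096, 256))
    print(summarize(results))
```

If the spread between the minimum and the maximum across environment sizes is comparable to the speedup you are about to claim, that speedup may well be an artifact of the setup rather than of the optimization.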
Will it work? Will the authors' "call for action" (section 8 in their paper) yield any results? I must admit I'm pessimistic. I think the only chance for this to have any effect is if conference reviewers start demanding that experiment reports in papers include details about the measures taken to overcome measurement bias.
Until this happens, I cannot help but wonder how many times skillful craftsmen have devised great optimizations and applied them to their code, only to find that the code ended up being slower. The programmers shrug, not seeing what went wrong, and revert to the original, "more efficient", implementation. If only they had deleted one environment variable...
Saturday, July 11, 2009