First contact with open .NET
Completed on 13-Feb-2016 (47 days) -- Updated on 19-Nov-2016
Unexpectedly, I have spent most of my time here trying to come up with the best approach to measure the performance differences between both methods (i.e., the old and new versions of ParseNumber). The fact that the original version was already optimised is the main reason why this in-principle-easy part proved so difficult. Additionally, the optimisation process wasn't exactly straightforward (e.g., I found some of the Irrelevant issues quite surprising), which prevented me from forming clear enough ideas about the expected measurements (i.e., being sure about the exact effects of a certain modification would have been helpful to quickly spot problems with the tests).
Although the basic structure of the main program didn't change appreciably after the start, the whole testing approach (i.e., the C# program, its inputs and the way in which the time differences were measured) went through many relevant changes. Roughly speaking, it passed through the following stages:
Firstly, I relied on CoreRun.exe because I assumed that this was the best way to emulate realistic conditions. As already explained, this assumption was quickly proven wrong: CoreRun.exe has an important negative (and inconsistent) impact on the performance of the given .NET executable. Nevertheless, CoreRun.exe is used for the last validation stage, where the modified ParseNumber version is tested under as-realistic-as-possible conditions; this is done with ParseNumber_Validation.exe, which iterates through the ParseNumber methods of various types.
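The spirit of such a validation loop can be sketched as follows. This is a minimal illustration with hypothetical iteration counts and input literals, not the actual code of ParseNumber_Validation.exe:

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical sketch: time the parsing methods of several types back to back,
// similar in spirit to iterating through the ParseNumber call sites of each type.
class ValidationSketch
{
    static void Main()
    {
        const int iterations = 100000;
        var sw = new Stopwatch();

        // Int32 parsing.
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            int value = int.Parse("123456", CultureInfo.InvariantCulture);
        }
        sw.Stop();
        Console.WriteLine("Int32.Parse:   " + sw.ElapsedMilliseconds + " ms");

        // Double parsing.
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            double value = double.Parse("123.456", CultureInfo.InvariantCulture);
        }
        sw.Stop();
        Console.WriteLine("Double.Parse:  " + sw.ElapsedMilliseconds + " ms");

        // Decimal parsing.
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            decimal value = decimal.Parse("123.456", CultureInfo.InvariantCulture);
        }
        sw.Stop();
        Console.WriteLine("Decimal.Parse: " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Running the same loop shape over each type keeps the per-type measurements comparable.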
For my first tests without CoreRun.exe, I relied on two different executables (i.e., new.exe and old.exe). But back then the gap between both approaches wasn't too relevant, and running two different programs represented an unacceptable increase in uncertainty. That's why I didn't pursue this option for long.
Even before moving to the both-in-the-same-file approach, I was aware of the associated increase in uncertainty (i.e., running one method affects the time measurements of the other one). After trying different ways to minimise this influence (e.g., setting different pauses at different points, pre-warming or forcing GC), I confirmed that the most reliable methodology was alternating the order in which the versions were measured (note that I also tested a random-order approach, which proved less reliable).
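That alternating-order methodology can be sketched as follows. OldParse and NewParse are placeholders standing in for the two implementations under comparison; the iteration and round counts are likewise assumptions for illustration:

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

class AlternatingOrderSketch
{
    // Placeholders for the two versions being compared; both delegate to the
    // framework parser here purely so the sketch runs.
    static double OldParse(string s) => double.Parse(s, CultureInfo.InvariantCulture);
    static double NewParse(string s) => double.Parse(s, CultureInfo.InvariantCulture);

    static long Time(Func<string, double> parse, string input, int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) parse(input);
        sw.Stop();
        return sw.ElapsedTicks;
    }

    static void Main()
    {
        const int iterations = 100000;
        const string input = "123.456";

        // Pre-warm both methods so JIT compilation doesn't distort the first round.
        OldParse(input);
        NewParse(input);

        long oldTotal = 0, newTotal = 0;
        for (int round = 0; round < 10; round++)
        {
            // Reduce GC interference before each round.
            GC.Collect();
            GC.WaitForPendingFinalizers();

            // Alternate which version is measured first, so neither one
            // systematically benefits from running second.
            if (round % 2 == 0)
            {
                oldTotal += Time(OldParse, input, iterations);
                newTotal += Time(NewParse, input, iterations);
            }
            else
            {
                newTotal += Time(NewParse, input, iterations);
                oldTotal += Time(OldParse, input, iterations);
            }
        }

        Console.WriteLine($"old: {oldTotal} ticks, new: {newTotal} ticks");
    }
}
```

Summing over rounds with both orderings lets any first-runner advantage cancel out in the totals.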
Then I moved back to two different executables; I also tried to make the testing algorithm as efficient as possible, to confirm whether these effects (i.e., smaller pauses between consecutive calls) had a relevant impact on the observed performance differences. The resulting application was the first version of ParseNumber_Test2.exe, the definitive testing program. There were some relevant changes after this, but all of them are listed in the corresponding section.
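The "as efficient as possible" idea above can be sketched minimally: raw timestamps taken back to back with no pauses in between, so the measured interval contains little besides the consecutive calls themselves. The input and iteration count are assumptions for illustration:

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical sketch: no sleeps or bookkeeping between consecutive calls;
// just two Stopwatch timestamps around a tight loop.
class TightLoopSketch
{
    static void Main()
    {
        const int iterations = 100000;
        const string input = "123.456";
        double sink = 0; // accumulate results so the loop can't be optimised away

        long start = Stopwatch.GetTimestamp();
        for (int i = 0; i < iterations; i++)
        {
            sink += double.Parse(input, CultureInfo.InvariantCulture);
        }
        long elapsed = Stopwatch.GetTimestamp() - start;

        Console.WriteLine($"{elapsed} ticks for {iterations} consecutive calls (sink={sink})");
    }
}
```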