First contact with open .NET

Completed on 13-Feb-2016 (47 days) -- Updated on 19-Nov-2016


As explained in the first section, I formed quite a good impression of the overall quality of the open .NET code (i.e., not just CoreCLR, but also CoreFX and Roslyn): very clear structure, efficient algorithms, descriptive comments, etc. The parts analysed here (i.e., various methods in the file Number.cs) are certainly no exception.

This project focuses on the optimisation of ParseNumber and its associated resources. Its code is written almost exclusively in unmanaged C#, although C++ is also used in Number.cs, even in parts which are closely related to ParseNumber. The following three methods will be considered:
  • Boolean ParseNumber(ref char* str, NumberStyles options, StringBuilder sb, Boolean parseDecimal): it analyses the inputs to the given parsing method, like Decimal.Parse (i.e., the string to be converted into a number and the arguments taking care of additional issues, like culture-related format); it then outputs the numerical characters which will later be transformed into the given type (i.e., decimal) by NumberBufferToDecimal. All this code is unmanaged and highly optimised (e.g., extensive use of bitwise operations). Most of the performance-improving modifications of this project were made to this method.
  • char* MatchChars(char* p, char* str): it is called from ParseNumber while looping through the characters of the input string, and returns the position after the given target (str). A second overload (char* MatchChars(char* p, string str)) allows the target to be a string as well.
  • Boolean IsWhite(char ch): very simple method checking whether the given character is blank. Incidentally, and unlike the two aforementioned methods, this is managed code.
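The division of work among these three methods can be illustrated with a minimal managed sketch. It is not the CoreCLR code: it uses string indices instead of char* pointers (so that it compiles without the unsafe switch), returns -1 rather than a null pointer when there is no match, and the class ParseSketch, the helper SkipBlanksAndSign and the exact set of characters treated as blank are assumptions made only for illustration.

```csharp
using System;

// Hypothetical, simplified analogue of the helpers described above.
public static class ParseSketch
{
    // Index-based stand-in for MatchChars: returns the index just past
    // 'target' when 'input' contains it at 'pos', or -1 when it does not.
    public static int MatchChars(string input, int pos, string target)
    {
        if (target.Length == 0 || pos < 0 || pos + target.Length > input.Length)
            return -1;

        for (int i = 0; i < target.Length; i++)
        {
            if (input[pos + i] != target[i]) return -1;
        }
        return pos + target.Length;
    }

    // Assumed definition of "blank": the space character or any character
    // in the range tab (0x09) to carriage return (0x0D).
    public static bool IsWhite(char ch)
    {
        return ch == ' ' || (ch >= '\t' && ch <= '\r');
    }

    // Hypothetical helper showing how a parsing loop might combine both:
    // skip leading blanks, then try to match an optional negative sign.
    // Returns the index at which the digits are expected to start.
    public static int SkipBlanksAndSign(string input, string negativeSign, out bool negative)
    {
        int pos = 0;
        while (pos < input.Length && IsWhite(input[pos])) pos++;

        int next = MatchChars(input, pos, negativeSign);
        negative = next != -1;
        return negative ? next : pos;
    }
}
```

For example, ParseSketch.SkipBlanksAndSign("  -7", "-", out neg) skips the two leading blanks, matches the sign and returns 3 with neg set to true. Passing the sign as a string (rather than a single character) mirrors the fact that culture-related symbols may span more than one character.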
Ideally, this algorithm should be optimised by completely rethinking how the problem is approached. Specifically, the impractical and inefficient way in which the input information is stored causes unnecessary problems, such as over-complicated, less efficient algorithms or overuse of resources (e.g., most of the potential inputs aren't analysed). Nevertheless, such a redesign is beyond the scope of the current project and its final goal (i.e., a pull request to CoreCLR, which would be much less likely to be accepted if it included modifications to existing features). The decimal.TryParseExact method which I am planning to create in the near future (likely as part of Project 9) will certainly take care of these issues and will also account for various functionalities not supported by the current parsing approaches.