Unit parsing
The code that handles unit parsing is one of the most important parts of a unit-parsing library. In fact, and contrary to what it might seem, all the UnitParser functionalities pursue the same goal: setting up a consistent environment in which its unit-parsing expectations can be fulfilled.

UnitParser can easily extract valuable unit information from a wide variety of raw data (i.e., string variables with different unit-related contents). Its main features on this front are the following:
  • Supporting multiple representations for each individual unit. Examples: exact symbols (case matters), full names (case doesn't matter), commonly-used abbreviations (case doesn't matter), etc.
  • Gracefully dealing with different types of unit prefixes. Example: the gram-kilogram relationship isn't managed via conversion, but via the presence/absence of the SI prefix kilo.
  • Recognising all the individual units (and, where applicable, their prefixes) which form a multiple-unit compound. Example: new UnitP("km/s") is understood as being formed by two units, metre (with prefix kilo) divided by second.
  • Uniquely defining each single unit by focusing on its constituent elements. Example: newton is always understood as new UnitP("kg*m/s2").
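
The first two features can be illustrated with a minimal, self-contained sketch. It is not UnitParser's actual code: the class UnitLookupSketch, its Identify method and the tiny lookup tables are all hypothetical and only assume the rules stated above (symbols are case-sensitive, names and abbreviations are not, and a prefix is stored separately rather than converted).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; none of these names belong to UnitParser.
public static class UnitLookupSketch
{
    // Exact symbols: case matters ("N" is newton, "n" is not).
    static readonly Dictionary<string, string> Symbols =
        new Dictionary<string, string>(StringComparer.Ordinal)
        {
            { "m", "metre" }, { "g", "gram" }, { "s", "second" }, { "N", "newton" }
        };

    // Full names and common abbreviations: case doesn't matter.
    static readonly Dictionary<string, string> Names =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "metre", "metre" }, { "meter", "metre" }, { "gram", "gram" },
            { "second", "second" }, { "sec", "second" }, { "newton", "newton" }
        };

    // A few SI prefix symbols and their base-ten exponents.
    static readonly Dictionary<string, int> Prefixes =
        new Dictionary<string, int>(StringComparer.Ordinal)
        {
            { "k", 3 }, { "M", 6 }, { "m", -3 }
        };

    // Returns "unit" or "unit (prefix 10^n)", or null when nothing matches.
    public static string Identify(string input)
    {
        string text = input.Trim();
        string unit;
        if (Symbols.TryGetValue(text, out unit)) return unit;
        if (Names.TryGetValue(text, out unit)) return unit;

        // Otherwise, try to split a one-character prefix from a known symbol.
        if (text.Length > 1)
        {
            int exponent;
            if (Prefixes.TryGetValue(text.Substring(0, 1), out exponent) &&
                Symbols.TryGetValue(text.Substring(1), out unit))
            {
                return unit + " (prefix 10^" + exponent + ")";
            }
        }
        return null;
    }
}
```

Under these assumptions, Identify("kg") reports gram with the kilo prefix instead of treating kilogram as a separate unit, Identify("METRE") matches regardless of case, and Identify("n") fails because symbols are compared exactly.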

Delivering the aforementioned functionalities isn't easy because of the huge number of possible scenarios, even before considering further issues like auto-conversions (discussed in the next section). Nevertheless, the parsing code can be divided into two main parts:
  • Individual unit parsing. This code is quite simple and is mostly stored in the file Parse_IndividualUnits.cs.
    Note that, in this context, individual units are those which exactly match a supported named unit. Example: new UnitP("N") (newton) is treated as an individual unit, but variations like new UnitP("N2") are not. This definition doesn't agree with the individual/compound distinction which is used in the remaining parts of the code. The difference is due to practical reasons (i.e., immediate recognition vs. further analysis), is intuitively understandable and doesn't affect the true essence of the units (e.g., new UnitP("N") is parsed as an individual unit, but recognised as a compound defined by kg*m/s2).
  • Compound parsing. Its main code is stored in the files contained in the folder /Parse/Compounds/.
    This code is much more complex than the aforementioned one for individual units. The increase in complexity isn't linear because of the numerous additional issues which have to be taken into account, namely:
    • Simplifications and further modifications (e.g., compensation of prefixes) to ensure a proper recognition.
      For example: new UnitP("kg*m/s2"), new UnitP("1000 g*m/s2"), new UnitP("kg*Mm/ks2"), etc. have to be treated identically (i.e., 1 newton); even cases like new UnitP("g*m/s2") have to be understood as scaled-down versions of the main unit (i.e., 1 millinewton).
    • Actions required to deliver a consistent performance for both dividable and non-dividable units.
      Consider the energy-divided-by-time scenario (i.e., power): new UnitP("J/s") follows the metric rules (all the units can be divided into the same basic constituents), but new UnitP("BTU/s") doesn't (BTU cannot be divided into simpler units).
    • Actions required to address all the eventualities arising from the policy of accepting any input.
      For example, trying to understand what the user actually intended by ignoring some irrelevant mistakes.
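
The prefix compensation described above can be sketched as follows. This is only an illustration under stated assumptions, not the code stored in /Parse/Compounds/: the class CompoundSketch and its Reduce method are hypothetical, only the symbols m, g and s and three prefixes are supported, numeric values (e.g., the 1000 in "1000 g*m/s2") and non-dividable units like BTU are left out, and each part is assumed to be a numerator or a denominator depending on the operator right before it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; none of these names belong to UnitParser.
public static class CompoundSketch
{
    static readonly HashSet<string> BaseSymbols = new HashSet<string> { "m", "g", "s" };

    static readonly Dictionary<char, int> Prefixes = new Dictionary<char, int>
    {
        { 'k', 3 }, { 'M', 6 }, { 'm', -3 }
    };

    // Reduces e.g. "kg*Mm/ks2" to a canonical form like "10^3 g^1 m^1 s^-2".
    public static string Reduce(string compound)
    {
        var exponents = new SortedDictionary<string, int>(StringComparer.Ordinal);
        int baseTen = 0; // all the prefixes folded into one base-ten exponent
        int sign = 1;    // +1 for numerator parts, -1 right after a '/'
        int start = 0;

        for (int i = 0; i <= compound.Length; i++)
        {
            bool atEnd = i == compound.Length;
            if (!atEnd && compound[i] != '*' && compound[i] != '/') continue;

            string part = compound.Substring(start, i - start).Trim();

            // Trailing digits are the exponent of the part ("s2" is s squared).
            int split = part.Length;
            while (split > 0 && char.IsDigit(part[split - 1])) split--;
            int power = split < part.Length ? int.Parse(part.Substring(split)) : 1;
            string symbol = part.Substring(0, split);

            // An exact symbol wins ("m" is metre); otherwise try prefix + symbol.
            int prefix = 0;
            if (!BaseSymbols.Contains(symbol))
            {
                if (symbol.Length > 1 && Prefixes.ContainsKey(symbol[0]) &&
                    BaseSymbols.Contains(symbol.Substring(1)))
                {
                    prefix = Prefixes[symbol[0]];
                    symbol = symbol.Substring(1);
                }
                else throw new ArgumentException("Unknown unit: " + part);
            }

            // The prefix is raised together with its unit: ks2 = 10^6 s^2.
            baseTen += sign * power * prefix;
            int current;
            exponents.TryGetValue(symbol, out current);
            exponents[symbol] = current + sign * power;

            sign = (!atEnd && compound[i] == '/') ? -1 : 1;
            start = i + 1;
        }

        var output = new List<string> { "10^" + baseTen };
        foreach (var pair in exponents)
            if (pair.Value != 0) output.Add(pair.Key + "^" + pair.Value);
        return string.Join(" ", output);
    }
}
```

With this sketch, Reduce("kg*m/s2") and Reduce("kg*Mm/ks2") both give "10^3 g^1 m^1 s^-2" (i.e., the newton signature), whereas Reduce("g*m/s2") gives "10^0 g^1 m^1 s^-2", which is the same set of units scaled down by 10^3 (i.e., 1 millinewton).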