NO NEW PROJECTS:
Project 10 is expected to be the last formal project of varocarbas.com. I will continue using this site as my main self-promotional, R&D-focused online resource, but relying on other, more adequate formats such as domain ranking.
Note that the latest versions of all the successfully completed projects (5 to 10) will always remain available.
PROJECT 9
Critical thoughts about big data analysis
Completed on 02-Jul-2016 (57 days)


A growing number of businesses are realising the numerous benefits of properly understanding their clients' data. The internet, with the huge amount of valuable information associated with it, has also played an important role in the wide adoption of data-analysis techniques. In fact, so much interest, combined with the availability of free and seemingly easy-to-use resources, has led to situations where data models are heavily misused. This section deals precisely with one of the main consequences of that reality.

By data hoarding, I refer to an attitude widespread among online businesses: collecting as much information as possible without properly analysing it. Such (mis)conduct has been encouraged, to some extent, by the emergence of numerous big-data tools, which are commonly misunderstood as easy ways for people from any background to intuitively reach worthwhile conclusions.

The aforementioned misconception has a negative impact on several fronts:
  • The underlying assumptions about data analysis (i.e., that it is easy, that anyone can do it, that generally-applicable and absolute answers can be expected, etc.) encourage negligent behaviours and prevent the information from being properly exploited. The most likely consequence is the allocation of disproportionately restricted resources; for example: inexperienced analysts, overly tight budget/time constraints or unrealistic expectations.
  • The more information there is, the more difficult it becomes to create a reliable model. In fact, notable increases in the amount of information being accounted for usually provoke increases in noise (even beyond acceptable levels), which make building a reliable model very difficult.
    Thus, more data is only better if the following two conditions are met: the quality of the additional information is high enough (or, at least, the increase in noise is kept under control); and the given model is properly adapted (i.e., tuned, extended or even rebuilt) to account for all the additional information. Logically, this level of care is incompatible with the essentially careless behaviours described above; the sketch after this list illustrates the noise problem.
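
A minimal sketch of the second point above, in Python with synthetic data and ordinary least squares (the sample sizes, noise levels and column counts are illustrative assumptions, not figures taken from any project on this site): purely noisy columns are appended while the model stays unadapted, and the held-out error is measured.

  import numpy as np

  rng = np.random.default_rng(0)

  n = 200
  x = rng.uniform(-1.0, 1.0, size=(n, 1))            # one genuinely informative feature
  y = 3.0 * x[:, 0] + rng.normal(scale=0.1, size=n)  # target: signal plus mild noise

  def holdout_rmse(features, target):
      # Fit ordinary least squares on the first half, score on the second half.
      half = len(target) // 2
      coef, *_ = np.linalg.lstsq(features[:half], target[:half], rcond=None)
      residuals = features[half:] @ coef - target[half:]
      return float(np.sqrt(np.mean(residuals ** 2)))

  print("signal only:", holdout_rmse(x, y))

  # "More data" of low quality: append pure-noise columns without adapting the model.
  for extra in (5, 20, 80):
      noisy = np.hstack([x, rng.normal(size=(n, extra))])
      print(f"signal + {extra} noise columns:", holdout_rmse(noisy, y))

On a typical run, the held-out error grows with each batch of noise columns: the unadapted model fits the noise in the training half and pays for it in the other half, which is precisely the careless scenario described above.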