NO NEW PROJECTS:
Project 10 is expected to be the last formal project of varocarbas.com. I will continue using this site as my main self-promotional, R&D-focused online resource, but will rely on other, more suitable formats such as domain ranking.
Note that the latest versions of all the successfully completed projects (5 to 10) will always remain available.
PROJECT 9
Critical thoughts about big data analysis
Completed on 02-Jul-2016 (57 days)


Numerical models are basically a way to extend human understanding to situations initially beyond our reach (e.g., sources of information that are too big or too complex, or automated decision-making). That is why all data models, regardless of their configuration, pursue the same goal: reliably understanding a certain reality, ideally as well as a person would. Despite being intrinsically identical in this respect, not all data-understanding problems can be approached in the same way, an idea which underlies this whole project.

Roughly speaking, any modelling process can be divided into the following constituent parts (a minimal code sketch after the list illustrates them):
  • Training data. The past, meaningful information the model uses to draw its predictions. The human-understanding equivalence is straightforward: all the information a person takes into account to understand a situation and decide accordingly.
  • Model itself. The set of algorithms in charge of learning (i.e., adequately understanding all the training information) and predicting (i.e., outputting the most likely results for the given inputs). This part emulates the human learning, understanding and deciding capabilities.
  • Resulting predictions. The conclusions delivered by the model for a certain set of inputs. In some cases, the default model predictions might be corrected or complemented in order to ensure the highest accuracy, for example by relying on an auto-learning subsystem. This part emulates the final outputs of a person's understanding process (e.g., a decision, a guess or a supposition), understood as a complex reality which might also involve interactions with other individuals.
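
To make these three parts more tangible, below is a minimal sketch in Python of a deliberately simple modelling pipeline. The toy numbers, the least-squares fit and the clipping correction are hypothetical placeholders chosen only for illustration; they are not taken from this project.

  import numpy as np

  # 1) Training data: the past, meaningful information (hypothetical toy sample).
  X_train = np.array([[1.0], [2.0], [3.0], [4.0]])  # inputs
  y_train = np.array([2.1, 3.9, 6.2, 7.8])          # known outcomes

  # 2) Model itself: the algorithm in charge of learning from the training data.
  #    Here, an ordinary least-squares fit of y = a*x + b.
  A = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
  coeffs, *_ = np.linalg.lstsq(A, y_train, rcond=None)

  def predict(x):
      # Output the most likely result for the given input.
      return coeffs[0] * x + coeffs[1]

  # 3) Resulting predictions, optionally corrected/complemented afterwards
  #    (e.g., by an auto-learning subsystem); here, a trivial non-negativity rule.
  def corrected_prediction(x, lower=0.0):
      return max(predict(x), lower)

  print(corrected_prediction(5.0))  # prediction for an unseen input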
Logically, the input conditions and the expectations have a major impact on the development of a model. Continuing with the human-understanding analogy, not everyone can analyse every situation under every set of conditions; compare, for example, someone's abstract impressions with the detailed answers of an expert in the given field. The effect of the quality of the training information is also quite evident (e.g., the worse the information, the more insightful the person has to be to understand it adequately). The size of the training information, on the other hand, might seem somewhat secondary on this front, but it certainly is not.
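
As a purely illustrative aside (not part of the original analysis), the following sketch uses synthetic data to show both effects at once: the same kind of least-squares model is fitted to training samples of decreasing quality (more noise) and decreasing size, and its error against the known ground truth grows accordingly. All names and numbers here are hypothetical.

  import numpy as np

  rng = np.random.default_rng(0)

  def fit_and_error(n_samples, noise_std):
      # Fit y = 2*x + 1 from noisy samples; return the mean absolute error
      # of the fitted line against the noise-free ground truth.
      x = rng.uniform(0.0, 10.0, n_samples)
      y = 2.0 * x + 1.0 + rng.normal(0.0, noise_std, n_samples)
      A = np.column_stack([x, np.ones(n_samples)])
      coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
      x_test = np.linspace(0.0, 10.0, 50)
      return np.mean(np.abs((coeffs[0] * x_test + coeffs[1]) - (2.0 * x_test + 1.0)))

  # Worse quality (more noise) and smaller size both degrade the model.
  print(fit_and_error(n_samples=200, noise_std=0.5))  # large, fairly clean sample
  print(fit_and_error(n_samples=200, noise_std=5.0))  # same size, much noisier
  print(fit_and_error(n_samples=10, noise_std=5.0))   # noisier and far smaller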

As explained in the corresponding section, most of my numerical modelling expertise is focused on small, high-quality training datasets. One of the goals of this project is to share my impressions about the transition from such a background to big-data conditions. A second goal is to critically analyse those aspects of big data analysis which might be better approached differently.