© 2015-2017 Alvaro Carballo Garcia

NO NEW PROJECTS:
Project 10 is expected to be the last formal project of varocarbas.com. I will continue using this site as my main self-promotional, R&D-focused online resource, but relying on other, more adequate formats like domain ranking.
Note that the latest versions of all the successfully completed projects (5 to 10) will always be available.
Critical thoughts about big data analysis
Completed on 02-Jul-2016 (57 days)


Many business opportunities have started to grow around (big) data analysis, a reality which has attracted quite a few people with limited skills on this front. They usually trust blindly in getting immediate benefits with little effort and hold disproportionate or plainly delusional expectations. Such attitudes tend to sit close to funding and decision-making spheres, which turns their unreliable opinions into actually influential trends.

Big data challenges are an excellent place to get an accurate idea of these attitudes; in fact, the appendix of this project includes a detailed description of my participation in one of them. Roughly speaking, complex (meaning interesting in this context) problems are proposed to a heterogeneous group of skilled, competitive and motivated online data modellers. That is why, regardless of any other factor, these contests definitely provide a good reference for what some big-data-concerned (but not necessarily knowledgeable) companies consider difficult, relevant and even the future.

After having participated in a few of these challenges and analysed a relevant number of additional ones, my ideas about the typical challenge-proposer expectations (at least, those of the unknowledgeable sub-type) are very clear. I see two main problems here:
  • Uninformative and easily manipulable assessment methodologies. In most of these contests, the goals and the way in which solutions are assessed tend to be very simplistic and adapted to specific methodologies (i.e., the problem is expected to be solved in a certain way and is defined with this fact in mind). Some people might argue that this is required on account of everything that a competition entails. In my opinion, this issue is exclusively provoked by not having properly analysed the problem and the goal; a new manifestation of the quick-easy-results and deciding-without-knowing attitudes which underlie this whole critique. Most of these challenges expect very specific answers to highly restricted problems, but rarely deliver what would be ideal: good insights into the wider set of problems of which the given challenge should be a merely descriptive sample.
  • Plainly useless goals. I have seen quite a few cases where the pursued goal was plainly useless for the proposer. Example: creating a model to recognise to which road, out of 5, a certain stretch belongs. This is a clearly overfitting-prone problem whose conclusions will never have general applicability (i.e., being able to recognise any road from a given stretch).
In summary, wrongly applied data analysis/maths can prove virtually anything, which is the same as proving nothing. In addition to building a proper model, the right questions have to be asked and the delivered outputs have to be adequately understood.
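
The "can prove virtually anything" point can be illustrated with a minimal Python sketch. It is not part of the original project: the function names, the sample sizes and the seed are all assumptions chosen for illustration. The code searches a large set of purely random features for the one that best correlates with a purely random target, and then checks that same feature against fresh (holdout) data.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spurious_search(n_samples=20, n_features=1000, seed=0):
    """Pick the random feature that best 'explains' a random target.

    Hypothetical helper for illustration only. Returns a tuple with the
    absolute correlation of the winning feature on the data used to select
    it and its absolute correlation on independent holdout data.
    """
    rng = random.Random(seed)

    def noise():
        # Pure Gaussian noise: no feature carries any real information.
        return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    target_train, target_holdout = noise(), noise()
    best_in_sample, best_holdout = 0.0, 0.0
    for _ in range(n_features):
        feat_train, feat_holdout = noise(), noise()
        r_train = abs(pearson(feat_train, target_train))
        if r_train > best_in_sample:
            best_in_sample = r_train
            best_holdout = abs(pearson(feat_holdout, target_holdout))
    return best_in_sample, best_holdout


if __name__ == "__main__":
    in_sample, holdout = spurious_search()
    print(f"best in-sample |r|: {in_sample:.2f}")
    print(f"same feature on holdout |r|: {holdout:.2f}")
```

With few samples and many candidate features, the best in-sample correlation is expected to look strong even though no real relationship exists, whereas the same feature evaluated on fresh data will typically show a much weaker one. Asking the wrong question (which of these features correlates best?) produces an answer regardless of whether there is anything to find.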