Critical thoughts about big data analysis

Completed on 02-Jul-2016 (57 days)


This project is the main outcome of my recent attempt to understand how my numerical modelling expertise differs from the conditions of big data work. I took part in various open challenges, although I only spent a significant amount of time and effort on the one described below.

This appendix includes my impressions of taking part in Kaggle's Expedia Hotel Recommendations challenge. Most of the information associated with this challenge isn't public, which is why I can only share certain parts (e.g., the data description).

Note that, from the very first moment, I treated this challenge as the ideal benchmark for understanding the aforementioned differences, and also for eventually building a reliable set of applications, or even a whole procedure, to help me face future big data problems quickly and efficiently. Focusing on the test dataset (i.e., making many submissions) seemed the best way to accomplish that goal. This explains my numerous submissions in this challenge, which shouldn't occur under normal conditions.