Eugene Yan · 20.06.2015

DataScience SG Meetup - How we got top 3% in Kaggle

[ machinelearning ] · 1 min read

One Saturday afternoon, I volunteered to talk about my recent effort in Kaggle’s Otto competition, where I placed 85th out of 3,514 together with my fellow competitor Weimin.

Given that it was a lazy Saturday afternoon, I did not expect the lecture room at SMU to be fully packed. The data science meetup scene in Singapore was more vibrant than I had thought.

In approximately 45 minutes, we shared our thinking and had an in-depth discussion with the audience on the topics below:

  • The evaluation metric (multi-class log loss)
  • Validation approaches
  • Feature engineering and selection
  • Feature transformation (e.g., standardization, log-transformation, tf-idf)
  • Creating aggregate and t-sne features
  • Machine learning techniques (trees and neural nets)
  • Ensembling techniques
  • Top solutions and architectures
  • A suggested framework for Kaggle competitions

More details can be found in the slides below.
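
For readers unfamiliar with the evaluation metric listed above, here is a minimal sketch of multi-class log loss in plain Python. It is not the code from the talk: the function name `multiclass_log_loss`, the clipping constant `eps`, and the toy data are all assumptions made for illustration. The idea is that each prediction is scored by the negative log of the probability it assigned to the true class, averaged over all samples, so confident wrong predictions are penalized heavily.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Average negative log-probability assigned to each sample's true class.

    y_true: list of integer class indices (hypothetical toy labels).
    probs:  list of rows, one predicted probability per class.
    eps:    assumed clipping constant that keeps log() away from zero.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip each probability into [eps, 1 - eps] so log(0) cannot occur.
        clipped = [min(max(p, eps), 1 - eps) for p in row]
        # Renormalize the row so the clipped probabilities sum to 1.
        row_sum = sum(clipped)
        total += math.log(clipped[label] / row_sum)
    return -total / len(y_true)


# Hypothetical toy example: 3 samples, 3 classes.
y_true = [0, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
loss = multiclass_log_loss(y_true, probs)

# Baseline: predicting every class with equal probability.
uniform = [[1 / 3] * 3 for _ in y_true]
uniform_loss = multiclass_log_loss(y_true, uniform)
```

A uniform prediction over K classes always scores log(K), which makes a handy sanity-check baseline: a model that scores worse than this is doing worse than guessing every class equally.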

    Questions? Want to follow my journey? Reach out on Twitter @eugeneyan!

    If you found this useful, please cite this write-up as:

    Yan, Ziyou. (Jun 2015). DataScience SG Meetup - How we got top 3% in Kaggle. eugeneyan.com. https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/.

    or

    @article{yan2015kaggle,
      title   = {DataScience SG Meetup - How we got top 3% in Kaggle},
      author  = {Yan, Ziyou},
      journal = {eugeneyan.com},
      year    = {2015},
      month   = {Jun},
      url     = {https://eugeneyan.com/speaking/dssg-kaggle-top-3-percent-talk/}
    }


