Predicting The Shinkansen Experience - Hackathon
Suraj Mathew Thomas
Innovative Product Management - Enterprise Data & Governance - Data Analytics Practitioner - Driving Growth & Digital Transformation
It's been nearly 4 years since I participated in a hackathon. I still remember the good old IT days when you scrambled around the floor modifying your build and checking results, with the excitement and adrenaline rush fuelled by never-ending cups of coffee and munchies. Competition has always got the better of most Indians because that's how we are conditioned from our early days. Not much has changed, and IT companies knew that hackathon weeks, the ones everyone looked forward to, were the way to get the best out of their employees. Today hackathons are no longer synonymous with IT companies alone. Companies across industries are trying to become powerhouses in all their functions, to create a unique selling point for their products and services and to be more client-centric. The world has changed, and hackathons now go beyond building technical modules to refining processes and machine learning models.
This weekend I was fortunate to get an opportunity to participate in a model build as part of a hackathon conducted by Great Learning and the McCombs School of Business at the University of Texas at Austin. This was my first virtual hackathon and the feeling was entirely different. The excitement was there, but a hackathon conducted on the floor in a physical setting is a totally different animal. I guess times have changed, triggered by the crisis that prevails in the world today, which calls for us to adapt and do things a new way.
The problem given was to predict the overall travel experience of commuters who travelled by the Japanese Shinkansen. Japan is famous for its Shinkansen (otherwise called bullet trains), which are known for their convenience, high speed and punctuality. Four datasets were provided: train and test sets for the travel details and for the survey results. The target variable indicated whether a commuter was delighted or disappointed with the Shinkansen experience.
The datasets given were pretty messy, with missing data that required logical imputation, and the tables had to be merged; pretty much all of the usual pre-processing steps had to be done before feeding the data into a model. A range of models, including Decision Trees, Random Forest, Bagging Classifier, Gradient Boosting Classifier, AdaBoost Classifier, Logistic Regression and Support Vector Machine, was built to check the accuracy of each model when the test data was fed in.
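The pre-processing and model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actual hackathon code: the data is synthetic, the column names (`ID`, `Delay_Minutes`, `Seat_Comfort`, `Overall_Experience` and so on) are hypothetical, median imputation stands in for the logical imputation mentioned above, and only three of the listed model families are compared, using cross-validated accuracy from scikit-learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400

# Hypothetical stand-ins for the travel and survey tables (synthetic data).
travel = pd.DataFrame({
    "ID": np.arange(n),
    "Travel_Distance": rng.normal(2000, 500, n),
    "Delay_Minutes": rng.exponential(15, n),
})
survey = pd.DataFrame({
    "ID": np.arange(n),
    "Seat_Comfort": rng.integers(1, 6, n).astype(float),
    "Onboard_Service": rng.integers(1, 6, n).astype(float),
})

# Synthetic target: 1 = delighted, 0 = disappointed.
score = survey["Seat_Comfort"] + survey["Onboard_Service"] - travel["Delay_Minutes"] / 10
survey["Overall_Experience"] = (score + rng.normal(0, 1, n) > 4.5).astype(int)

# Knock out some values to mimic the messy, incomplete data.
travel.loc[rng.choice(n, 40, replace=False), "Delay_Minutes"] = np.nan
survey.loc[rng.choice(n, 40, replace=False), "Seat_Comfort"] = np.nan

# Merge the two tables on the shared passenger ID.
data = travel.merge(survey, on="ID", how="inner")
X = data.drop(columns=["ID", "Overall_Experience"])
y = data["Overall_Experience"]

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Impute and scale inside the pipeline so each CV fold is processed independently.
results = {}
for name, model in models.items():
    pipe = make_pipeline(SimpleImputer(strategy="median"), StandardScaler(), model)
    results[name] = cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()

for name, acc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Keeping the imputer inside the pipeline means the medians are learned only from each training fold, which avoids leaking information from the held-out fold into the accuracy estimate.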
Accuracy is the only metric on which this hackathon is judged, which is not practical in the real world. But hey, this is a game and the rules are about hitting the highest accuracy. So who am I to bend them?
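One reason accuracy alone can mislead in practice is class imbalance. The small sketch below uses made-up labels (90 delighted, 10 disappointed, which is an assumption for illustration and not the actual distribution of the hackathon data) to show that a model which always predicts "delighted" scores well on accuracy while catching none of the disappointed commuters.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of actual `positive` cases that were predicted as `positive`."""
    actual = sum(t == positive for t in y_true)
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return hits / actual if actual else 0.0


# Made-up labels: 1 = delighted, 0 = disappointed.
y_true = [1] * 90 + [0] * 10

# A trivial "model" that predicts delighted for everyone.
always_delighted = [1] * 100

print(accuracy(y_true, always_delighted))            # 0.9
print(recall(y_true, always_delighted, positive=0))  # 0.0
```

In a real deployment, metrics such as recall, precision or F1 on the disappointed class would usually be tracked alongside accuracy.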
This is my current ranking as of 27th Feb 2022, 8:30 am IST.
The leaderboard is one thing that helps you keep improving your model, both against yourself and against your competitors. This is a culture that needs to be woven into every data science professional's way of working. There are another 13.5 hours to go before the shutters come down on this hackathon and the winner is adjudged.
Win or lose, this has been a superb learning experience and has brought back memories of what hackathons are like. My appreciation goes out to the McCombs School of Business, Great Learning, and its mentors and professors. Thank you for the opportunity and the enriching learning experience.
#hackofalltrades #greatlearning #utaustin #mccombs #greatlakes
Final standing at the end of it. Not bad, I suppose, considering that hyperparameter tuning takes a long time to run, depending on the computing resources you have. It took 87 submissions to improve the accuracy by 0.0016.