Kaggle and Machine Learning
I first came across Kaggle about two years ago, while searching for datasets. I liked it right away, and my admiration for it has only grown since. Lately I find myself spending more and more time there.
I am sure most folks in IT have heard of Kaggle, but for the benefit of those who have not, here is a quick overview:
Kaggle is a platform that hosts open datasets and machine learning competitions.
- The datasets are contributed by companies, government entities and individuals. Anyone can explore them using languages such as Python and R. Kaggle also has its own browser-based analytics tool, called Kaggle Kernels, to facilitate data exploration. It is very similar to the Jupyter Notebook that most Python and R programmers already know. Alternatively, you can download the datasets that interest you and explore them with your own tools. I generally download the dataset I want to explore and do my analysis using Python and Jupyter Notebook.
- The machine learning competitions are sponsored by various companies and by Kaggle itself. The competing teams share their Kernels (notebooks) for others to review, learn from and comment on. This makes the competitions a valuable learning tool for everyone, especially beginners in machine learning. Since most of them are based on real-world problems, the knowledge gained by participating carries over to practical work. Most of these competitions also offer good prizes, but they are worth a try for the learning alone. The prize money is just gravy.
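For readers who have not tried the download-and-explore route mentioned above, here is a minimal sketch of what a first pass might look like in plain Python. The column names (`year`, `category`, `count`) and the sample rows are made up for illustration and do not come from any real Kaggle dataset; with a real download you would open the CSV file instead of the in-memory sample.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for a downloaded CSV file. With a real dataset you
# would use open("your_dataset.csv", newline="", encoding="utf-8") instead.
SAMPLE_CSV = """year,category,count
1998,A,3
1999,A,5
2001,B,2
2001,A,4
"""


def totals_per_year(csv_file):
    """Sum the (hypothetical) 'count' column for each value of 'year'."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[int(row["year"])] += int(row["count"])
    # Return a plain dict ordered by year for easy reading.
    return dict(sorted(totals.items()))


totals = totals_per_year(io.StringIO(SAMPLE_CSV))
print(totals)  # {1998: 3, 1999: 5, 2001: 6}
```

In a notebook you would more likely reach for a dataframe library, but the idea is the same: load the file, group by a column, and look at the totals before deciding what to analyze in depth.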
That is the quick overview. If you haven't checked out Kaggle yet, I strongly recommend that you do.
A couple of months back, I downloaded a dataset called 'Global Terrorism Database', analyzed it and uploaded my analysis to Kaggle. You are welcome to review it and comment. Here is the link: Global Terrorism Analysis before and after Y2K.
I have just joined a Kaggle competition called Spooky Author Identification. I am not expecting to win any prizes, but I do expect to learn a lot about natural language processing (NLP) and Python's Natural Language Toolkit (NLTK). I will post an update in a few weeks!