Short-term courses on data science topics...
Data science is now a buzzword with more hype than sense. Anyone who does anything with data becomes a data scientist, just as any website gets described as being in the cloud. Other relatives of this buzzword are big data, analytics, etc., and there are second-level relatives like text analytics, recommender systems, predictive analytics, and the like. Most big data courses are about Hadoop, with a bit of MapReduce. Occasionally there is a bit of R, which, by the way, is not a bad idea. Often the "science" and the "analytics" of the data take second or third place, behind these popular, jargon-laden tools.
Data science has several aspects. In terms of stages, there are acquisition, storage, processing, etc. Orthogonal to these is the structure of the data (text, images, video, numbers, records, etc.) and its semantics, which determine the storage standards and the basic access primitives. The volume of the data is a major consideration, but not the only one. The other 'v's, like velocity and veracity, are also important in devising efficient algorithms for acquisition, storage and processing. When it comes to processing, there is an ocean ahead. The purpose of processing is a key driver: prediction, modeling, trend analysis, outlier analysis, and clustering are examples. Each offers a multitude of algorithms, depending on the many factors mentioned above. When the data is not adequately clean or complete, yet another aspect comes into play: data preprocessing, which includes data transformation as well.
This is what the space of data science looks like, and it should not be reduced to Hadoop, R, or any of the few popular tools. Tools are just tools, a distinction that is often lost in our obsession with them. It is quite like a fresh CS graduate saying that "Java" or ".NET" is their favourite topic in computer science!
CDAC Mumbai is putting together a few courses in the broad space of data science. They cover little islands in the space outlined earlier. Connect these islands to the broad space, and you can use them effectively.
The courses cover R (an excellent open-source tool with great support for data representation, statistical processing, and visualisation), predictive analytics (one of the popular areas of interest in the 'processing' segment), and text analytics (a high-potential area, but one with many hurdles). We will use mostly open-source tools in the courses, so that you can go back and practice with them on your own. For most requirements, many of these tools are as good as (and more extensible than!) their commercial counterparts.
Please check the site kbcs.in/datascience for more details, registration information, etc. There will be more courses coming up later, looking at some of the other areas.
Senior Backend Engineer | Retail & Connected Cars Platform Developer
8 years ago: Sir, what is the difficulty level of the course, and will it cover significant ground in such a short time? (Please answer specifically for the predictive analytics course.) Also, can you please name the other two courses (as mentioned in the blog) so that I can finalize my registration?
Founder & Lead Developer at academicum.ai | Building AI-Powered Tools to Enhance Academic Research and Learning
8 years ago: I don't know about this CDAC Mumbai course, but I'm quite happy with the Coursera Data Science Specialization, as it includes R Programming, Cleaning Data, Exploratory Data Analysis, Statistical Inference, Regression Models, and Machine Learning, closing with a capstone project on developing a data science product.
Founder and CEO at Sarvaha Systems | Trusted Software Development Partner
8 years ago: Great idea, M Sasikumar! How would these compare to similar courses from Coursera (especially Andrew Ng and Manning)?