Machine Learning: The future of Mind Mimicking Math Machines and the questions you should be asking!

Authors: Judith Li, Stanford graduate, Data Scientist at SAP Innovation Center Network and aspiring PhD, and Sandra Moerch, Hawaii Pacific University graduate, Global EdTech Innovation Manager at SAP and aspiring PhD. Commentary: Gabriel Maher, PhD student at Stanford.

The Era of Artificial Intelligence is here, and it is based on a cocktail of data science, deep learning, statistics, data mining, machine learning and data visualization. Essentially, all of these components translate into BIG math. This introductory article provides a conceptual overview of Machine and Deep Learning based on the recent Stanford ICME Machine Learning Workshop. ICME stands for the Institute for Computational Math and Engineering, a world-renowned leader in data science that conducts groundbreaking research in BIG math.

We want to kick this article off by giving you a bit of machine learning context and resources on how to get actively involved. As you have probably noticed, this article is written by two women: Judith Li, Stanford ICME graduate, Data Scientist at SAP Innovation Center Network and aspiring PhD, and Sandra Moerch, Hawaii Pacific University MBA, Global EdTech Innovation Manager at SAP Next-Gen Labs and aspiring PhD. Besides being passionate about technology, we are also very keen on engaging more women in this field, which is why we are driving the annual WiDS (Women in Data Science) conference and movement out of Stanford every spring on behalf of SAP. Next year we will be supporting the event from more than 25 locations worldwide, broadcasting some of the most influential technical female leaders in data science and technology, live from Stanford. Learn more about WiDS and how to get involved here.

“Data science is about extracting relevant information from a large stream of data, and ultimately for driving informed decisions” - Margot Gerritsen, Director of Stanford’s ICME

Machine Learning is not a novel thing; in fact, it has been taught for decades, but its potential is only now being fully explored and exercised. In 1950, Alan Turing proposed the Turing Test, arguably the first Machine Learning-esque exercise. Machine Learning is a subset of artificial intelligence in which computer algorithms autonomously learn from data and information. In machine learning, computers do not have to be explicitly programmed but can change and improve their algorithms by themselves. Since these early stages, Machine Learning has undergone a linear development of its concepts; in recent years, however, its impact has grown exponentially along with artificial intelligence. We therefore believe that Machine Learning will be the intelligent driver of the exponential enterprise moving forward. It is one of the tools in the toolbox, likely the screwdriver, that will help us dive deep into infinite pools of information.

One of the contributors to this exponential development is image recognition. Many of you have probably heard of Dr. Fei-Fei Li, a machine learning icon and fierce female data scientist. Besides having millions of people follow her TED Talk and thought leadership, she is the Director of the Stanford Artificial Intelligence Lab and the Stanford Vision Lab, where she works with brilliant students and colleagues worldwide to build smart algorithms that enable computers and robots to see and think, as well as to conduct cognitive and neuroimaging experiments to discover how brains see and think. Fei-Fei will also speak at WiDS in 2017, as will SAP’s very own Dr. Tanja Rueckert, EVP LoB Digital Assets and IoT.

“First we teach machines to see, then machines help us to see better. This is my quest” - Fei-Fei Li, Director of the Stanford Artificial Intelligence Lab and the Stanford Vision Lab

The marriage between big data and machine learning will grow the intelligence of machines and, along with it, offer the opportunity to better the world. Just imagine a world where doctors have tireless vision through machines, and cars drive autonomously, seamlessly navigating through traffic. Machine learning will also enable us to explore new frontiers through space travel and other galactic activities.

We are truly living in a legendary time

Machine learning has enabled the mining of big enterprise data to create more business value. The world’s most relevant enterprise data is part of SAP’s system and its business network, so there are tremendous opportunities in mining big enterprise data. Databases and data platforms like SAP HANA, Hadoop and Spark have enabled us to unify and process enterprise data much more efficiently. But what is next? Recall the excitement you felt when you upgraded your Nokia phone to the tiny supercomputer you carry around in your pocket today. What made the difference is that you could suddenly enjoy all kinds of apps that simplify your life. Just as the smartphone formed an infrastructure for apps, the database now forms the infrastructure on which more and more intelligent business applications will be built, and Machine Learning is the core of what makes these applications “smart”. That is why SAP and its Innovation Center Network are developing machine learning use cases to make SAP business applications highly intelligent.

The 2016 ICME Summer Workshop Series provides introductory data science workshops on statistical data analysis, machine learning and data visualization. Given the vast resources on data science, these workshops focus on clarifying basic concepts and building intuition. After all, machine learning or deep learning is not a magic black box; it is just math. And machine learning models are known to be prone to bugs. Hence, whenever you see a machine learning model, it is crucial to ask how the model was built and to think critically about the results.

Suppose someone presents you with a machine learning model. Without digging into the mathematical equations, here are the questions that will help you better understand their model and act like a machine learning pro!

  • Supervised or Unsupervised? The goal of supervised learning is to learn, from the training data, a mapping from the input to a label that generalizes to new, unseen data. For example, credit card fraud detection requires building a classification model that predicts whether a new transaction (input) is fraudulent (label) based on labeled historical transactions as training data. The goal of unsupervised learning, by contrast, is to discover interesting patterns or properties of the data, or to generate features to feed into a supervised model. For example, clustering (e.g., K-means) can be used to group similar customers who share common purchase preferences, and dimension reduction (e.g., PCA, ICA) can be used to create topics (features) from news articles (inputs).


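To make this distinction concrete, here is a minimal sketch of our own (not code from the workshop), using scikit-learn on synthetic data to treat the same dataset both ways:

```python
# Illustrative sketch: the same dataset through supervised and
# unsupervised lenses, using scikit-learn on synthetic blob data.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic data: 300 points drawn from 3 well-separated groups.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: learn a mapping from inputs X to the known labels y.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: no labels given; discover the grouping structure instead.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster sizes:", sorted((km.labels_ == k).sum() for k in range(3)))
```

The supervised model needs the labels to learn; the K-means clustering never sees them, yet recovers the grouping structure on its own.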
  • How is the data wrangled? Ask the modeler how they handle missing data and outliers. Certain machine learning models, such as trees, handle missing data better than others. Linear regression and K-means are sensitive to outliers, so consider removing outliers before training. Linear regression also performs poorly when highly correlated features with little effect on prediction accuracy are included in training. PCA is sensitive to the scale of the features, so make sure the modelers have centered (and, where features have different units, standardized) the data before applying PCA.
  • How do they handle overfitting? Overfitting happens when you see good performance on previously seen (training) data but poor performance on new data. Ask the modeler whether cross-validation or a held-out split is used to estimate generalization error. A common protocol splits the data into a training set, a validation set and a test set, where the training data is used to learn the model, the validation data for parameter tuning or model selection, and the test data for measuring performance on unseen data. Overfitting can be spotted when the training error decreases while the test (generalization) error increases. Make sure that the data used for tuning model parameters is not also used for testing; otherwise the test error will be underestimated. Machine learning models often include regularization, e.g., Ridge Regression, or impose sparsity, e.g., the Lasso, as a means to reduce the complexity of the model and improve generalization.


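As a hedged illustration of this protocol (our own sketch, not the workshop's code), the following uses scikit-learn to hold out a test set, tune a tree depth on a validation set, and expose overfitting by comparing training and test accuracy:

```python
# Train / validation / test protocol, with a deliberately overfit
# decision tree on noisy synthetic data (scikit-learn).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, flip_y=0.2,
                           random_state=0)

# First carve off a test set that is never touched during tuning.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# Then split the remainder into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0)

# Tune a hyperparameter (tree depth) on the validation set only.
best_depth, best_score = None, -1.0
for depth in (1, 2, 4, 8, None):
    m = DecisionTreeClassifier(max_depth=depth, random_state=0)
    m.fit(X_train, y_train)
    score = m.score(X_val, y_val)
    if score > best_score:
        best_depth, best_score = depth, score

# An unconstrained tree memorizes the training set: the gap between
# training and test accuracy is the signature of overfitting.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train:", deep.score(X_train, y_train),
      "test:", deep.score(X_test, y_test))
```

Because the labels carry 20% noise, the unconstrained tree scores perfectly on the training data but noticeably worse on the held-out test set.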
  • Have they considered ensemble methods? Ensemble methods are perhaps the most popular machine learning techniques today. By combining the predictions of many base models, ensemble methods improve the generalizability and robustness of a single model. For example, a single decision tree is known to be unstable and prone to overfitting, but a Random Forest is more robust: it builds an ensemble of tree predictors and averages the individual predictions of each tree, a technique often referred to as bagging. Another ensemble technique is boosting, which combines a number of weak classifiers into a better classifier by letting the weak classifiers vote. For example, the boosting-based XGBoost has been the winning model in several Kaggle competitions.
  • Accuracy vs Interpretability? If your business question is “What will the sales be next week?”, what the customer cares about is the accuracy of the prediction, and ensemble methods would be your best bet. However, if your customers demand an explanation, e.g., “What factors determine the sales?”, then you need a machine learning model that is interpretable. Linear regression, despite its simplicity, is fairly easy to interpret. Ensemble methods are often more accurate, but they generally cannot provide interpretations easily. So bear in mind the trade-offs when you select a machine learning model. One of the key messages from this workshop is the “no free lunch” theorem: no machine learning algorithm performs well on every task, but each algorithm performs well on some tasks, depending on your objective.
  • Have they considered using Deep Learning? Deep Learning models are a class of advanced machine learning models. “Deep” indicates that there are many parameters in the model, commonly on the order of 10^6 to 10^10, hence regularization is often used in deep learning models to avoid overfitting. When deep learning models work, they tend to work very well, but they are very expensive to train. When you do not have much data or high-performance computers, it is a good idea to use APIs or pre-trained deep learning models instead of building your own. Convolutional neural networks (see CS231n for more) are particularly useful for computer vision problems where the data has a spatial structure; example applications include autonomous driving, where images or videos need to be analyzed to identify events. Recurrent neural networks are used mostly for sequential data, e.g., natural language processing, time series data, machine translation, etc. Another very hot area of deep learning is reinforcement learning; AlphaGo, developed by Google DeepMind, is an example. AlphaGo consists of two convolutional neural networks that predict the best move to take and the probability of winning, and it learns by playing games of Go against itself. This machine-learning-based application actually beat the world champion of Go, Lee Sedol himself, despite the game being perceived as one of the most advanced board games in human history. A great resource for deep learning is the deep learning course on Udacity; see more resources in the reference section below.


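To get a feel for why parameter counts explode in deep models, here is a toy calculation; the layer sizes are illustrative choices of ours, not figures from the article. Each fully connected layer contributes fan_in × fan_out weights plus fan_out biases:

```python
# Parameter count for a small fully connected network.
# Layer sizes are arbitrary illustrative choices (MNIST-scale inputs).
layer_sizes = [784, 512, 512, 10]

params = sum(n_in * n_out + n_out  # weights plus biases per layer
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
print(f"parameters: {params:,}")  # prints "parameters: 669,706"
```

Even this tiny three-layer network has roughly 670,000 parameters; modern deep networks stack many more and wider layers, which is exactly why regularization and large datasets matter.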
  • Machine learning vs Deep learning: how complex is the problem? Particularly for people on the project management side of things, something to consider with deep learning vs. regular machine learning is complexity. Machine learning projects typically follow a lifecycle of Idea → Building → Testing → Deployment, and the more complex your machine learning algorithm, the longer each stage is likely to take. So it pays off to investigate whether a simpler method can solve your problem. Of course, some applications are truly complicated and can only be solved well with deep learning; even then, it is important to be aware that complexity has to be managed carefully.

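The bagging idea from the ensemble question above can be sketched as follows (our own illustration with scikit-learn on synthetic data), comparing a single decision tree against a random forest under cross-validation:

```python
# Bagging in action: a random forest averages many trees and is
# typically more stable than any single tree on noisy data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, flip_y=0.1,
                           random_state=1)

tree = DecisionTreeClassifier(random_state=1)
forest = RandomForestClassifier(n_estimators=200, random_state=1)

# 5-fold cross-validated accuracy for each model.
tree_acc = cross_val_score(tree, X, y, cv=5).mean()
forest_acc = cross_val_score(forest, X, y, cv=5).mean()
print(f"single tree: {tree_acc:.3f}  random forest: {forest_acc:.3f}")
```

The forest's averaging smooths out the variance of individual trees, which is why it usually scores higher than any one of them on held-out folds.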
Today, some of the best examples of applied Machine Learning come from Amazon, one of the pioneers of machine-learning-based recommendation engines and price discrimination algorithms. Additionally, we see many examples in connected health care, smart cities, chatbots, autonomy, credit card fraud detection, spam detection, real-time ads on web pages and mobile devices and, as mentioned above, predicting consumer behavior. All of these mechanisms are bound to grow smarter and more sophisticated through repeated testing and validation, and there is really no telling how intelligent machinery and technology will become. All we can predict, based on our observations and the development so far, is that this technology is moving along an exponential curve, and that we are getting closer and closer to major breakthroughs that will expand the application of machine learning in ways we cannot yet imagine.

As machines get smarter and more intelligent, experts expect an impact on the job market. According to this article by McKinsey, machines could automate 45 percent of the activities people are paid to perform.

Coursera co-founder and Stanford CS professor Daphne Koller emphasizes the importance of lifelong learning: learning should not stop when we finish high school or college. MOOC (Massive Open Online Course) platforms have made learning more accessible, and machine learning is one of the most highly sought-after skills. In fact, 1 in every 6 learners on Coursera has enrolled in a data science course. openSAP is SAP’s own MOOC platform, offering on-demand technology training and other critical IT knowledge.

To echo the message of this article: exponential technologies such as Machine Learning are definitely here to stay, and the resources for becoming more familiar with the topic are abundant. Machine learning moves fast, with new techniques coming out almost daily. Being successful in machine learning hence also requires a commitment to staying up to date. Here are some suggestions for how you can do that:

  • Follow the various machine learning conferences (NIPS, ICML, ICLR)
  • Follow machine learning journals (many publications are freely available on Arxiv)
  • Follow the top researchers and companies in the field
  • Follow machine learning newsletters like DataScienceWeekly and Data Elixir

You don’t necessarily need to read each publication thoroughly, but even just looking at the titles and abstracts will keep you up to date!

Besides these free online resources, we strongly encourage you to pursue coursework through the ICME at Stanford, whether that be a full degree program or the great workshops the institute hosts on campus and online. As industry partners through SAP, the value that academia provides through research, co-innovation and building up employee skills is immense, and we will continue to work closely with data science institutes worldwide to bring inspiring knowledge and thought leadership into our organization and the SAP ecosystem.


COMMENTARY from Gabriel Maher, Machine Learning Facilitator and PhD student at Stanford

This article, prepared by Sandra Moerch and Judith Li, captures many of the most important aspects of machine learning and points to resources for staying up to date. To touch on a few subjects in more detail and give a researcher’s perspective, Gabriel Maher, PhD student at the ICME and deep learning researcher, has provided some commentary.

“Initiatives such as this article help to disseminate valuable machine learning knowledge and connect people with resources that can help them get started or stay abreast of the field” - Gabriel Maher, Stanford ICME PhD student

These days it is great to see a lot of interest in machine learning coming from industry and companies such as SAP. Even greater though are initiatives such as this article which help to disseminate valuable machine learning knowledge and connect people with resources that can help them get started or stay abreast of the field. Machine learning moves fast, hence initiatives such as these should only be encouraged. On that note, it is important to realize that since the field of machine learning moves so fast, it takes a serious commitment to staying up to date to be successful in it. Here are a few ways one could keep up with the latest developments in machine learning:

  • Follow the big global machine learning conferences. Here all the top researchers in the field get together to present their latest research findings. Information about the latest and greatest techniques can be found in the conference publications which are typically accessible online. Some of the conferences to be aware of are NIPS, ICML and ICLR.
  • Follow the big machine learning journals. Most machine learning researchers also actively publish their results online with very quick turnaround times. Therefore the latest methods can also often be found in journal publications. A lot of researchers also publish on Arxiv, which makes their work accessible online for free.
  • Follow the big machine learning companies. For a more industrial perspective, many companies also actively maintain online publications of their machine learning research. For example, much of Google’s machine learning work is available online in the form of publications and can be found at https://research.google.com/

You do not necessarily need to read the publications in detail, but just browsing through the abstracts and titles goes a long way!

If you take a look at recent developments in machine learning, it seems as if all problems can be solved using deep learning! Indeed, for many applications such as computer vision and natural language processing, which were challenging for standard methods, deep learning has proved to be a very useful technique. From my experience in industrial projects, however, it is important to understand that deep learning is a great deal more complex than regular, non-deep machine learning such as trees or linear regression. Machine learning applications typically follow an iterative life cycle of exploration, development, testing and finally deployment, and added complexity can make each of these stages take longer. For example, debugging a deep learning application can take a lot longer than getting a linear regression to work. As such, it is important to manage this complexity carefully. Additionally, at the beginning stages of a project it pays off to investigate whether a simple method can solve the problem, as this can speed up development in later stages significantly. Some applications really do require sophisticated deep learning methods to work well, but even then it is important to be aware of the added complexity they bring. Indeed, when deep learning works, it typically works very well and can open up entirely new business possibilities.

The rapid development in machine learning speaks for the potential that researchers and companies believe it has. By working together and by disseminating results and information with initiatives such as this article, industry and academia can unlock this potential. As such I encourage and look forward to the continued collaboration between companies such as SAP, Universities such as Stanford and research institutions such as ICME.




Below you will find the resources from the ICME Machine Learning workshop, as well as great literature on how to get started. All resources are free and accessible online.

  • Lecture Slides
  • 25 min R tutorial
  • 10 min Eigenvectors and Eigenvalues Review
  • An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. The full PDF is freely available here. The datasets for this book can be found here, and others here.
  • The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. This comprehensive reference presents more material, and at a higher mathematical level, than the preceding text. The full pdf is freely available from the authors here.

Open Source Machine Learning community resources:

Some Deep Learning resources:

