Dispelling Some Common Myths on AI for the Enterprise

Over the last few years, deep learning based AI has become the hottest technology on the enterprise scene, and expectations of what it can deliver have grown quite unrealistic. Many industry leaders have concluded that AI is a magic bullet that will completely transform their existing businesses at the least, and disrupt the market at best. Here I will dispel some of the common myths surrounding AI and discuss what is practicable.

What is AI?

Before we jump into the main discussion, however, it is important to ascertain the scope of what is included under AI, since marketing literature frequently lumps data science and machine learning under “AI”. Data science is the utilization of an organization’s data to intelligibly present the current state of various business aspects, analyze correlations of factors and causes of events, predict future scenarios or unknown factors, and prescribe judicious courses of action for problem scenarios. To harness data science, most enterprises today have a big-data architecture to collect, process and utilize data. Data science uses a host of machine learning models whose main characteristic is learning from data, particularly input-output pairs. Deep learning is a kind of machine learning in which models are built out of layers of similar processing nodes; their superb modeling capability and scalability distinguish them as a class apart. These capabilities can be used to mimic some of the cognitive capabilities of humans, as well as behaviors that we have traditionally regarded as intelligent -- hence deep learning is considered a kind of AI technology. Given its pervasiveness and dominance, the focus of this article is deep learning based AI.

Myth 1: AI will Revolutionise Your Business

AI technologies add value to existing businesses through the following:

  1. Providing intelligent sensors: Every data-driven organisation crucially depends on being able to harvest diverse sources of data. Deep learning enables us to extract information from unstructured data sources like free-flowing text, speech, image and video data. A simplified way of understanding this is to think of AI as providing modules that transform unstructured data into entries in relational databases that can then be processed by traditional models.
  2. Providing cognitive computing: Deep learning models can be upgraded in complexity by scaling up the network size, so that their performance on classification and regression tasks improves continuously as more data is used to train them. They also simplify model building by automating feature extraction, i.e. the generation of derived measurements that can be used directly by the model, so that the feature extractor and the classifier/regressor merge into a single end-to-end trainable model. Last, but not least, a single model can incorporate inputs from diverse sources like text and images. All of this means that predictive modeling in the enterprise can be extended to areas that could not be addressed earlier.
  3. Providing intelligent interfaces: AI-driven interfaces can communicate with the user in natural language and employ intuitive methods of visualization. In particular, this means the scope of query and search in intelligent interfaces can be greatly expanded to include dialogue and recommendation, chatbots being a key example.
  4. Providing automation: Wherever the process for execution is clearly defined, AI can be used to automate it. Examples include transcription of spoken dialogues, automated information extraction from completed forms, and automated process monitoring and compliance.
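A toy illustration of point 1 above: an "intelligent sensor" turning unstructured text into a structured, relational-database-style record. A real system would use a trained NLP model; the regular expressions and the invoice field names here are only hypothetical stand-ins for illustration.

```python
# Hypothetical sketch: mapping a free-text invoice note onto structured fields.
# A production "intelligent sensor" would use a trained information-extraction
# model instead of hand-written regular expressions.
import re

def extract_invoice_record(text):
    """Extract amount, date and vendor fields from a free-text note."""
    amount = re.search(r"\$([\d,]+(?:\.\d{2})?)", text)
    date = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)
    vendor = re.search(r"from ([A-Z][\w ]+?)(?: on|,|\.|$)", text)
    return {
        "amount": amount.group(1) if amount else None,
        "date": date.group(1) if date else None,
        "vendor": vendor.group(1) if vendor else None,
    }

note = "Received invoice from Acme Corp on 2021-03-15 for $1,250.00."
print(extract_invoice_record(note))
```

The point of the sketch is the interface, not the mechanism: whatever sits inside the extractor, its output is ordinary rows and columns that downstream analytics can consume.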

While each of the above can add significant value to the data-driven organization, none of them will by itself fundamentally change the business; carefully planned, however, they can provide strategic advantage. AI becomes a market disruptor only if that strategic advantage extends to a tipping point.

The implication is that when business leaders consider deploying AI, they must focus on what adds value to the kind of business they are in, and employ AI in a contextual fashion rather than in a big-bang “holistic” way.

Myth 2: An AI Platform is All You Need

The term platform has come to mean many things to many people, but generally it implies an environment that combines development and deployment of applications, as with PaaS. Deep learning based AI needs open source development tools like TensorFlow and PyTorch, deployment tools like TensorFlow Serving and TorchServe, as well as containerization tools like Docker. Various model zoos host pretrained models that are sometimes useful out of the box, but mostly serve as the starting point for transfer learning, explained later on. One way to look at an AI platform is as a collection of the above items, equipped in addition with machine learning lifecycle management tools like MLflow. Furthermore, looking at the requirements of data governance along the same lines as big-data analytics, one might treat the above tools as “bolt-on” modules on a traditional data science platform.

However, for many caught up in the current AI hype, the expectation of an AI platform has skyrocketed into a know-all virtual entity that internalizes all enterprise data and automatically cooks up solutions to all its information problems. That is a far cry from the collection of tools and models that is the current reality. It is important for enterprise decision makers to understand that most current AI is extremely task-specific, and generating such models requires focused data and modeling work, none of which comes about in any automated fashion.

Enterprise decision makers need to start with the consideration of AI solution candidates for high ROI problems, and work out the means of deployment in the organization once AI models are successfully generated. The AI platform can be restricted to items that are immediately needed for the problem at hand, and expanded gradually as the deployed AI shows returns and begs expansion to a larger scope.

Myth 3: The AI Model is the Complete Solution

AI models provide building blocks for solutions to problems in the enterprise, but the solutions are usually much more than their standalone components. For one, the solutions usually require customization to specific operating requirements. Frequently, simplifications in the solution architecture emerge as a result -- sometimes to the point where conventional machine learning techniques suffice to solve the problem at hand. There is the larger question of how the AI modules are integrated into the overall workflow, and of how enterprise users access and utilize the solution as part of day-to-day operation once it is deployed. It should be clearly understood that, at the end of the day, it is the business KPIs that are of fundamental importance; the end goal of every AI based solution is to meet these, rather than the narrower goals associated with model training such as precision and recall scores.

It is also important to realize that deep learning based AI, pretty much like the generic class of machine learning models, is liable to make incorrect decisions every once in a while. The enterprise AI designer has to take cognisance of this, and build mitigating mechanisms comprising uncertainty assessment associated with the AI models and the backup actions associated with model failure. One common mechanism is to enable human scrutiny for cases where the AI’s output is doubtful.
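The human-scrutiny mechanism just described can be sketched in a few lines. This is an illustrative example, not from the article: the function name and the 0.9 threshold are assumptions, and in practice the threshold would be tuned against the business KPIs and the cost of human review.

```python
# Illustrative sketch: route low-confidence AI predictions to human review.
# The threshold value (0.9) is an assumption and would be tuned in practice.

def route_prediction(label: str, confidence: float, threshold: float = 0.9):
    """Accept the model's output automatically only when it is confident;
    otherwise flag the case for human scrutiny, keeping the model's
    suggestion available to the reviewer."""
    if confidence >= threshold:
        return {"decision": label, "handled_by": "model"}
    return {"decision": None, "handled_by": "human_review", "suggested": label}

# A confident prediction is auto-accepted; a doubtful one is escalated.
print(route_prediction("approve_claim", 0.97))
print(route_prediction("approve_claim", 0.55))
```

Note that this presupposes the model exposes some confidence estimate at all -- which, as discussed under Myth 4, is itself often the hard part.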

Myth 4: AI Models are Always Superior to Other Kinds

While deep learning based AI is capable of providing superior models, it is certainly not a case of one-size-fits-all. For one thing, deep learning may not be a suitable option where a sufficient quantity of quality data is not available for training -- something we will explore further in the next section. But even where the availability of quality training data is not a problem, deep learning does not always generate good models, and limited model interpretability hampers the distinction between good and bad.

One big problem with deep learning is that the models perform well only for the kinds of inputs that were presented in the training set. For this reason, such AI may fail completely when confronted with novel circumstances, for example airline pricing models in the wake of the COVID-19 pandemic. For everyday operation of such AI models, it is vital to identify when the deep learning model is operating on input data that does not agree with what the model saw during training. Since most deep learning models today do not provide an estimate of uncertainty by default, it is never easy to determine when the model is producing a wrong result, even for models that show a high degree of accuracy on average. These problems often multiply with the large models common in natural language processing.
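One simple guardrail for the "input data does not agree with training" problem can be sketched as follows. This is a hypothetical illustration: real systems use far more sophisticated drift and out-of-distribution detectors, and the 3-sigma rule on a single feature is only a stand-in.

```python
# Hypothetical sketch: flag inputs lying far outside the range seen during
# training, using per-feature statistics recorded at training time.
# The 3-sigma rule is an illustrative assumption, not a recommendation.
import statistics

def fit_input_monitor(training_values):
    """Record the mean and standard deviation of a training-time feature."""
    return statistics.mean(training_values), statistics.stdev(training_values)

def is_out_of_distribution(value, mean, stdev, n_sigmas=3.0):
    """Flag inputs more than n_sigmas standard deviations from the mean."""
    return abs(value - mean) > n_sigmas * stdev

# Example: feature values observed in training vs. a post-shock input.
mean, stdev = fit_input_monitor([100, 110, 95, 105, 102, 98])
print(is_out_of_distribution(104, mean, stdev))   # within the training range
print(is_out_of_distribution(500, mean, stdev))   # far outside it
```

When the flag fires, the sensible action is not to trust the model's output but to fall back on the mitigation mechanisms discussed under Myth 3.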

Some deep learning models are also limited by how they are structured. Returning to natural language processing, Transformer based models derived from BERT, although extremely capable, are limited by the size of the text that can be input to them. Last year a new model called GPT-3 generated a lot of hope and expectation regarding the breadth of NLP operations it could handle. Unfortunately, its very strength, the capability of general document completion, is also its Achilles' heel: one never knows when it is generating a usable construct and when it is fabricating a story.

Yet another problem with deep learning models is that biases in the training data inevitably seep into the models. Methods have been devised to detect biases in data sets using statistical analyses and to fix some of these issues. However, for large NLP models, there is currently no method to keep out the biases, since it is impossible to curate text data sets that are today almost as large as the total bulk of text available on the Internet. This is the issue that Timnit Gebru's recent controversial paper expanded upon.

Myth 5: Any Enterprise Data Can be Used for AI Modeling

Human beings can learn effectively from a small number of examples, but that is not a quality shared by deep learning based AI. Rather, the prerequisite for deep learning is the availability of vast quantities of high quality data. It should also be understood that most deep learning models today require data samples in the form of input-output pairs; for classifiers, the outputs are termed labels. Unless carefully planned as part of the data design, labels often have to be generated manually, an arduous and costly exercise. This in turn often limits the size of the data that can be used for training. However, deep learning does not work well with small training sets: large models tend to “overfit” under those circumstances, while reducing model size restricts modeling capacity. All of this means that the availability of a sufficient quantity of good quality training data is a strong limiter of deep learning based AI, and that the prerequisite for enterprises to harness AI is to ensure the quality and quantity of labeled data.

There are ways to train deep models using small data sets, but these come with their own limitations. The most common is transfer learning, which involves retraining a model that was originally developed on a similar data set for a different target. However, to date there are no foolproof methods to identify data sets that will necessarily provide good transfer learning performance for a given task.
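The mechanics of transfer learning can be sketched conceptually as follows. This is a plain-Python illustration under stated assumptions: in practice the frozen feature extractor would be a pretrained network loaded from, say, PyTorch or TensorFlow, and the toy data and function names here are hypothetical.

```python
# Conceptual sketch of transfer learning: a FROZEN pretrained feature
# extractor plus a small, trainable task-specific head. Only the head's
# parameters are updated, so a small labelled data set can suffice.
import math

def pretrained_features(x):
    """Stand-in for a frozen pretrained extractor; never updated below."""
    return [x, x * x]

def train_head(data, labels, lr=0.1, epochs=500):
    """Train only a logistic-regression head on top of the frozen features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            f = pretrained_features(x)
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid
            g = p - y                           # gradient of the log-loss
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    z = sum(wi * fi for wi, fi in zip(w, pretrained_features(x))) + b
    return 1 if z > 0 else 0

# A tiny labelled set suffices because the hard part, feature extraction,
# is inherited from the pretrained model.
w, b = train_head([-2, -1, 1, 2], [0, 0, 1, 1])
print([predict(x, w, b) for x in [-2, -1, 1, 2]])
```

The caveat in the text still applies: this only works when the pretrained features are actually relevant to the new task, and there is no foolproof way to know that in advance.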

A second approach is to use techniques known as semi-supervised and self-supervised learning, but these are still largely experimental and do not provide general purpose solutions.

The issues with data are not limited to quantity alone. Frequently, the available data is imbalanced with respect to the training classes, and this requires further calculated refinements to model training. Biases hidden in the data set concerning age, gender, race or religious affiliation can produce models that are partial in their response to certain categories of people, and these require systematic analysis and normalization. Text and image data sets often require manual curation that can be at once expensive and labour intensive.
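One of the "calculated refinements" for class imbalance mentioned above is inverse-frequency class weighting, sketched here. The normalisation scheme is one common convention (similar in spirit to scikit-learn's "balanced" heuristic), not the only option; the weights would typically be fed to the loss function during training.

```python
# Illustrative sketch: inverse-frequency class weights for imbalanced data.
# Rare classes get larger weights so the loss does not ignore them.
from collections import Counter

def class_weights(labels):
    """Weight each class inversely to its frequency, normalised so the
    per-sample weights average to 1 over the data set."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

# Example: a 9-to-1 imbalanced binary data set.
labels = ["ok"] * 9 + ["fraud"] * 1
print(class_weights(labels))  # the rare "fraud" class gets weight 5.0
```

Weighting is only one tool; resampling, focal losses and threshold tuning are alternatives, and none of them fixes biases hidden in how the data was collected in the first place.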

Some propose that data for deep learning can be synthetically generated using other deep learning models. This is not a practicable approach in general, since it would take even more data to train a model that generates synthetic data with the right distribution; in other words, building the synthetic data generator may be just as difficult as training the model itself.

Myth 6: AI Models are Easy to Generate

As a result of the interest that deep learning has generated in the technical community, solutions to a variety of problems of interest to enterprises are constantly being attempted with this method. It is not uncommon today to find a deep learning model that solves a given problem through mere Googling. “Model zoos” containing pretrained models for specific problems, recipes galore and the promise of “AutoML”, or automated machine learning, all tend to imply that, given a certain amount of training data, deep learning models are easy to generate; they also give the impression that domain knowledge is redundant. The result is that newcomers today throw deep learning models at problems before other approaches are even evaluated.

The reality is quite different. For one, deep learning architectures and training mechanisms have become diverse and complicated. For any given problem, architectural options, loss functions and training methods have to be carefully chosen and tailored to the quality and quantity of data. Failure to make the right choices will show up as poor model performance, and this will not go away simply by tossing in more training data. A non-performing deep learning model can be a hard nut to crack, particularly since the main knobs available to model builders are the data composition, the hyperparameters for model training, the network size and the training regime; moreover, unlike conventional models, these models are not readily interpretable and therefore not amenable to domain-knowledge driven modifications. AutoML solves the training problem for restricted categories of deep learning models only, and is primarily used to determine network size and autotune training hyperparameters; it therefore solves only a small part of the larger training problem. The interested reader can find a more detailed description in this article.
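To make concrete how narrow the AutoML slice is, here is a minimal sketch of its hyperparameter-tuning core: trying candidate settings and keeping the best by a validation score. The `train_and_score` function is entirely hypothetical, a made-up stand-in for an expensive training run; real AutoML also uses smarter search than the exhaustive grid shown.

```python
# Minimal sketch of the hyperparameter-search slice of AutoML: exhaustively
# evaluate candidate configurations and keep the best-scoring one.
# train_and_score is a hypothetical stand-in for a real training run.
import itertools

def train_and_score(lr, hidden_size):
    """Pretend validation score, peaking at lr=0.01, hidden_size=64."""
    return -((lr - 0.01) ** 2) * 1e4 - ((hidden_size - 64) ** 2) / 1e3

def grid_search(lrs, sizes):
    """Return the (lr, hidden_size) pair with the best validation score."""
    return max(itertools.product(lrs, sizes),
               key=lambda cfg: train_and_score(*cfg))

print(grid_search([0.001, 0.01, 0.1], [32, 64, 128]))  # -> (0.01, 64)
```

Everything else listed above -- data composition, loss design, architecture choice, diagnosing a non-performing model -- remains outside the loop and still requires expert judgment.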

Myth 7: AI Models are Intelligent Enough to Work in Isolation

A dismal view that often permeates the enterprise workforce is that AI will eventually replace all human actors. However, this is unlikely, for two reasons.

First, the minor one: AI based solutions, as mentioned before, are seldom 100% accurate in what they do. For example, a company trying to automate its call-centers with intelligent chatbots will probably find 80% of its traffic handled by these, whereas the remaining 20% of more difficult and specialized cases will still have to be handled by human experts. So while the workload of the attending workforce may shrink for easily automatable tasks, there will continue to be a requirement for more specialized roles. With increasing use of AI and ML, there will also be additional roles associated with verifying the models’ performance on an ongoing basis through restricted sampling of data, and triggering model updates through relearning.

The major reason why humans will continue to be needed, irrespective of the scale of AI, is that decisions and the responsibility for them should rest only with humans. The personal view that I subscribe to as an AI designer is that, rather than replacing humans altogether, AI should be designed to make individual humans productive well beyond their current levels, and to equip decision-makers with information and consultation on options at their fingertips. This is similar to the way clinical decision support systems have included AI for radiological analysis: although the analysis performed by the AI system is frequently better than what a lone specialist can do, the latter always carries the responsibility for the final decision, which might even mean completely overturning the recommendations of the radiological AI system. In short, this approach to designing AI based solutions aims to provide superior tools to the human actor, and keeps her in the driver's seat.

Conclusion

When enterprise decision makers look forward to launching their AI journey, they should do so in measured steps, deploying AI solutions as extensions to their data science problems and verifying the value added with each deployment. Used for the right purposes and with the right planning, AI will generate rich dividends, but its deployment should be guided by notions of return on investment and the availability of adequate quality data. AI based solutions will not serve organizational goals out of the box; they will need customization and integration into the enterprise environment. The generation of AI solutions also requires that the solution teams have a high degree of competence in the deep learning area, to make judicious modeling choices and troubleshoot performance issues. AI planners need to be aware of model inadequacies arising from biases embedded in the data, model uncertainty, and the limitations of the AI models deployed. Last but not least, AI solutions should be built around people, to enhance the value they contribute through a combination of information, analysis and recommendation.

Disclaimer: The views expressed in this article are the author’s personal views, expressed in a personal capacity.

