Artificial Intelligence and Data Science in clinical research and healthcare – a Data Scientist's perspective from the trenches
Abstract
Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) are three hot buzzwords these days. Then, when we add “Data Science” to this mix, things get even more interesting in so many ways.
In this article, my promise to the reader is to explain AI, ML and DL with examples, without invoking abstract math, statistics, computer algorithms or simulation techniques. AI is too important to have its juice drained by being associated solely with math, stats and advanced algorithms - no fun there.
Another goal of this article is to cover enough on these topics to appeal to readers of various needs and appetites. This constituency of readers is a big umbrella, consisting of technology executives, aspiring data scientists, data scientist practitioners, data scientist experts, data engineers, cloud engineers, business analysts, big data programmers and open source technology hawks, just to name a few.
For business and technology executives, this article gives example-based definitions of AI, ML, DL and DS, which they can start to connect and understand from a business and technology perspective. If things become too abstract, they don't find it interesting. In their eyes - the bottom line is “Show me the money and how it will help grow the business.” Period. This is one main reason this article is example-oriented.
For AI practitioners, this article talks about what a "data product" is, and the process and software engineering discipline required to create such AI-driven smart and intelligent data products.
For AI experts and evangelists, this article devotes a full section to how to institute an AI strategy in a methodical way and build a data science practice from a practitioner's point of view. While there is no silver bullet - each experience is different, each predictive algorithm has nuances and each data product is inherently unique - the good news is that we have proven methods, techniques, processes and software engineering disciplines which, when collectively employed, can make a difference while executing an AI strategy within an enterprise. AI is disruptive; just think of what Amazon and Google are doing.
For technologists and platform experts, this article includes, as a reference, a section on modern AI tools, software and open source platforms, with links for additional details and research.
For generalists, this article presents an overview of AI, ML, DL and DS.
What this article is NOT intended to cover, however, is algorithm and model design and development, data integration pipelines, how to train-test-validate a model, how to make a model part of an application, and how a model can be part of a microservice enabling data integration and application integration using a services-based architecture. For that, you will need to wait for my next article on this topic, which focuses on HOW.
Introduction
People, and many times even AI practitioners, use AI, ML and DL interchangeably, but as we will soon see, while these terms are related and often build on each other, by no means can they substitute for each other. Each has its own lane, its own progression, its own innovation - this is a critical take-home point.
As I define AI, ML and DL within the broader confines of Data Science, I will also explain best practices around these topics. Most importantly, I will cover what it takes for a company to be successful in strategizing an AI and Data Science practice, and in executing that strategy and data vision to create AI-driven data products on sophisticated, scalable data processing platforms - preferably deployed in the cloud - for data monetization. At the end of the day, commercially speaking, AI-driven data products need to solve real business problems, and this article will give you a path to accomplish exactly that.
It might be surprising to hear a Chief Data Officer say “Data is the problem”. But, I want to explain, in context, that data is opportunity as well.
For example, data volume can create problems. Data variety (transactions, XML, JSON, flat files, binary, images, structured and unstructured) can create issues. Data quality can put a dent in your business processes. And data veracity can instill sustained quality issues.
Data volume can also create scaling issues within your systems, particularly when data flows from devices and sensors at enormously rapid speed, so you need a scalable data intake and processing architecture. Not to mention, real-time data is inherently messy, with data gaps and data quality issues. But there are opportunities in dealing with the complexities of the volume, variety and veracity of data, including the opportunity to extract insight from these myriad data points and, through the application of AI, ML, DL and DS, to find trends and correlations that would otherwise stay hidden.
Artificial Intelligence and Algorithms are all around us, whether we see them or not. Successful companies such as Google, Facebook, eBay, Amazon, Netflix and Nvidia have built their businesses by exploiting the full potential of AI. Artificial Intelligence has swiftly reached our daily lives, and before we could realize the potential of AI, it has already impacted our lives beyond measure.
There are robotics systems with AI being used daily to tackle difficult situations such as fire-fighting. We have advanced vacuum cleaners, dishwashers and lawn mowers with intelligent features that perform household chores with minimum human input. Sophisticated burglary alarm systems keep our homes, cars & precious belongings safe. Consider how Google’s proprietary algorithm in the driverless car functions as the connective tissue that combines the software, data, sensors and physical asset into a true leap forward in transportation. After all, what makes Google one of the most valuable brands in the world? It isn’t data; it’s the company’s most closely guarded secret, its algorithms.
High frequency trading is another example. A trader’s unique algorithm drives each decision that generates a higher return than competitors. The algorithm trumps the data that it accesses. Google uses machine learning to filter out spam messages from Gmail. Facebook trained computers to identify specific human faces nearly as accurately as humans do. Deep learning is used by Netflix and Amazon to decide what you want to watch or buy next.
What about roboadvisors and robotraders in the finance industry, chatbots and voicebots (a.k.a. conversational AIs) and personal buying assistants in retail, and medical diagnostics, remote patient monitoring and AI tutors for personalized education? This demonstrates how pervasive AI and ML have already become in our lives.
Let's define these buzzwords…
At its simplest form, if I were to create three concentric circles to define AI, ML and DL, it may look like this – ML is a subset of AI and DL is a subset of ML. AI is basically the superset that covers both ML and DL. While they are related, they are definitely not the same and should not be used interchangeably.
Artificial Intelligence – It is the broad umbrella term for attempts to make computers think the way humans think, to have machines simulate the kinds of things that humans do, and ultimately to solve problems better and faster than we do. AI itself is a rather generic term for solving tasks that are easy for humans but hard for computers. AI is not new - what is new is its sudden breadth of application.
Artificial Intelligence is often classified into two fundamental groups – applied or general. Applied AI is far more common – systems designed to intelligently trade stocks and shares, or maneuver an autonomous vehicle, would fall into this category. Generalized AI – systems or devices that can in theory handle any task – is less common, but this is where some of the most exciting advancements are happening today. It is also the area that has led to the development of machine learning. Though machine learning is often referred to as a subset of AI, it is more accurate to think of it as the current state of the art.
Machine Learning - At its most basic, ML is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world. So rather than hand-coding software routines with a specific set of instructions to accomplish a particular task, the machine is "trained" using large amounts of data and algorithms that give it the ability to learn how to perform the task. Two important breakthroughs led to the emergence of machine learning as the vehicle driving AI development forward at its current speed. One was the realization – credited to Arthur Samuel in 1959 – that rather than teaching computers everything they need to know about the world and how to carry out tasks, it might be possible to teach them to learn for themselves. The second was the emergence of the internet, and the huge increase in the amount of digital information being generated, stored and made available for analysis. Once these innovations were in place, engineers realized that rather than teaching computers and machines how to do everything, it would be far more efficient to code them to think like human beings, and then plug them into the internet to give them access to all of the information in the world.
The development of neural networks has been key to teaching computers to think and understand the world in the way we do, while retaining the innate advantages they hold over us such as speed, accuracy and lack of bias. A neural network is a computer system designed to work by classifying information in the same way a human brain does. It can be taught to recognize, for example, images, and classify them according to elements they contain. Essentially it works on a system of probability – based on data fed to it, it is able to make statements, decisions or predictions with a degree of certainty. The addition of a feedback loop enables “learning”; by sensing or being told whether its decisions are right or wrong, it modifies the approach it takes in the future.
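The feedback-loop idea described above can be sketched in a few lines of code. Below is a minimal, illustrative perceptron - a single artificial neuron - that learns the logical AND function by nudging its weights whenever its prediction is wrong. The data, learning rate and function names are my own illustrative choices, not part of any particular library or method discussed in this article.

```python
# A minimal sketch of a learning feedback loop: a single artificial
# neuron (perceptron) learns the logical AND function by adjusting
# its weights whenever its prediction is wrong.

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of inputs crosses the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train(samples, epochs=20, lr=0.1):
    """The feedback loop: error = expected - predicted drives each update."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, expected in samples:
            error = expected - predict(weights, bias, inputs)
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias

# Truth table for AND: only the input (1, 1) should fire.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # -> [0, 0, 0, 1]
```

Note how no rule for AND is ever written down: the correct behavior emerges purely from repeated feedback on right and wrong answers, which is the essence of "learning" described above.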
Simply put, humans can expand their knowledge to adapt to the changing environment. To do so they must “learn.” Learning can be simply defined as the acquisition of knowledge or skills through study, experience, or being taught. Although learning is an easy task for most people, to acquire new knowledge or skills from data is complicated for machines. Moreover, the intelligence level of a machine is directly relevant to its learning capability. The study of machine learning tries to deal with this complicated task. In other words, machine learning is the branch of artificial intelligence that tries to find an answer to this question: how can we make a computer learn?
Machine learning applications can read text and work out whether the person who wrote it is making a complaint or offering congratulations. They can also listen to a piece of music, decide whether it is likely to make someone happy or sad, and find other pieces of music to match the mood. In some cases, they can even compose their own music expressing the same themes, or music likely to be appreciated by admirers of the original piece. Along these lines, another field of AI – Natural Language Processing (NLP) – has become a source of hugely exciting innovation in recent years, and one which is heavily reliant on ML. NLP applications attempt to understand natural human communication, either written or spoken, and communicate with us in return using similar, natural language. ML is used here to help machines understand the vast nuances in human language.
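To make the complaint-versus-congratulations example concrete, here is a deliberately crude sketch of text sentiment classification using hand-made word lists. Real NLP systems learn word associations from data rather than relying on hard-coded lexicons; the word lists, function name and labels below are purely illustrative assumptions.

```python
# A toy sentiment classifier: score text by counting words from
# hand-made positive/negative lexicons. Real systems learn these
# associations from data; the lexicons here are illustrative only.

POSITIVE = {"great", "congratulations", "excellent", "thanks", "love"}
NEGATIVE = {"complaint", "broken", "terrible", "refund", "disappointed"}

def classify(text):
    """Return 'complaint' or 'praise' based on a crude lexicon score."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "praise" if score >= 0 else "complaint"

print(classify("My order arrived broken and I want a refund"))  # -> complaint
print(classify("Congratulations, the service was excellent"))   # -> praise
```

An ML-based system would replace the fixed word sets with weights learned from thousands of labeled examples, which is precisely what lets it handle nuance a hard-coded list cannot.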
Deep Learning - It is a subset of ML. It uses certain ML techniques to solve real-world problems by tapping into neural networks that simulate human decision-making. Deep learning can be expensive, and it requires massive datasets on which to train itself. That's because there is a huge number of parameters that need to be understood by the learning algorithm, which can initially produce many false positives. For instance, a deep learning algorithm could be instructed to "learn" what a cat looks like. It would take a massive data set of images for it to understand the minor details that distinguish a lion from, say, a cheetah, a panther or a fox. Essentially, deep learning involves feeding a computer system a lot of data, which it can then use to make decisions about other data.
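One way to see why the "deep" in deep learning matters is that stacked layers of simple units can compute functions that no single unit can. The sketch below hand-wires a tiny two-layer network that computes XOR, a classic function a lone neuron cannot represent. In a real deep network these weights would be learned from large datasets; the hand-set weights here are purely illustrative.

```python
# Why "deep" means stacked layers: two layers of simple threshold
# units compute XOR, which no single unit can. In a real network
# these weights are learned; here they are hand-set to show structure.

def unit(weights, bias, inputs):
    """One artificial neuron: weighted sum, then a hard threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: one unit detects "x1 OR x2", another "x1 AND x2".
    h_or = unit([1, 1], -0.5, [x1, x2])
    h_and = unit([1, 1], -1.5, [x1, x2])
    # Output layer: fire when OR is on but AND is off -> XOR.
    return unit([1, -1], -0.5, [h_or, h_and])

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # -> [0, 1, 1, 0]
```

Deep networks extend this same idea to many layers and millions of learned weights, which is why they need the massive training datasets mentioned above.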
Let's look at a few practical examples of deep learning in action.
Navigation of self-driving cars – Using sensors and onboard analytics, cars are learning to recognize obstacles and react to them appropriately using deep learning.
Precision medicine – Deep learning techniques are being used to develop medicines genetically tailored to an individual’s genome.
Automated analysis and reporting – Systems can analyze data and report insights from it in natural sounding, human language, accompanied with infographics which we can easily digest.
Game playing – Deep learning systems have been taught to play (and win) games such as the board game Go, and the Atari video game Breakout.
AI mindmap is a good way to understand the various branches of AI. The following mindmap will give you the lay of the land - it's an emerging and disruptive area of computer science, and there is a lot of advanced research taking place, not only in academia but also in companies like Google, Amazon, IBM, eBay, Netflix and Nvidia, just to name a few.
Below is a picture of the evolution of AI over time - thanks to the NVIDIA website for this image.
AI in Healthcare and Clinical Industry:
Here is a brief account of ML in medicine:
Let’s review some emerging areas within healthcare and clinical research where AI is being used to drive down costs and improve efficiencies.
Personal health virtual assistant - Today, when everyone from youngsters to seniors is hooked to a smartphone, most people rely on the intelligent personal virtual assistants on their devices for day-to-day needs. Powerful systems with robust AI capabilities, such as Cortana and Siri, back these virtual assistants. When combined with healthcare apps, these systems have the potential to deliver incredible value. Healthcare apps can not only monitor patient health but also provide medication alerts and patient education material, and analyze a patient's state of mind through human-like interactions. The evolution of AI as a personal assistant will have a critical impact on better understanding patient issues and assisting patients with their immediate needs when clinical personnel are not accessible.
Advanced analytics and research - AI is not just limited to comprehending human commands and providing intelligent responses; it is also crucial for advanced analytics and research in healthcare, providing concrete diagnoses and recommending personalized treatments.
Machine learning is a rapidly blossoming area of AI that gives computers the ability to learn without being explicitly programmed. For instance, ML uses intelligent image processing in oncology to detect abnormalities in test reports. Another fast-evolving AI arena is Natural Language Processing (NLP), the ability of a computer program to understand unstructured spoken or written input. NLP significantly helps in deciphering physicians' notes and narratives and in building electronic health records.
Similarly, AI is being used to conduct complex computing in genomics, helping doctors provide highly customized treatments to patients in the field of precision medicine. Moreover, AI has enabled physicians to improve treatment regimens by applying patient behavioral insights based on patients' social media activities.
Personal life coach - Providing appropriate treatment and medication is not sufficient on its own; more and more healthcare providers are recommending that patients maintain contact with their physicians even outside the clinic or hospital. Many hospitals have started providing life coaching as part of their total care programs, but due to dwindling reimbursements in current times, it's hard to extend such programs.
But the future seems bright, with advanced AI capabilities and mobile apps that will make it easy for patients to receive feedback on the various health data elements captured on their phones or wearable devices - be it adherence to medication, following a physician's recommendations, or encouraging fitness activities and healthy habits. As a personal life coach, AI has the capability to provide a highly personalized experience for every patient and to generate proactive alerts that are sent to physicians regularly.
Healthcare bots - AI is gaining significance in the area of customer service too, and healthcare bots will soon be commonly available to make engagement with patients easier and more effective. As an AI application, a bot can intelligently interact with patients through a chat window on a website or via telephone, helping them with their requests.
Bots are primarily being used for scheduling follow-up appointments with a patient's healthcare provider online, but they can also be used in other situations, such as helping patients with their medication or billing needs. Bots can significantly reduce overall administrative costs for hospitals and improve customer service by offering 24/7 assistance for patient requests such as scheduling, billing and other clinical needs.
AI is poised to have a broad range of applications across the healthcare spectrum, including patient engagement and patient relations, chronic disease management, clinical decision support, sophisticated big data mining and analytics for diagnostics, and population health management. Artificial intelligence can thus not only improve care delivery but also support clinician decision-making and operational efficiency, enhancing the reach of healthcare providers.
How to build an AI strategy and data science team: Lessons from the trenches
As Chief Data Officer, I always wear at least two hats ― a business hat and a technology hat ― when I make a decision. When I wear my business hat, I don't need to be a mathematics genius or have a PhD in software engineering to make sense of AI for our business at ERT, for example. Instead, I must be able to draw on the experience and expertise I already have in our business domain, in two key ways. First, I must be able to assess which business outcomes would benefit most from AI, ML and DL. Second, I can always evaluate AI as simply the latest advanced analytical technology that might help achieve those outcomes. Given the rapid adoption of cloud and commodity hardware along with open source tools and technologies, I don't have to make massive investments in infrastructure and personnel in order to start applying AI's potentially transformative technologies to our business. Sounds simple, doesn't it? But creating an AI strategy for the enterprise and then executing it is no small endeavor. My fellow data scientists will attest to this in a heartbeat. While this is no small task, it is not insurmountable either. Below I lay out a step-by-step process for putting together an actionable strategy for AI, ML and DL.
Step 1: Define and articulate AI strategy and its impact on business:
Step 2: Develop a clear line of sight to business value - Start by assessing the relevance of AI to your most important business outcomes and how it can fuel new data-driven capabilities, as well as how it relates to specific operational and IT challenges. Many organizations become enamored with AI capabilities but, in the process, fail to identify the most strategic value drivers. AI creates the potential for data-driven business strategies. This makes data and analytics a primary driver of strategy, which in turn mandates a more expansive examination of AI's potential. Prepare for the organizational, governance and technological challenges imposed by AI. Focus on developing a data-driven culture, data science skills and the ability to "speak data" from a business perspective. Be mindful of regulatory and ethical considerations for AI-driven products and applications.
Step 3: Identify the Business Use Case for AI
- Focus on WHY, then WHAT and then HOW in that order
- Define the problem; Define the Use Case
- Define how you are going to measure AI success (new product, new app, customer experience, automation, efficiency)
- Define MVP (Minimum viable product) and iterate over it quickly
- Look for early wins – win often
- Fail fast – try out alternative – don’t be stubborn
A high-level AI development process looks something like this. Details may vary, but the process flow captures the high-level constructs.
Up to this point, we have covered definitions of AI, ML and DL - the secret sauce is Data Science.
Data Science:
Data Product:
Data Products, derived from the principles of data science, tie all of these together: AI-driven data products are the vehicle for data monetization. Data Science, as you will see below, is an interdisciplinary field. Not only does a Data Science team need AI, ML and DL experience, it also needs knowledge of data, programming, automation, cloud and big data - knowledge of statistics and algorithms alone is not nearly sufficient. From my experience, most people miss this salient point. To build a data science team, one needs to build a powerful cross-functional team consisting not only of data scientists, but also statisticians, computer programmers, automation engineers, cloud engineers, data integration engineers, business analysts, business domain experts and data quality analysts, just to name a few. If I were to summarize a good data science team, it might look something like the picture below - data science is a cross-section of many fields. To be successful, one must understand this - let's not just go hire data scientists and think we have a data science team. We don't.
Data Science, and the data products created by applying its principles, follow a very strict and regimented process. A high-level process may look something like the following:
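As a toy illustration of one core piece of that regimented process - holding out test data and judging a model only on data it never saw during fitting - here is a minimal sketch. The threshold "model", the synthetic data and all names below are illustrative assumptions, not part of any specific methodology described in this article.

```python
# A compressed sketch of the train/test discipline: fit a trivial
# threshold "model" on one slice of data, then score it only on a
# held-out slice it never saw. Data and model are illustrative only.

import random

def fit(train_rows):
    """'Train': put the threshold halfway between the two class means."""
    lo = [x for x, label in train_rows if label == 0]
    hi = [x for x, label in train_rows if label == 1]
    return (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2

def accuracy(threshold, rows):
    """Fraction of rows where the threshold rule matches the label."""
    return sum((x > threshold) == bool(label) for x, label in rows) / len(rows)

random.seed(7)
# Synthetic 1-D data: class 0 clusters near 1.0, class 1 near 3.0.
data = [(random.gauss(1.0, 0.4), 0) for _ in range(100)] + \
       [(random.gauss(3.0, 0.4), 1) for _ in range(100)]
random.shuffle(data)

train, test = data[:150], data[150:]        # hold out 25% for testing
threshold = fit(train)
print(round(accuracy(threshold, test), 2))  # accuracy on unseen data
```

Real projects add cross-validation, feature engineering and model monitoring on top of this skeleton, but the principle - never evaluate on the data you trained on - stays the same.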
Artificial Intelligence tools, technologies and platforms:
Practical machine learning development has advanced at a remarkable pace. This is reflected by not only a rise in actual products based on, or offering, machine learning capabilities but also a rise in new development frameworks and methodologies, most of which are backed by open-source projects.
Practical advice - developers and researchers beginning a new project can easily be overwhelmed by the choice of frameworks on offer. These new tools vary considerably, and striking a balance between keeping up with new trends and ensuring project stability and reliability can be hard.
Software libraries
- Deeplearning4j, an open-source, distributed deep learning framework written for the JVM.
- Mahout, a library of scalable machine learning algorithms.
- OpenNN, a comprehensive C++ library implementing neural networks.
- TensorFlow, an open-source software library for machine learning.
- Torch, an open-source software library for machine learning.
- PyTorch, an open-source library providing tensors and dynamic neural networks in Python.
GUI frameworks
- Neural Designer, a commercial deep learning tool for predictive analytics.
- Neuroph, a Java neural network framework.
- OpenCog, a GPL-licensed framework for artificial intelligence written in C++, Python and Scheme.
- RapidMiner, an environment for machine learning and data mining, now developed commercially.
- Weka, a free implementation of many machine learning algorithms in Java.
Cloud services
- Data Applied, a web based data mining environment.
- Grok, a service that ingests data streams and creates actionable predictions in real time.
- Microsoft Cognitive Services, cloud-based APIs that you can embed into your apps for computer vision, NLP, search, and more.
- Watson, a pilot service by IBM to uncover and share data-driven insights, and to spur cognitive applications.
Conclusion:
“Algorithm marketplaces are similar to the mobile app stores that created the ‘app economy,'” Alexander Linden, research director at Gartner said. “The essence of the app economy is to allow all kinds of individuals to distribute and sell software globally without the need to pitch their idea to investors or set up their own sales, marketing and distribution channels.”
All the data in the world isn’t very useful if you can’t leverage it. Algorithms are how you efficiently scale the manual management of business processes. With the purposeful use of Artificial Intelligence, all healthcare stakeholders benefit. AI algorithms that solve specific problems that translate into actions – will be the secret sauce of successful organizations in the future.
AI is a mature domain. Given the innovations in cloud computing and the easy availability of commodity hardware, along with plenty of open source tools, technologies and software systems, AI is fast becoming mainstream. These days, any startup with a minimal amount of investment can start thinking about creating AI-driven data products that can potentially disrupt an entire industry in no time.
Let me conclude with the AI mind-map again, just to remind you that Artificial Intelligence is a very large, deep and multifaceted discipline. This is a great time to be a Data Scientist, given all the innovations around tools, computing, technologies and the open source revolution - what a time to be a part of this algorithm journey.