From Statistics to Deep Learning: The Evolution of AI Explainability
Sebastian Obeta
Artificial intelligence (AI) has undergone significant evolution, moving from conventional statistical approaches to intricate deep learning models. As AI systems increasingly permeate various aspects of our daily lives, the imperative to comprehend and elucidate their decision-making processes has become more pronounced.
This shift in emphasis from opaque models to transparent and interpretable AI has given rise to the burgeoning field of AI explainability.
Reflecting on the early days of AI reveals the dominance of statistical methods and rule-based systems. These initial models were often straightforward and interpretable, facilitating human understanding of the rationale behind their decisions. This article delves into the journey from statistical models to deep learning, exploring how the pursuit of AI explainability has progressed alongside these advancements. The historical development of statistics indicates a trajectory towards clearer and more transparent analytical frameworks.
Nonetheless, there have been instances where the explainability of statistical models was limited, leading to misinterpretation and misapplication.
A comprehensive understanding of the historical challenges to explainability is crucial for effectively addressing contemporary issues associated with AI models.
Explaining Statistics: A Historical Perspective.
Statistics has a rich history that has evolved over time to meet the needs of society. The term “statistics” denoted the systematic collection of demographic and economic data by states in the 18th century. For at least two millennia, these data were mainly tabulations of human and material resources that might be taxed or put to military use. From its humble beginnings as a tool for government administration to its modern-day applications in science, business, and technology, the evolution of statistics has been shaped by the contributions of numerous thinkers and practitioners.
The roots of statistics can be traced back to ancient civilizations, where rulers and leaders recognised the importance of data in making informed decisions. The ancient Egyptians, for example, gathered data on agricultural production and population size for efficient resource allocation. Similarly, the Greeks, with their emphasis on observation and measurement, laid the groundwork for the systematic collection of data.
In the early 19th century, collection intensified, and the meaning of “statistics” broadened to include the discipline concerned with the collection, summary, and analysis of data.
Hence statistics, the science of collecting, analysing, interpreting, presenting, and organising data, has a rich and fascinating history spanning centuries.
Traditional Statistical Models.
The inception of probability, a foundational concept in statistics, unfolded during the 17th century, when French mathematicians and philosophers Blaise Pascal and Pierre de Fermat initiated a correspondence on games of chance, laying the groundwork for probability theory, a pivotal element in statistical analysis.
In the Enlightenment era of the 18th century, thinkers began applying mathematical principles to social phenomena. Later, in the 19th century, Sir Francis Galton, a cousin of Charles Darwin, played a significant role in developing correlation and regression analysis to study trait heritability. His contributions became foundational for modern statistical methods and for the emergence of the field of eugenics.
The 19th century witnessed the establishment of statistical societies and associations, promoting collaboration among statisticians and the exchange of ideas.
The Royal Statistical Society (RSS), founded in London in 1834, became a prominent hub for discussions on theories, methodologies, and applications. Other societies, including the American Statistical Association (ASA), were subsequently formed to advance the discipline.
The late 19th and early 20th centuries saw the evolution of inferential statistics, involving predictions or inferences about a population based on data samples. Sir Ronald A. Fisher, often considered the father of modern statistics, made ground-breaking contributions to experimental design and hypothesis testing, laying the foundation for the widely used p-value in scientific research.
The latter half of the 20th century witnessed a statistical revolution with the advent of computers. The capability to process large datasets and perform complex calculations accelerated the growth of statistical methods.
Techniques such as Bayesian statistics, machine learning, and data mining gained prominence, expanding the scope of statistical applications across diverse fields.
As data volume and complexity increased, traditional statistical methods gave way to machine learning (ML) algorithms. Traditional statistical methods faced limitations in handling complex and unstructured data, hindering their ability to capture intricate patterns and relationships. ML algorithms, particularly in supervised learning, demonstrated remarkable proficiency in learning patterns from data and making predictions.
The Statistical Symphony of ML.
Machine learning algorithms, including deep learning algorithms, often trace their origins to statistical concepts. They evolve by adapting statistical principles to address intricate problems and make predictions based on data. The formulas and methodologies employed in these algorithms are shaped by statistical theories and practices.
Linear Regression:
y = mx + b,
where y represents the target variable, x is the input feature, m denotes the slope, and b stands for the intercept. Machine learning extends this concept to multiple dimensions, accommodating multiple features.
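To make the extension concrete, here is a minimal sketch (using scikit-learn and synthetic data invented purely for illustration) that fits a single-feature line y = mx + b and then the same model with two features:

```python
# A minimal sketch: fitting y = mx + b with scikit-learn, then the same idea
# extended to several input features. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# One feature: y = 3x + 5 plus noise
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X[:, 0] + 5 + rng.normal(0, 1, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)   # recovered slope m and intercept b

# Multiple features: y = 2x1 - 4x2 + 1 plus noise
X_multi = rng.uniform(0, 10, size=(100, 2))
y_multi = 2 * X_multi[:, 0] - 4 * X_multi[:, 1] + 1 + rng.normal(0, 1, size=100)
print(LinearRegression().fit(X_multi, y_multi).coef_)  # one coefficient per feature
```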
Logistic Regression:
During my Data Science studies at London South Bank University, where I graduated with distinction, I was taught the Statistical Analysis and Modelling module by Dr Christos Chrysoulas, and this article draws inspiration from that class. My recollection of logistic regression as a probability algorithm stems from a rather amusing association: the word 'probability' kept appearing across three different modules taught by world-class experts, namely Machine Learning by Dr Enrico Grisan, Data Mining by Dr Daqing Chen, and Statistical Analysis and Modelling by Dr Christos Chrysoulas. Faced with this recurrent theme, I came to think of logistic regression as a probability algorithm, a concept whose roots date back to the 19th century.
Statistical Roots: Despite its nomenclature, logistic regression is employed for classification problems, modelling the probability of an instance belonging to a specific class.
Formula Evolution: The logistic regression model relies on the logistic function (sigmoid function), which transforms a linear combination of input features into a range between 0 and 1. The formula is expressed as
P(y=1) = 1 / (1 + e^(-z)),
where z represents a linear combination of input features.
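As a rough illustration, the sketch below (scikit-learn, with synthetic data invented for this example) shows the sigmoid transformation and a logistic regression model that outputs class probabilities rather than hard labels:

```python
# A minimal sketch of the logistic (sigmoid) function and of logistic regression
# applied to a toy binary classification problem; the data is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

def sigmoid(z):
    """P(y=1) = 1 / (1 + e^(-z)), mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5: the decision boundary
print(sigmoid(4.0))    # close to 1
print(sigmoid(-4.0))   # close to 0

# Logistic regression learns the weights inside z = w.x + b from data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # a simple linearly separable rule

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[1.0, 1.0]]))    # class probabilities, not just a label
```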
Decision Trees:
I recall one of my mentees in the Applied Artificial Intelligence Society (University of Bradford) observing my passionate discussions about the decision tree algorithm during our community of practice meeting. Curious about my enthusiasm, they asked why I was so taken with it. I explained that Dr Christos Chrysoulas, the world-class expert who taught Statistical Analysis and Modelling at London South Bank University, had assigned me the task of learning and presenting the decision tree algorithm to my classmates and other lecturers. With no alternative, I immersed myself in learning and later stood before 25 students to discuss decision trees. I felt fulfilled as Dr Christos Chrysoulas posed questions, to which I responded confidently with only a few corrections, ultimately excelling in the module.
Statistical Roots: Decision trees are grounded in statistical concepts such as entropy and information gain. They iteratively split the data based on features that maximise information gain.
Formula Evolution: The algorithm progresses by selecting optimal features to split the data at each node, striving for the most informative divisions. This recursive process continues until a predefined stopping criterion is met.
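A minimal sketch of those two quantities, with a purely illustrative candidate split, might look like this (the entropy and information_gain helpers below are written from the standard definitions, not taken from any particular library):

```python
# A minimal sketch of the entropy and information-gain calculations that drive
# a decision-tree split; the candidate split shown here is purely illustrative.
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_child = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted_child

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # perfectly mixed: entropy = 1 bit
left   = np.array([0, 0, 0, 1])               # one candidate split
right  = np.array([0, 1, 1, 1])
print(information_gain(parent, left, right))  # the tree keeps the split with the highest gain
```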
Support Vector Machines (SVM):
Statistical Roots: SVMs are grounded in statistical learning theory, seeking the decision boundary that maximises the margin between classes.
Formula Evolution: The optimisation finds weights w and bias b that minimise ||w||^2 subject to the training points being correctly separated; kernel functions later extended the approach to non-linear boundaries.
Neural Networks (Deep Learning):
Statistical Roots: Neural networks build on regression; each neuron computes a weighted sum of its inputs (a linear combination, just as in linear regression) and passes it through a non-linear activation function.
Formula Evolution: Layers of the form a = f(Wx + b) are stacked, and the weights are fitted by gradient descent on a loss function, allowing the network to learn hierarchical, non-linear representations.
K-Nearest Neighbours (KNN):
Statistical Roots: KNN is a non-parametric, distance-based method; it predicts the class (or value) of a point from the majority class (or average) of its closest training samples.
Formula Evolution: Distances are typically measured with the Euclidean metric d(x, x') = sqrt((x_1 - x'_1)^2 + ... + (x_n - x'_n)^2), and the prediction is taken over the k nearest neighbours.
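For illustration, a bare-bones KNN classifier along these lines could be sketched as follows (the knn_predict helper and the toy data are invented for this example):

```python
# A minimal sketch of k-nearest neighbours: classify a point by the majority
# class among its k closest training samples (Euclidean distance); toy data only.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Return the majority label among the k nearest training points."""
    distances = np.linalg.norm(X_train - x_query, axis=1)   # Euclidean distances
    nearest = np.argsort(distances)[:k]                     # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> 0
```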
Understanding the statistical symphony of machine learning reveals that statistical concepts can be intricate and, at times, challenging to grasp, leading to difficulties in comprehending the implications and applications of statistical analyses.
Clear and intuitive explanations are therefore crucial for ensuring the effective use of statistical methods.
Moreover, statistical data interpretation may introduce uncertainties and biases that need to be addressed to ensure accurate and reliable analysis.
The Rise of Deep Learning and Its Black Box Nature.
One significant challenge associated with traditional statistics lies in their potential struggle to accommodate non-linear relationships and interactions among variables. To emphasise this point, consider linear regression, which assumes a linear relationship between independent and dependent variables. A linear relationship implies that a change in one variable is consistently associated with a constant change in another.
However, numerous real-world phenomena exhibit non-linear patterns where the relationship between variables is not a straight line.
For instance, imagine a dataset representing the link between years of experience and salary. A linear model might presume a constant increase in salary for each additional year of experience. However, the actual relationship may not strictly adhere to linearity. It could be that initially, as experience grows, the salary increases more rapidly, but then the rate of increase slows down—a non-linear pattern challenging to capture effectively using traditional statistical models.
Another example involves interactions among variables. In a study exploring the impact of both study hours and sleep on exam performance, a traditional statistical model might assume the independence of the effects of study hours and sleep. However, in reality, there might be an interaction effect where the combined influence of study hours and sleep differs from the sum of their individual effects. For instance, students who study extensively but get very little sleep might perform worse than expected based on the individual effects of study hours and sleep.
To tackle these challenges, advanced statistical techniques and machine learning models come into play.
For non-linear relationships, methods like polynomial regression or non-parametric approaches (e.g., decision trees, random forests, or support vector machines) are more suitable and capable of capturing complex patterns beyond simple linearity.
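As a hedged illustration of the first case, the sketch below invents a diminishing-returns salary curve and compares a straight-line fit with a degree-2 polynomial fit using scikit-learn:

```python
# A minimal sketch of polynomial regression capturing a non-linear
# experience-to-salary pattern; the data-generating curve below is invented
# purely to illustrate the diminishing-returns shape described above.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
years = rng.uniform(0, 30, size=(200, 1))
# Salary grows quickly at first, then the rate of increase slows (square-root shape).
salary = 30_000 + 12_000 * np.sqrt(years[:, 0]) + rng.normal(0, 2_000, size=200)

linear = LinearRegression().fit(years, salary)
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(years, salary)

print(linear.score(years, salary))   # R^2 of the straight-line fit
print(poly.score(years, salary))     # R^2 of the curved fit, typically higher here
```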
For interactions among variables, models with interaction terms or more complex algorithms, such as neural networks, can be applied. These models possess the flexibility to learn from and capture intricate relationships and interactions in the data.
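A small sketch of the interaction idea, with invented study/sleep/score numbers, shows how adding a study-times-sleep product term improves the fit of an otherwise linear model:

```python
# A minimal sketch of an interaction term: exam score depends not just on study
# hours and sleep separately but on their product; all numbers are invented.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
study = rng.uniform(0, 10, size=300)
sleep = rng.uniform(3, 9, size=300)
# Heavy study only pays off with enough sleep: the study*sleep term carries the effect.
score = 20 + 2 * study + 3 * sleep + 1.5 * study * sleep + rng.normal(0, 3, size=300)

X_main = np.column_stack([study, sleep])                  # main effects only
X_inter = np.column_stack([study, sleep, study * sleep])  # plus the interaction

print(LinearRegression().fit(X_main, score).score(X_main, score))
print(LinearRegression().fit(X_inter, score).score(X_inter, score))  # noticeably better fit
```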
Examining the limitations of traditional statistical models is crucial for recognising the need for more advanced AI models. While powerful, models like support vector machines and decision trees lacked the depth and complexity required for certain tasks and data types, leading to the emergence of deep learning models.
Deep learning, a subset of machine learning inspired by the structure and function of the human brain, marked a paradigm shift in AI. Deep neural networks, with multiple layers, demonstrated superior performance in tasks such as image recognition, natural language processing, and speech recognition. These models could automatically learn hierarchical representations of data, capturing intricate features and nuances that are challenging for traditional methods.
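To illustrate the layered structure (not any particular published architecture), here is a minimal NumPy forward pass through a small network with two hidden layers and random, untrained weights:

```python
# A minimal sketch of a forward pass through a small two-hidden-layer network,
# written in plain NumPy; weights are random here, just to show the layered structure.
import numpy as np

rng = np.random.default_rng(3)

def relu(x):
    return np.maximum(0.0, x)

x = rng.normal(size=(4,))                        # an input vector with 4 features

W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)    # layer 1: 4 -> 8
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)    # layer 2: 8 -> 8
W3, b3 = rng.normal(size=(1, 8)), np.zeros(1)    # output layer: 8 -> 1

h1 = relu(W1 @ x + b1)                    # each layer is a linear map plus a non-linearity
h2 = relu(W2 @ h1 + b2)                   # deeper layers build on earlier representations
out = 1 / (1 + np.exp(-(W3 @ h2 + b3)))   # sigmoid output, e.g. a class probability

print(out)
```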
However, despite their unparalleled accuracy, the complex architectures of deep learning models pose a challenge to understanding their decision-making processes.
This lack of transparency became a significant concern, particularly in applications where trust, accountability, and ethical considerations were crucial. Bridging the gap between accuracy and interpretability gave rise to the field of AI explainability.
Explainability in AI.
Explainability in AI refers to the ability to understand and interpret the decisions made by machine learning models. It has become a critical aspect, particularly in sensitive domains like healthcare, finance, and criminal justice. Various techniques have been developed to make AI models more interpretable, ranging from feature importance analysis and model-agnostic methods to the incorporation of explainable AI models themselves.
Techniques for Improving AI Explainability.
Researchers and practitioners have made significant strides in developing methods to interpret deep learning models. Techniques such as layer-wise relevance propagation, model-agnostic methods (which yield generalised insights independent of a model's internals), feature importance analysis (highlighting influential variables), and attention mechanisms provide insights into which parts of the input data contribute most to the model's decision.
Additionally, efforts are underway to design inherently interpretable neural network architectures, striking a balance between complexity and transparency.
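One concrete example of a model-agnostic technique is permutation feature importance; the sketch below, using scikit-learn on a synthetic classification task, shuffles each feature in turn and measures how much the model's score drops:

```python
# A minimal sketch of one model-agnostic explainability technique, permutation
# feature importance, using scikit-learn on a synthetic classification task.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time and measure how much the test score drops:
# a large drop means the model relied heavily on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i, importance in enumerate(result.importances_mean):
    print(f"feature {i}: {importance:.3f}")
```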
Conclusion and Future Directions
The evolution of AI explainability from traditional statistics to deep learning reflects the ongoing effort to demystify complex models and make AI systems more accountable and trustworthy. As AI continues to advance, the integration of explainability techniques will play a crucial role in ensuring that these powerful tools are used responsibly and ethically. Balancing the trade-off between accuracy and interpretability remains a key challenge, but ongoing research and advancements in the field promise a future where AI can be both powerful and understandable.
I'd love to hear your thoughts! Let's start a conversation in the comments section. Your feedback is valuable to me.