Artificial Intelligence = Statistics without Bounds
Early statisticians were limited to the calculating tools of the day – pen and paper and, if they were lucky, simple calculating machines. They developed appropriately simple models to work their magic. Modern computers unleashed the potential for far more powerful extraction of information from datasets. New computer models evolved – models of previously inconceivable computational intensity – to discover truths heretofore hidden in the data. We call these models artificial intelligence. Artificial intelligence recognizes a triad of learning paradigms, each with its own distinct approach to unraveling the mysteries of data.
First among these is unsupervised learning, a process akin to an alchemist’s endeavor. Here, a computer is presented with a vast and uncharted expanse of data, a digital wilderness, and is tasked with finding patterns and regularities, like a cartographer mapping unknown territories. This is the AI equivalent of exploratory data analysis in statistics, where the goal is ‘feature extraction’ – to uncover the real-world phenomena that the dataset encapsulates. Raw data is like an uncut gemstone, holding potential but not immediately useful. The aim is to distill this raw material into information, transforming measurements and knowledge about real-world entities into a form that can be used to answer questions. This initial step of feature extraction is the first stroke of the artist’s brush, setting the stage for the masterpiece of understanding that is to follow.
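To make this concrete, here is a minimal sketch of unsupervised pattern-finding, assuming NumPy and scikit-learn are available: a clustering algorithm is handed unlabeled points, and the cluster assignments it discovers serve as an extracted feature. The data, cluster count, and all names are purely illustrative.

```python
# Unsupervised learning sketch: cluster unlabeled points and treat the
# cluster assignment as an extracted "feature". Synthetic data; the three
# hidden groups stand in for real-world phenomena buried in a dataset.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
raw_data = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(100, 2))
    for center in ([0, 0], [5, 5], [0, 5])
])  # 300 unlabeled observations; no ground truth is supplied to the model

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(raw_data)
cluster_feature = kmeans.labels_          # one discovered feature per observation
print(cluster_feature[:10])
print(kmeans.cluster_centers_)            # the map of the "unknown territory"
```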
The second paradigm is supervised learning, a process that builds upon the features unearthed by its predecessors. In this realm, the dataset is not only a collection of features but also a repository of labels, often painstakingly provided by legions of humans, linking data features to tangible realities. This ‘training dataset’ serves as a guide, instructing the AI on the truths of the world. The AI is then tasked with making predictions about new, unseen data, much like a detective piecing together clues to solve a mystery. For instance, in the case of a car theft, an AI might analyze a lineup of suspects, each labeled with their past behaviors, to predict the perpetrator. Though not infallible, with a wealth of data, these predictions can achieve remarkable accuracy. Yet, the vast datasets required for such feats are a rarity, making the prowess of large AI language models like ChatGPT an exceptional case in the AI landscape.
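A hedged sketch of the supervised recipe – a labeled training set teaches the model, which then predicts labels for data it has never seen – using scikit-learn’s bundled iris dataset as a stand-in for any human-labeled corpus; the model choice and train/test split are illustrative.

```python
# Supervised learning sketch: learn from labeled examples, then predict
# labels for held-out data the model has never seen.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                  # features plus human-provided labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)   # learn from the training set
print("held-out accuracy:", model.score(X_test, y_test))          # predictions on unseen data
```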
Finally, there is reinforcement learning, a journey of trial and error within an environment of rewards and punishments. Here, the AI is a learner, an explorer, seeking the optimal path to minimize penalties and maximize gains. A notable form of this is self-play, where two copies of an AI engage in a duel, each striving to outwit the other in solving a problem or winning a game. In their zenith, AIs like AlphaGo transcend human abilities, conquering challenges like the ancient game of Go, once deemed a fortress impregnable to all but human minds. Similarly, AlphaFold, a kindred system in this lineage, achieved a feat once relegated to the distant future – the prediction of the structures of nearly all proteins in the human body, unveiling the building blocks of life itself.
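A toy sketch of the trial-and-error idea, assuming only NumPy: tabular Q-learning on a five-state chain where a reward waits only at the far end. The states, rewards, and hyperparameters are invented for illustration – a far cry from AlphaGo, but the learning loop is the same in spirit.

```python
# Reinforcement learning sketch: tabular Q-learning on a 5-state chain.
# The agent starts at state 0; a reward of 1 is earned only upon reaching
# the terminal state, so it must learn by trial and error that action 1
# (step right) beats action 0 (step left).
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))            # learned value of each state-action pair
alpha, gamma, epsilon = 0.1, 0.9, 0.1          # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != n_states - 1:               # the terminal state ends the episode
        q_values = Q[state]
        if rng.random() < epsilon or q_values[0] == q_values[1]:
            action = int(rng.integers(n_actions))   # explore (or break ties randomly)
        else:
            action = int(q_values.argmax())         # exploit current knowledge
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: nudge toward the reward plus discounted future value
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q.argmax(axis=1))   # learned policy: action 1 (right) in every non-terminal state
```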
Thus unfolds the saga of artificial intelligence, a story told through the narratives of unsupervised, supervised, and reinforcement learning, each contributing to a grand design of understanding and interacting with our world, powered by the raw material of data.
In the ever-evolving landscape of artificial intelligence, a symphony of data is being orchestrated into decisions and estimations. This intricate dance mirrors the steps of statistical decision models, with AI drawing deeply from the well of statistical language. The pioneering work of Pearson and Fisher, with their ingenious shortcuts for simplifying complex calculations, laid the groundwork. They harnessed the power of differential calculus optimization, their primitive mathematics bending linear solutions to their will under the guiding principle of least squares loss.
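For the flavor of that classical shortcut, here is a small sketch, assuming NumPy: ordinary least squares solved in closed form, where setting the gradient of the squared-error loss to zero yields the normal equations. The data are synthetic and illustrative.

```python
# Classical shortcut sketch: ordinary least squares in closed form.
# Setting the gradient of the squared-error loss to zero gives the
# normal equations (X^T X) beta = X^T y, solvable with pen-and-paper-era
# mathematics (here, one linear solve).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)     # a line plus noise

X = np.column_stack([np.ones_like(x), x])              # design matrix with an intercept column
beta = np.linalg.solve(X.T @ X, X.T @ y)               # the normal equations
print("intercept, slope:", beta)                       # close to the true (1.0, 2.0)
```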
Over the past decade, a new era dawned. The marriage of calculus with the might of computational power and the wizardry of backpropagation unleashed a realm of possibilities. Optimization procedures have evolved, replacing the simplicity of closed-form calculus minimization with the finesse of backpropagation and the astute navigation of gradient descent search. No longer are we constrained by the primitive squared error loss. Humans can feed an AI any loss function they find convenient and expect it to compute an answer.
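A minimal sketch of that modern freedom, assuming NumPy: pick a loss (mean absolute error here, not squared error) and let gradient descent minimize it. A crude numerical gradient stands in for backpropagation, which real frameworks compute exactly and automatically; all data and hyperparameters are illustrative.

```python
# Modern recipe sketch: choose any loss function and descend its gradient.
# Mean absolute error replaces squared error here; a numerical gradient
# stands in for backpropagation purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=50)

def loss(params):                        # any loss the analyst finds convenient
    intercept, slope = params
    return np.mean(np.abs(y - (intercept + slope * x)))

def numerical_grad(f, params, eps=1e-6):
    grad = np.zeros_like(params)
    for i in range(len(params)):
        step = np.zeros_like(params)
        step[i] = eps
        grad[i] = (f(params + step) - f(params - step)) / (2 * eps)
    return grad

params = np.zeros(2)
for _ in range(2000):                    # gradient descent search
    params -= 0.01 * numerical_grad(loss, params)
print("intercept, slope:", params)       # roughly recovers the underlying line
```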
No longer confined to the constraints of traditional risk minimization, AI now explores the vast expanse of statistical frameworks with a bold spirit of innovation. At the heart of this exploration lie the activation functions, the very sinews that carry the lifeblood of data through the neural network’s veins.
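A hedged sketch of activation functions at work, assuming NumPy: a two-layer forward pass in which the nonlinearities (ReLU in the hidden layer, a sigmoid at the output) are what let the network represent more than a straight line. The weights are random and purely illustrative.

```python
# Activation function sketch: a two-layer forward pass. Without the
# nonlinearities, stacking W2 @ (W1 @ x) would collapse to a single
# linear map; ReLU and sigmoid give the network its expressive power.
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=4)                        # one input vector with 4 features
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

hidden = relu(W1 @ x + b1)                    # the activation carries the signal forward
output = sigmoid(W2 @ hidden + b2)            # squashed to a probability-like score
print(hidden)
print(output)
```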
Yet in correcting the biases of statistical simplifications, we too often introduce new biases. This is especially true in large language models like ChatGPT. Here, subtle social biases are woven into the fabric of AI responses, demanding a departure from traditional closed-form loss functions towards the flexibility of algorithmic loss functions, engineered activation functions, and curated datasets.

Data curation has blossomed, embracing the richness of non-quantitative data such as text and images. Techniques like one-hot encoding, unimaginable at scale without the muscle of massive computing power, have become the norm (a minimal sketch appears below). Like many other adaptations in AI, the seeds of these methods were sown as far back as the 1940s by pioneers like I.J. Good, who explored multinomial-Dirichlet Bayesian families of probability distributions in applications akin to one-hot encoding.

In this symphony of innovation, the interpretation of AI’s output is a crucial thread, weaving together the insights gleaned from the data with the objectives that guide our journey. As we continue to unravel the mysteries of artificial intelligence, we stand on the shoulders of these statistical giants, pushing the boundaries of what is possible, one algorithm at a time.
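Before turning to interpretation, here is the promised sketch of the one-hot encoding mentioned above, assuming NumPy and a toy vocabulary: each non-quantitative category becomes its own 0/1 column.

```python
# One-hot encoding sketch: each category becomes its own 0/1 column,
# turning text-like labels into numbers a model can consume. The
# vocabulary here is a toy example.
import numpy as np

categories = ["cat", "dog", "bird", "dog", "cat"]
vocabulary = sorted(set(categories))                    # ['bird', 'cat', 'dog']
index = {word: i for i, word in enumerate(vocabulary)}

one_hot = np.zeros((len(categories), len(vocabulary)))
for row, word in enumerate(categories):
    one_hot[row, index[word]] = 1.0                     # exactly one 1 per row

print(vocabulary)
print(one_hot)
```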
The interpretation of artificial intelligence output can often seem like a riddle wrapped in a mystery. Yet, amid this complexity, data scientists stand as intrepid explorers, not content to leave the ‘black box’ of AI unexplored, challenged instead to interpret the mental machinations of AI in human ways, ensuring that the fruits of AI’s digital alchemy are not just accessible, but meaningful to the very humans who seek to harness them.
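One common way to peer into that black box is permutation importance: shuffle each input feature in turn and watch how much a fitted model’s accuracy drops; large drops mark features the model genuinely relies on. A hedged sketch on the iris data, with scikit-learn assumed; the technique, not the particular model, is the point.

```python
# Interpretation sketch: permutation importance. Shuffle one feature at a
# time and measure how much the fitted model's accuracy falls; the bigger
# the drop, the more the model relies on that feature.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)
baseline = model.score(X, y)

rng = np.random.default_rng(0)
for feature in range(X.shape[1]):
    X_shuffled = X.copy()
    perm = rng.permutation(len(X_shuffled))
    X_shuffled[:, feature] = X_shuffled[perm, feature]   # break this feature's link to the labels
    drop = baseline - model.score(X_shuffled, y)
    print(f"feature {feature}: accuracy drop {drop:.3f}")
```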