The No Free Lunch Theorem
Introduction
Learning theory claims that a machine learning algorithm can generalize well from a finite training set of examples. This seems to contradict some basic principles of logic: generalization, or inferring general rules from a limited set of examples, is not logically valid. To logically infer a rule describing every member of a set, one must have information about every member of that set. In part, machine learning avoids this problem by offering only probabilistic rules, rather than the entirely certain rules used in purely logical reasoning. Machine learning promises to find rules that are probably correct about most members of the set they concern.
Description
Unfortunately, even this does not resolve the whole problem. The no free lunch theorem for machine learning (Wolpert, 1996) states that, averaged over all possible data generating distributions, every classification algorithm has the same error rate when classifying previously unobserved points. In other words, in some sense, no machine learning algorithm is universally better than any other. The most sophisticated algorithm we can imagine has the same average performance (over all possible tasks) as merely predicting that every point belongs to the same class.
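To make this averaging argument concrete, here is a minimal Python sketch (the helper names average_accuracy, always_zero, and alternating are invented for illustration): on a handful of previously unobserved points, we average a fixed classifier's accuracy over every possible ground-truth labeling. Any prediction rule, however simple or sophisticated, scores exactly 0.5.

```python
# A minimal sketch of the "averaged over all tasks" argument: on n unseen
# points, average a fixed classifier's accuracy over all 2**n possible
# ground-truth labelings. Every classifier scores exactly 0.5 on average.
import itertools

n = 4  # number of previously unobserved test points

def average_accuracy(predict):
    """Average accuracy of a fixed prediction rule over all labelings."""
    total = 0.0
    for truth in itertools.product([0, 1], repeat=n):
        preds = predict(n)
        total += sum(p == t for p, t in zip(preds, truth)) / n
    return total / 2 ** n

# Two very different "algorithms": always predict class 0, or alternate.
def always_zero(n):
    return [0] * n

def alternating(n):
    return [i % 2 for i in range(n)]

print(average_accuracy(always_zero))  # 0.5
print(average_accuracy(alternating))  # 0.5 -- identical on average
```

The symmetry is the whole argument: for each test point, exactly half of all possible labelings agree with whatever the classifier predicts, so every algorithm averages out the same.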
Fortunately, these results hold only when we average over all possible data generating distributions. If we make assumptions about the kinds of probability distributions we encounter in real-world applications, then we can design learning algorithms that perform well on these distributions. This means that the goal of machine learning research is not to seek a universal learning algorithm or the absolute best learning algorithm. Instead, our goal is to understand what kinds of distributions are relevant to the "real world" that an AI agent experiences, and what kinds of machine learning algorithms perform well on data drawn from the kinds of data generating distributions we care about.
Regularization
The no free lunch theorem implies that we must design our machine learning algorithms to perform well on a specific task. We do so by building a set of preferences into the learning algorithm. When these preferences are aligned with the learning problems we ask the algorithm to solve, it performs better. So far, the only method of modifying a learning algorithm we have discussed is to increase or decrease the model's capacity by adding or removing functions from the hypothesis space of solutions the learning algorithm is able to choose from. We gave the specific example of increasing or decreasing the degree of a polynomial for a regression problem. The view we have described so far is oversimplified. The behavior of our algorithm is strongly affected not just by how large we make the set of functions allowed in its hypothesis space, but by the precise identity of those functions. The learning algorithm we have studied so far, linear regression, has a hypothesis space consisting of the set of linear functions of its input.
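As a rough sketch of this capacity knob, the following NumPy example (the data are synthetic and chosen only for illustration) fits the same ten noisy points with polynomials of degree 1, 3, and 9; enlarging the hypothesis space drives the training error toward zero.

```python
# A hypothetical illustration of changing capacity by changing the
# hypothesis space: fit the same data with polynomials of increasing
# degree and watch the training error fall.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(10)  # noisy target

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)              # fit a degree-d polynomial
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {train_mse:.4f}")
```

A lower training error here does not mean better generalization: the degree-9 fit interpolates the noise, which is exactly the kind of behavior preferences such as regularization are meant to control.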
These linear functions can be very useful for problems where the relationship between inputs and outputs truly is close to linear. They are less useful for problems that behave in a very nonlinear fashion. For instance, linear regression would not perform very well if we tried to use it to predict sin(x) from x. We can thus control the performance of our algorithms by choosing what kind of functions we allow them to draw solutions from, as well as by controlling the number of those functions. We can also give a learning algorithm a preference for one solution in its hypothesis space over another. This means that both functions are eligible, but one is preferred. The unpreferred solution is chosen only if it fits the training data significantly better than the preferred solution. For instance, we can modify the training criterion for linear regression to include weight decay. To perform linear regression with weight decay, we minimize a sum comprising both the mean squared error on the training set and a criterion J(w) that expresses a preference for the weights to have a smaller squared L2 norm. Specifically,
J(w) = MSE_train + λwᵀw,
where λ is a value chosen ahead of time that controls the strength of our preference for smaller weights. When λ = 0, we impose no preference, and larger λ forces the weights to become smaller. Minimizing J(w) results in a choice of weights that make a tradeoff between fitting the training data and being small. This gives us solutions that have a smaller slope, or that put weight on fewer of the features. In our weight decay example, we expressed our preference for linear functions defined with smaller weights explicitly, via an extra term in the criterion we minimize. There are many other ways of expressing preferences for different solutions, both implicitly and explicitly. Together, these different approaches are known as regularization.
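As a concrete sketch, linear regression with weight decay is ridge regression, and minimizing J(w) above has the well-known closed-form solution w = (XᵀX + λI)⁻¹Xᵀy. The NumPy example below (the data, true_w, and noise level are invented for illustration) shows the squared norm of the learned weights shrinking as λ grows.

```python
# A sketch of linear regression with weight decay (ridge regression):
# minimizing MSE + lambda * w^T w has the closed-form solution
# w = (X^T X + lambda * I)^{-1} X^T y.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))               # 20 examples, 5 features
true_w = np.array([2.0, -1.0, 0.0, 0.0, 0.5])  # invented ground truth
y = X @ true_w + 0.1 * rng.standard_normal(20)

for lam in (0.0, 1.0, 100.0):
    # Larger lambda expresses a stronger preference for small weights.
    w = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
    print(f"lambda = {lam:6.1f}  squared norm of w = {w @ w:.4f}")
```

For λ = 0 the solution reduces to ordinary least squares; very large λ drives the weights toward zero regardless of the data, trading training fit for smallness exactly as the criterion prescribes.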
Regularization is one of the central concerns of the field of machine learning, rivaled in its importance only by optimization. The no free lunch theorem has made it clear that there is no best machine learning algorithm and, in particular, no best form of regularization. Instead, we must choose a form of regularization that is well suited to the particular task we want to solve. The philosophy of deep learning in general, and of this book in particular, is that a very wide range of tasks (such as all the intellectual tasks that people can do) may all be solved effectively using very general-purpose forms of regularization.