Machine Learning - Some basic definitions
Machine learning is a branch of computer science that studies the design and use of algorithms and models that learn patterns from data and then make predictions, without human intervention, when similar patterns appear in new data.
Methods
Machine learning algorithms are categorized into methods. These are the most important ones:
- Supervised learning algorithms are trained using past data with labeled examples to predict labels in future data. For example, we could have data points for flowers, each one labeled with the species it belongs to, along with other features such as petal size. The algorithm reads the input set and learns from it, updating the model. Then, the algorithm uses the information in the model to predict the species a new flower belongs to.
- Unsupervised learning is used on data that has no labels. The algorithms explore the data and find structure in it. For example, we could have data points for flowers. The algorithm would read the input set, learn from it and suggest a set of species for classifying flowers into groups.
- Reinforcement learning algorithms interact with their environment by producing actions and learning from the results. For example, drones may learn to fly by trial and error.
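The supervised case above can be sketched in a few lines. This is a minimal illustration only: the petal measurements and species names are made up, and the "algorithm" is the simplest possible one, 1-nearest neighbour.

```python
import math

# Toy labeled flower data: (petal length, petal width) -> species.
# Measurements and labels are made up for illustration.
train = [
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((4.7, 1.4), "versicolor"),
    ((4.5, 1.5), "versicolor"),
]

def predict(point):
    """Supervised learning in miniature: predict the label of a new
    point from the labeled examples (here, 1-nearest neighbour)."""
    nearest = min(train, key=lambda ex: math.dist(ex[0], point))
    return nearest[1]

print(predict((1.5, 0.3)))  # a new flower with small petals -> "setosa"
```

A new flower with small petals lands close to the labeled setosa examples, so the model predicts "setosa".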
Problems
The type of label defines the problem, as follows:
- Categorical and discrete variables can take one of a limited number of possible values. Predicting categorical labels is called classification. For example, the species a flower belongs to.
- Continuous or real variables can take any real value. Predicting quantitative labels is called regression. For example, the petal width of a flower, taking into account its species and petal length.
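The regression example above can be made concrete with a simple least-squares fit. The petal measurements below are made up, and the model is the most basic one possible: a straight line predicting petal width from petal length.

```python
# Minimal regression sketch: fit petal width from petal length by
# ordinary least squares. The measurements are made up.
lengths = [1.4, 1.3, 4.7, 4.5, 5.1]
widths = [0.2, 0.2, 1.4, 1.5, 1.8]

n = len(lengths)
mean_x = sum(lengths) / n
mean_y = sum(widths) / n
# slope = covariance(x, y) / variance(x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, widths)) \
        / sum((x - mean_x) ** 2 for x in lengths)
intercept = mean_y - slope * mean_x

def predict_width(length):
    """Predict a continuous label (petal width) from a feature."""
    return slope * length + intercept

print(round(predict_width(4.0), 2))
```

Contrast this with classification: here the prediction is a number on a continuous scale, not a choice among categories.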
Features
In machine learning, statistical variables are called features.
Features can be of any of the following types:
- Numeric: describe a quantity as a number. Subtypes are continuous and discrete.
- Continuous: observations can take any numeric value in a range of real values. Examples include length, width and time.
- Discrete: observations can take any numeric value in a set of numeric values. A discrete variable cannot take the value of a fraction between one value and the next closest value in the set. Examples include number of flowers and number of passengers.
- Categorical: Describe a quality or characteristic. Subtypes are ordinal and nominal.
- Ordinal: observations can be ordered. Examples include t-shirt size (e.g. XL, L, M, S, XS) and satisfaction grade (e.g. high, medium, low).
- Nominal: observations cannot be ordered. Examples include species, sex, brand, etc.
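In practice, these feature types determine how data is prepared for an algorithm: numeric features can be used as-is, while categorical features are usually encoded as numbers first. The sketch below shows one common convention (the sizes and species values are made up): ordinal categories keep their order, and nominal categories get one-hot codes.

```python
# Ordinal feature: encode with integers that preserve the order.
sizes = ["S", "XL", "M"]
size_order = {"XS": 0, "S": 1, "M": 2, "L": 3, "XL": 4}
encoded_sizes = [size_order[s] for s in sizes]

# Nominal feature: no order exists, so use one-hot encoding instead
# (one binary column per category).
species = ["setosa", "versicolor", "setosa"]
levels = sorted(set(species))
one_hot = [[int(s == level) for level in levels] for s in species]

print(encoded_sizes)  # [1, 4, 2]
print(one_hot)        # [[1, 0], [0, 1], [1, 0]]
```

Encoding a nominal feature with plain integers would invent an order that is not in the data, which is why one-hot encoding is the usual choice there.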
A formal and detailed classification of statistical variables according to the nature of the information they represent is beyond the scope of this article. However, here is a valuable resource if you are interested: Statistical data types.
Algorithms
The most frequently used algorithms are listed below.
Supervised learning algorithms
- Linear algorithms for regression and classification (e.g. linear regression, Fisher's linear discriminant analysis (LDA), logistic regression, naive Bayes, Winnow, perceptron)
- Non-linear algorithms for regression and classification
- Linear and non-linear support vector machine (SVM)
- Learning vector quantization (LVQ)
- Classification and regression trees (CART)
- K-nearest neighbours (KNN)
- Neural networks
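One of the linear classifiers listed above, the perceptron, is small enough to sketch in full. The training set below is made up and linearly separable; the labels stand for "small petals" (-1) versus "large petals" (+1).

```python
# Minimal perceptron sketch on made-up, linearly separable data:
# (petal length, petal width) -> -1 or +1.
data = [((1.4, 0.2), -1), ((1.3, 0.2), -1),
        ((4.7, 1.4), +1), ((4.5, 1.5), +1)]

w = [0.0, 0.0]  # weights of the separating line
b = 0.0         # bias term

for _ in range(20):  # a few passes over the training set
    for (x1, x2), label in data:
        # misclassified if the sign of the score disagrees with the label
        if label * (w[0] * x1 + w[1] * x2 + b) <= 0:
            w[0] += label * x1  # nudge the line towards the mistake
            w[1] += label * x2
            b += label

def classify(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1

print(classify(1.5, 0.3), classify(4.6, 1.4))  # -1 1
```

On separable data like this the update rule is guaranteed to converge; real datasets are rarely so tidy, which is where the other algorithms in the list come in.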
Unsupervised learning algorithms
- Apriori
- K-means clustering
- Principal Component Analysis (PCA)
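K-means clustering, listed above, is also compact enough to sketch. The petal lengths below are made up; the point is that the algorithm recovers the two groups without ever seeing a label.

```python
# Minimal k-means sketch (k = 2) on made-up 1-D petal lengths.
points = [1.3, 1.4, 1.5, 4.5, 4.7, 5.1]
centroids = [points[0], points[-1]]  # naive initialisation

for _ in range(10):
    # assignment step: each point joins its nearest centroid
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # update step: each centroid moves to the mean of its cluster
    centroids = [sum(c) / len(c) for c in clusters]

print(sorted(round(c, 2) for c in centroids))  # [1.4, 4.77]
```

The two centroids settle on the small-petal and large-petal groups, which is exactly the "suggest a set of species" behaviour described in the unsupervised learning example earlier.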
Ensemble learning techniques
- Random forest
- AdaBoost
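The idea behind both ensemble techniques is combining many weak models into a stronger one. Here is that idea in miniature: three made-up decision "stumps" (single-threshold classifiers) vote on a label by majority, much as the trees in a random forest do.

```python
from collections import Counter

# Three made-up threshold classifiers ("stumps") over petal measurements.
stumps = [
    lambda length, width: "setosa" if length < 2.5 else "versicolor",
    lambda length, width: "setosa" if width < 0.8 else "versicolor",
    lambda length, width: "setosa" if length + width < 3.0 else "versicolor",
]

def ensemble_predict(length, width):
    """Combine the weak classifiers by majority vote."""
    votes = Counter(stump(length, width) for stump in stumps)
    return votes.most_common(1)[0][0]

print(ensemble_predict(1.4, 0.2))  # setosa
```

Random forest builds its trees on random subsets of the data and features; AdaBoost instead trains its weak models in sequence, weighting the examples the previous ones got wrong.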
A more comprehensive list of the algorithms available in the caret package for R can be found here.
Well, enough theory for today. In the following articles we will play with these algorithms using R.