ML Algorithm Equations Made Simple

Vishwajit Sen

Data Science / AI / ML / DL Senior Manager

Published: June 17, 2023

Linear Regression:

Equation: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \varepsilon$

Explanation: Linear regression models the relationship between a dependent variable ($y$) and independent variables ($x_1, x_2, \dots, x_n$) by fitting a linear equation. The coefficient $\beta_0$ is the intercept, $\beta_1, \beta_2, \dots, \beta_n$ are the slopes associated with each independent variable, and $\varepsilon$ denotes the error term.

Python Code:

from sklearn.linear_model import LinearRegression

# Create a Linear Regression model

model = LinearRegression()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
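
To make the equation concrete, here is a minimal sketch of the single-feature case, assuming ordinary least squares as the fitting criterion. The helper name fit_simple_ols and the toy data are illustrative assumptions and are not part of the original article.

```python
def fit_simple_ols(xs, ys):
    """Return (beta_0, beta_1) minimizing squared error for y = beta_0 + beta_1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta_1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    beta_0 = y_mean - beta_1 * x_mean
    return beta_0, beta_1


# Hypothetical toy data generated from y = 1 + 2x with no noise
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

beta_0, beta_1 = fit_simple_ols(xs, ys)
# Prediction for a new input is just the equation evaluated at x = 4
y_new = beta_0 + beta_1 * 4.0
```

With a fitted scikit-learn model, the corresponding values are exposed as model.intercept_ and model.coef_.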

Logistic Regression:

Equation: $p(y=1 \mid x) = \frac{1}{1 + e^{-z}}$, where $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$

Explanation: Logistic regression models the probability that the dependent variable ($y$) belongs to class 1 given the independent variables ($x_1, x_2, \dots, x_n$). The logistic (sigmoid) function maps the linear combination $z$ of coefficients and input variables to a value between 0 and 1, which is interpreted as that probability.

Python Code:

from sklearn.linear_model import LogisticRegression

# Create a Logistic Regression model

model = LogisticRegression()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
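
The logistic equation can be evaluated by hand in a few lines. This is a minimal sketch with made-up coefficients; the function names and the 0.5 decision threshold are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba_one(x, beta_0, betas):
    """p(y=1 | x) for one sample, given an intercept and one slope per feature."""
    z = beta_0 + sum(b * xi for b, xi in zip(betas, x))
    return sigmoid(z)


# Hypothetical coefficients, not learned from any data
beta_0, betas = -1.0, [2.0, 0.5]

# Here z = -1 + 2*0.5 + 0.5*0 = 0, so the probability sits exactly at the midpoint
p = predict_proba_one([0.5, 0.0], beta_0, betas)
label = 1 if p >= 0.5 else 0
```

In scikit-learn, model.predict returns class labels like label above, while model.predict_proba returns the underlying probabilities.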

Decision Trees:

Decision trees involve a tree-like model of decisions and their possible consequences. Each internal node represents a feature or attribute, each branch represents a decision rule, and each leaf node represents the outcome.

Python Code:

from sklearn.tree import DecisionTreeClassifier

# Create a Decision Tree model

model = DecisionTreeClassifier()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
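
The node, branch, and leaf structure described above can be sketched as a nested dictionary that is walked from the root to a leaf. The tree below is hand-built and hypothetical (features, thresholds, and class names are invented); only the "less than or equal goes left" convention mirrors what scikit-learn does.

```python
# Hypothetical hand-built tree: internal nodes test one feature against a threshold
tree = {
    "feature": 0, "threshold": 2.5,
    "left": {"leaf": "A"},
    "right": {
        "feature": 1, "threshold": 1.0,
        "left": {"leaf": "B"},
        "right": {"leaf": "C"},
    },
}


def predict_one(node, x):
    """Follow decision rules from the root until a leaf is reached."""
    while "leaf" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


pred_a = predict_one(tree, [1.0, 5.0])  # feature 0 <= 2.5, so the left leaf
pred_b = predict_one(tree, [3.0, 0.5])  # right branch, then feature 1 <= 1.0
pred_c = predict_one(tree, [3.0, 2.0])  # right branch, then feature 1 > 1.0
```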

Random Forests:

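Random forests are an ensemble learning method that combines multiple decision trees. Each tree is trained on a bootstrap sample (a random sample drawn with replacement) of the training data, typically considering only a random subset of features at each split. The final prediction aggregates the predictions of the individual trees: a vote for classification, an average for regression.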

Python Code:

from sklearn.ensemble import RandomForestClassifier

# Create a Random Forest model

model = RandomForestClassifier()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
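
The two ingredients described above, random resampling and aggregation, can be sketched in isolation. This is a simplified illustration: the helper names and the per-tree predictions are made up, and hard majority voting is used for clarity, whereas scikit-learn's RandomForestClassifier averages the trees' class probabilities.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement, as done once per tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Aggregate per-tree class predictions into one forest prediction."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)          # seeded only to make the sketch repeatable
rows = list(range(10))          # stand-in for training row indices
sample = bootstrap_sample(rows, rng)

# Hypothetical predictions from five individual trees for one input
tree_predictions = ["spam", "ham", "spam", "spam", "ham"]
forest_prediction = majority_vote(tree_predictions)
```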

Support Vector Machines (SVM):

Equation: $w^{T} x + b = 0$

Explanation: Support Vector Machines aim to find the hyperplane that separates the classes in the feature space with the maximum margin. The equation represents the decision boundary, where w is the weight vector, x is the feature vector, and b is the bias term.

Python Code:

from sklearn.svm import SVC

# Create a Support Vector Machine model (SVC defaults to an RBF kernel; pass kernel='linear' for a linear decision boundary)

model = SVC()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
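
For the linear case, classification reduces to checking which side of the hyperplane a point falls on. The sketch below uses a hypothetical weight vector and bias rather than learned ones, and the margin width of 2 / ||w|| assumes the usual scaling in which the support vectors satisfy |w.x + b| = 1.

```python
import math


def decision_value(w, x, b):
    """Signed score w.x + b; zero means x lies exactly on the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Assign +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, x, b) >= 0 else -1


# Hypothetical hyperplane parameters, not learned from data
w, b = [3.0, 4.0], -5.0

side_pos = classify(w, [1.0, 1.0], b)             # 3 + 4 - 5 = 2, positive side
side_neg = classify(w, [0.0, 0.0], b)             # -5, negative side
on_boundary = decision_value(w, [3.0, -1.0], b)   # 9 - 4 - 5 = 0

# Margin width under the canonical scaling: 2 / ||w||
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))
```

A fitted SVC exposes the same signed score through model.decision_function.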

K-Nearest Neighbors (KNN):

Equation: $\hat{y} = \operatorname{mode}(y_1, y_2, \dots, y_k)$

Explanation: KNN classifies a new data point by the majority class among its $k$ nearest neighbors. The equation expresses this majority vote: $y_1, y_2, \dots, y_k$ are the class labels of the $k$ nearest training points, and $\hat{y}$ is the predicted class of the new data point.

Python Code:

from sklearn.neighbors import KNeighborsClassifier

# Create a K-Nearest Neighbors model

model = KNeighborsClassifier()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
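
The majority vote can be written out directly. This is a minimal sketch assuming Euclidean distance and k = 3; the function name and the toy points are illustrative. Note that KNeighborsClassifier uses k = 5 by default (its n_neighbors parameter).

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points closest to query."""
    # Rank training indices by Euclidean distance to the query point
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-cluster toy data
train_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
train_labels = ["red", "red", "red", "blue", "blue"]

near_red = knn_predict(train_points, train_labels, (0.5, 0.5), k=3)
near_blue = knn_predict(train_points, train_labels, (5.5, 5.0), k=3)
```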

Naive Bayes:

Naive Bayes classifiers are based on Bayes' theorem and assume independence among features. There are different types of Naive Bayes classifiers, such as Gaussian Naive Bayes, Multinomial Naive Bayes, and Bernoulli Naive Bayes, each suited for different data types.

Python Code (Gaussian Naive Bayes):

from sklearn.naive_bayes import GaussianNB

# Create a Gaussian Naive Bayes model

model = GaussianNB()

# Fit the model to the training data

model.fit(X_train, y_train)

# Predict using the trained model

y_pred = model.predict(X_test)
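
The independence assumption means the class score is the prior multiplied by one likelihood per feature. Here is a minimal sketch of the Gaussian variant; the priors, means, and variances are invented for illustration, whereas a real model estimates them from the training data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior(x, priors, params):
    """Normalized class probabilities under the naive independence assumption."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Multiply in one per-feature likelihood, treating features as independent
        for xi, (mean, var) in zip(x, params[cls]):
            score *= gaussian_pdf(xi, mean, var)
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}


# Hypothetical parameters: (mean, variance) per feature for each class
priors = {"a": 0.5, "b": 0.5}
params = {
    "a": [(0.0, 1.0), (0.0, 1.0)],
    "b": [(3.0, 1.0), (3.0, 1.0)],
}

probs = posterior([0.2, -0.1], priors, params)
predicted = max(probs, key=probs.get)
```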

Neural Networks:

Neural networks consist of interconnected nodes (neurons) organized in layers, where each neuron applies a nonlinear activation function to its inputs. The weights and biases of the neurons are learned during training.

Python Code (using Keras):

from tensorflow import keras

from tensorflow.keras import layers

# Create a Neural Network model (input_dim and num_classes must be defined beforehand)

model = keras.Sequential()

model.add(layers.Dense(64, activation='relu', input_shape=(input_dim,)))

model.add(layers.Dense(64, activation='relu'))

model.add(layers.Dense(num_classes, activation='softmax'))

# Compile the model (categorical_crossentropy expects one-hot encoded labels)

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model to the training data

model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict class probabilities using the trained model (one row per sample)

y_pred = model.predict(X_test)
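
What each layer computes can be shown with a small forward pass in plain Python. This sketch uses hypothetical, hand-picked weights (nothing is trained) and tiny layers, but it follows the same pattern as the Keras model: a dense layer with ReLU, then a dense layer with softmax.

```python
import math


def relu(values):
    """Nonlinear activation: negative inputs are clipped to zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, biases):
    """One fully connected layer; weights[j] holds the input weights of neuron j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


# Hypothetical weights and input
x = [1.0, 2.0]
hidden = relu(dense(x, [[1.0, 0.5], [-1.0, 0.0]], [0.0, 0.0]))
probs = softmax(dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))

# The predicted class is the index with the highest probability
predicted_class = probs.index(max(probs))
```

The same argmax step converts the probabilities returned by model.predict into class labels.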
