Must-Know Terminologies in Machine Learning

By Khushi Choudhary

We’ve already covered 5 days of our 100 days of machine learning journey, and it’s been a fantastic ride so far. From understanding the basics of machine learning to diving into supervised learning, we’ve made some solid progress together. But hey, it's Sunday today—so why not let our brains rest a bit?

Instead of diving straight into another complex concept, let's take a breather and learn some of the key terminologies that we’ll be using in the coming days. This will help us feel more prepared and confident as we move forward. Think of this as a light yet essential pit stop—one that’ll make your future learning smoother.

Let's explore some of the most important terms in machine learning. Even though today’s content isn’t technically part of our 100 days, it’s just as crucial to your overall understanding. Let's jump in!

Alright! Here’s a list of 50 machine learning terminologies, along with easy-to-understand explanations. These terms are commonly used in machine learning, and understanding them will significantly enhance your knowledge.

1. A/B Testing

A statistical method to compare two versions of a model or web page to see which performs better. It’s widely used for experimentation.

2. Activation Function

A mathematical function applied to the output of a neuron in a neural network, such as ReLU, Sigmoid, or Tanh. It decides if the neuron should be activated or not.
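
To make this concrete, here’s a minimal NumPy sketch of all three activations (the input values are made up for illustration):

```python
import numpy as np

def relu(x):
    # ReLU: passes positive values through, zeroes out negatives
    return np.maximum(0, x)

def sigmoid(x):
    # Sigmoid: squashes any real number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))     # [0. 0. 3.]
print(sigmoid(x))  # approximately [0.119 0.5 0.953]
print(np.tanh(x))  # Tanh: squashes into (-1, 1)
```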

3. Backpropagation

An algorithm used for training neural networks by adjusting the weights based on the error in the output. It works by propagating the error backward from the output to the input.

4. Batch Size

The number of training examples used in one iteration of the model training process. Larger batch sizes can make learning faster but require more memory.

5. Bayesian Inference

A method of statistical inference where Bayes' Theorem is used to update the probability of a hypothesis as more evidence becomes available.
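
A classic worked example is a medical test. The numbers below are made up purely for illustration, but the arithmetic is exactly Bayes' Theorem:

```python
prior = 0.01            # P(disease): assumed 1% base rate
sensitivity = 0.95      # P(positive | disease), assumed
false_positive = 0.05   # P(positive | no disease), assumed

# Bayes' Theorem: P(disease | positive)
#   = P(positive | disease) * P(disease) / P(positive)
evidence = sensitivity * prior + false_positive * (1 - prior)
posterior = sensitivity * prior / evidence
print(round(posterior, 3))  # ~0.161: the 1% prior, updated by the evidence
```

Notice how a positive result from a seemingly accurate test still leaves only a ~16% probability of disease, because the prior was so low. That "updating as evidence arrives" is the heart of Bayesian inference.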

6. Bias-Variance Tradeoff

The balance between two sources of error in a model. High bias can lead to underfitting, while high variance can lead to overfitting. A good model should have low bias and low variance.

7. Classification

A type of supervised learning where the goal is to categorize data into predefined labels, like identifying an email as spam or not spam.

8. Clustering

A form of unsupervised learning where data is grouped into clusters based on similarity. A common example is K-Means clustering.

9. Confusion Matrix

A table used to evaluate the performance of a classification algorithm. It shows the true positives, false positives, true negatives, and false negatives.
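
Here’s a quick sketch with scikit-learn on a handful of made-up binary labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels (toy data)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model's predictions

# For binary labels 0/1, rows are actual classes, columns are predicted:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))  # [[3 1], [1 3]] for this toy data
```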

10. Cost Function

A function that measures the error between the predicted outcome and the actual outcome. The goal of training a model is to minimize the cost function.

11. Cross-Validation

A technique used to assess how well a model generalizes to unseen data. The data is split into several parts (folds), and the model is trained and evaluated multiple times, each time holding out a different fold for testing.
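
Here’s what 5-fold cross-validation looks like as a minimal sketch, using scikit-learn’s built-in iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold CV: train on 4 folds, test on the held-out 5th, repeated 5 times
scores = cross_val_score(model, X, y, cv=5)
print(scores)         # one accuracy score per fold
print(scores.mean())  # the average is a more reliable estimate than one split
```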

12. Data Augmentation

The process of artificially increasing the size of a dataset by creating modified versions of the original data, such as flipping, rotating, or adding noise to images.

13. Decision Tree

A type of algorithm used for both classification and regression tasks. It splits data into branches based on decisions made at nodes, ultimately leading to a prediction.
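
A minimal sketch with scikit-learn’s built-in iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth limits how many levels of decision splits the tree can make
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X, y)
print(tree.predict(X[:5]))  # predicted classes for the first five flowers
```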

14. Deep Neural Networks (DNN)

A type of neural network with multiple layers between the input and output. Deep neural networks are the foundation of deep learning.

15. Dimensionality Reduction

A technique used to reduce the number of features in a dataset while retaining as much information as possible. Examples include Principal Component Analysis (PCA) and t-SNE.

16. Dropout

A regularization technique used in neural networks to prevent overfitting. During training, a random set of neurons is "dropped" or ignored to make the model more robust.
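
Here’s a minimal NumPy sketch of the common "inverted dropout" variant (the activations and drop rate below are made-up numbers):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = np.array([0.5, 1.2, 0.8, 0.3, 0.9])
drop_rate = 0.4  # fraction of neurons to ignore this step (assumed value)

# Randomly zero out ~40% of neurons; scale the survivors up
# so the expected total activation stays the same during training
mask = rng.random(activations.shape) > drop_rate
dropped = activations * mask / (1 - drop_rate)
print(dropped)
```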

17. Early Stopping

A technique used to prevent overfitting by stopping the training process when the model’s performance on a validation set stops improving.

18. Ensemble Learning

A technique where multiple models are trained and combined to solve a particular problem, typically leading to better performance than any individual model. Examples include Random Forests and Gradient Boosting.
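
A minimal Random Forest sketch with scikit-learn, again on the iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# 100 decision trees, each trained on a random sample of rows and features;
# their individual votes are combined into one final prediction
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X, y)
print(forest.predict(X[:3]))
```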

19. Epoch

One complete pass of the entire dataset through the training process. Usually, training involves multiple epochs to achieve the desired accuracy.
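
Epochs and batch size (term 4) fit together like this. Here’s a skeleton of a typical training loop; the data is random and the actual update step is left as a placeholder, just to show the mechanics:

```python
import numpy as np

X = np.random.rand(1000, 10)  # 1,000 toy training examples, 10 features
y = np.random.rand(1000)
batch_size = 32               # examples processed per iteration
epochs = 5                    # full passes over the whole dataset

for epoch in range(epochs):
    # Each epoch visits every example once, in chunks of batch_size
    for start in range(0, len(X), batch_size):
        X_batch = X[start:start + batch_size]
        y_batch = y[start:start + batch_size]
        # ... one training step on this batch would happen here ...
    print(f"epoch {epoch + 1} finished")
```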

20. Exploratory Data Analysis (EDA)

The process of analyzing data sets to summarize their main characteristics, often using visual methods like histograms, scatter plots, and box plots.

21. Feature Engineering

The process of selecting, modifying, or creating new input features from raw data that help improve the performance of a model.

22. Feature Importance

A technique used to identify which features contribute the most to making predictions in a model.

23. Gradient Descent

An optimization algorithm used to minimize the cost function by iteratively adjusting the parameters (weights and biases) in the direction of the steepest descent.
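
Here’s a tiny NumPy sketch that fits a single weight to toy data with gradient descent (the data and learning rate are arbitrary choices for illustration):

```python
import numpy as np

# Toy data generated from y = 2x, so the ideal weight is 2
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2 * x

w = 0.0              # initial guess for the weight
learning_rate = 0.05

for step in range(100):
    y_pred = w * x
    # Gradient of the MSE cost (1/n) * sum((w*x - y)^2) with respect to w
    grad = (2 / len(x)) * np.sum((y_pred - y) * x)
    w -= learning_rate * grad  # step in the direction of steepest descent

print(round(w, 4))  # converges toward 2.0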

24. Hyperparameter

Parameters of the learning algorithm itself (not learned from the data) that must be set before training, such as learning rate, batch size, or the number of layers in a neural network.

25. Label Encoding

A method of converting categorical data into numerical data so that machine learning algorithms can process it. For example, labels like "cat" and "dog" are transformed into 0 and 1.
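
With scikit-learn it looks like this (note that LabelEncoder assigns the integers in alphabetical order of the categories):

```python
from sklearn.preprocessing import LabelEncoder

labels = ["cat", "dog", "dog", "cat", "bird"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)
print(encoded)           # [1 2 2 1 0]
print(encoder.classes_)  # ['bird' 'cat' 'dog']
```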

26. Learning Rate

A hyperparameter that controls how much the model’s weights are adjusted with respect to the loss gradient during training. A learning rate that is too high may cause the model to converge too quickly to a suboptimal solution, while one that is too low makes training painfully slow.

27. Logistic Regression

A classification algorithm used to predict a binary outcome (0 or 1) based on one or more predictor variables.
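
A minimal sketch using scikit-learn’s built-in breast cancer dataset, which is a binary (0/1) classification task:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # labels are 0 or 1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))      # accuracy on unseen data
print(model.predict_proba(X_test[:1]))  # predicted probability of each class
```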

28. Loss Function

Similar to a cost function, it measures how well the model’s predictions match the actual outcomes. The goal is to minimize the loss function during training.

29. Mean Squared Error (MSE)

A common loss function used for regression tasks, which calculates the average of the squared differences between predicted and actual values.
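
In NumPy, the computation is essentially one line (the values below are made up):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # actual values
y_pred = np.array([2.5, 5.0, 4.0, 8.0])  # model predictions

# Average of the squared differences between predicted and actual values
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 0.875
```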

30. Neural Network

A model built from layers of interconnected "neurons" that learns underlying relationships in data, loosely inspired by the way the human brain operates. Neural networks are used extensively in deep learning.

31. Normal Distribution

Also known as the Gaussian distribution, it's a bell-shaped probability distribution that is symmetric around the mean, meaning values near the mean occur more frequently than values far from it.

32. Normalization

The process of scaling input features so they have a consistent range, often between 0 and 1 or -1 and 1, to improve the performance of machine learning algorithms.
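
A quick sketch with scikit-learn’s MinMaxScaler on made-up data where the two features have very different ranges:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 600.0]])

# Rescale each column independently to the range [0, 1]
scaler = MinMaxScaler()
print(scaler.fit_transform(X))
# [[0.   0.  ]
#  [0.5  0.25]
#  [1.   1.  ]]
```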

33. One-Hot Encoding

A method of converting categorical data into a binary format so that a machine learning algorithm can process it. For example, categories like "apple," "banana," and "orange" might be represented as [1,0,0], [0,1,0], and [0,0,1].
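
Using scikit-learn, the fruit example looks like this (note: the `sparse_output` argument assumes scikit-learn 1.2 or newer; older versions call it `sparse`):

```python
from sklearn.preprocessing import OneHotEncoder

fruits = [["apple"], ["banana"], ["orange"], ["banana"]]

# Each category becomes its own 0/1 column
encoder = OneHotEncoder(sparse_output=False)
print(encoder.fit_transform(fruits))
# apple -> [1,0,0], banana -> [0,1,0], orange -> [0,0,1]
```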

34. Optimization

The process of adjusting model parameters to minimize or maximize some objective function, such as the loss function in machine learning.

35. Overfitting

When a model learns not only the underlying patterns in the data but also the noise. It performs well on training data but poorly on unseen test data.

36. P-Value

A statistical measure that helps to determine the significance of results. In machine learning, it is often used in feature selection to identify which features are most relevant.

37. Parameter

A variable that the model learns from the data during training. In contrast, hyperparameters are set manually and do not change during training.

38. Perceptron

A simple type of artificial neuron that forms the basis of neural networks. It computes a weighted sum of its inputs and fires (outputs 1) only if that sum crosses a threshold.

39. Principal Component Analysis (PCA)

A technique used for dimensionality reduction that transforms the data into a set of orthogonal (uncorrelated) variables called principal components.
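
A minimal sketch that compresses the iris dataset’s 4 features down to 2 principal components:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)  # 4 features per flower

# Project the 4 features onto 2 uncorrelated principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                 # (150, 2)
print(pca.explained_variance_ratio_)   # share of variance each component keeps
```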

40. Precision

The ratio of correctly predicted positive observations to the total predicted positives. Precision helps answer the question: "Of all items labeled as positive, how many are actually positive?"

41. Recall

The ratio of correctly predicted positive observations to all actual positives. Recall answers the question: "Of all actual positive cases, how many were correctly identified?"
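
To see precision (term 40) and recall side by side, here’s a quick scikit-learn check on a tiny set of made-up labels:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Precision: of everything predicted positive, how much really was positive?
print(precision_score(y_true, y_pred))  # 3 TP / (3 TP + 1 FP) = 0.75
# Recall: of everything actually positive, how much did the model find?
print(recall_score(y_true, y_pred))     # 3 TP / (3 TP + 1 FN) = 0.75
```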

42. Regression

A type of supervised learning used to predict continuous outcomes, such as predicting house prices or temperatures.

43. Reinforcement Learning

A type of machine learning where an agent learns by interacting with an environment and receiving feedback in the form of rewards or penalties.

44. Regularization

A technique used to prevent overfitting by adding a penalty term to the loss function to discourage overly complex models.

45. Residual

The difference between the actual value and the predicted value in a regression model. It represents the error of the prediction.

46. Softmax

An activation function used in neural networks, particularly in classification problems, that converts raw predictions into probabilities.
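
A minimal NumPy implementation; subtracting the maximum before exponentiating is a standard trick to keep the computation numerically stable:

```python
import numpy as np

def softmax(logits):
    # Shift by the max so exp() never overflows; the result is unchanged
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw predictions ("logits")
print(softmax(scores))  # probabilities summing to 1, roughly [0.66 0.24 0.10]
```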

47. Stochastic Gradient Descent (SGD)

An optimization algorithm similar to gradient descent, but it updates the parameters using a single example (or a small batch) at a time rather than the entire dataset, which makes each training step much faster.

48. Support Vector Machine (SVM)

A classification algorithm that finds the hyperplane that best separates the different classes in the data, maximizing the margin between them.

49. Training Data

The dataset used to train a machine learning model. The model learns from this data and uses it to adjust its parameters.

50. Validation Set

A subset of the data used to evaluate the model’s performance during training, helping in the tuning of hyperparameters and preventing overfitting.


By familiarizing yourself with these terms, you’ll gain a stronger understanding of the core concepts in machine learning. These terminologies often pop up in research papers, discussions, and tutorials, so having a firm grasp on them will make your learning journey smoother!

Catch Up on Previous Days -

Useful Links - NLP Series

100 Days of Machine Learning

Data Science Resources

