Discriminative vs Generative model
Ibrahim Sobh - PhD
?? Senior Expert of Artificial Intelligence, Valeo Group | LinkedIn Top Voice | Machine Learning | Deep Learning | Data Science | Computer Vision | NLP | Developer | Researcher | Lecturer
Broadly speaking, there are two main approaches to understand (model) the world and making decisions.
- Discriminative Models: Directly maps features to class labels
- Generative models: Models the distribution of features for each class.
Discriminative Models:
Consider a simple classification problem, where we need to distinguish between dogs and cats. Given the training data, we try to find the decision boundary that separates the two classes. At testing time, we check on which side the new input falls to decide whether it is a dog or cat.
The algorithm is trying to learn p(y|x) directly, to map from inputs to labels.
where y is the class label (dogs or cats) and x represents features.
Example: Logistic regression model
Generative Models:
Here we try to build a model of what dogs look like, and another model of what cats look like. At test time, we check whether the new sample looks more like cats or more like dogs.
The algorithm is trying to model p(x|y) and p(y) to understand the data
where p(x|y) models the distribution of features belonging of a certain class. And p(y) is called the class prior, the probability of the class.
Having both p(x|y) and p(y) modeled, Bayes Rule can be used to derive the posterior distribution
p(y|x) = p(x|y)p(y)/p(x)
To make predictions, we find the y that maximizes the term p(y|x), which class gives larger probability given the features, and in this case we can get rid of the denominator
argmax_y p(y|x) = argmax_y p(x|y) p(y)
Example: Na?ve Bayesian Model
Which one to use?
The answer is it depends! Here we discuss the difference between Logistic regression model and the Gaussian Discriminant Analysis model (GDA). GDA is a generative learning algorithm that assumes p(x|y) is distributed according to multivariate normal distribution. This assumption is valid in many cases. In this way, GDA makes strong modeling assumptions and usually works very well under theses assumptions. On the other hand, logistic regression is a discriminative algorithm used to model p(y|x) directly. Logistic regression makes weaker assumptions and hence more robust when data is not Gaussian.
Regards.