What Uncertainties Do We Need to Capture in Deep Learning? [with code]
Figure: Illustration of epistemic and aleatoric uncertainty (https://www.researchgate.net/figure/Illustration-of-epistemic-and-aleatoric-uncertainty_fig3_358723173)


There are two major types of uncertainty one can model:

  • Aleatoric uncertainty captures noise inherent in the observations, for example sensor noise or motion noise, resulting in uncertainty that cannot be reduced even if more data were collected. Aleatoric uncertainty draws its name from the Latin root aleatorius, referring to games of chance: it describes randomness arising from the data generating process itself, noise that cannot be eliminated simply by collecting more data.
  • Epistemic uncertainty accounts for uncertainty in the model parameters: it captures our ignorance about which model generated the collected data, and it can be explained away given enough data. Epistemic uncertainty is derived from the Greek root epistēmē, meaning knowledge. It measures our ignorance of the correct prediction arising from our ignorance of the correct model parameters.

Understanding what a model does not know is a critical part of many machine learning systems.

Bayesian deep learning can capture:

  • Epistemic uncertainty: formalized as probability distributions over model parameters
  • Aleatoric uncertainty: formalized as probability distributions over model outputs

Aleatoric and epistemic uncertainty models are not mutually exclusive: combining the two is what allowed Kendall and Gal (2017) to achieve new state-of-the-art results.

Aleatoric uncertainty can further be categorized into:

  • homoscedastic uncertainty, which stays constant for different inputs.
  • heteroscedastic uncertainty, which depends on the inputs to the model, with some inputs potentially having noisier outputs than others.

For example, homoscedastic regression assumes a constant observation noise σ for every input point x, whereas heteroscedastic regression assumes that the observation noise σ(x) can vary with the input x.

Approaches:

To capture epistemic uncertainty in a neural network (NN), we put a prior distribution over its weights, for example a Gaussian prior W ~ N(0, I). Bayesian neural networks replace the deterministic network's weight parameters with distributions over these parameters, and instead of optimizing the network weights directly we average over all possible weights (marginalization). Bayesian inference is used to compute the posterior over the weights, p(W | X, Y). This posterior captures the set of plausible model parameters given the data. We can capture model uncertainty by approximating p(W | X, Y) with dropout variational inference, a practical approach for approximate inference in large and complex models. The approximation is obtained by training a model with dropout before every weight layer and also performing dropout at test time to sample from the approximate posterior (Monte Carlo dropout).
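As a concrete illustration, here is a minimal Monte Carlo dropout sketch in Keras. The toy architecture, dropout rate, and number of samples T are illustrative assumptions, not values from the paper.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def build_mc_dropout_model(input_dim, dropout_rate=0.1):
    # Dropout is placed before each weight layer and will also be kept active at test time.
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dropout(dropout_rate)(inputs)
    x = keras.layers.Dense(64, activation="relu")(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    outputs = keras.layers.Dense(1)(x)
    return keras.Model(inputs, outputs)

model = build_mc_dropout_model(input_dim=8)
model.compile(optimizer="adam", loss="mse")
# model.fit(X_train, y_train, epochs=...)  # train as usual, with dropout active

def mc_dropout_predict(model, x, T=100):
    # Keep dropout active at inference (training=True) to sample from the
    # approximate posterior; the spread across the T stochastic forward
    # passes reflects epistemic uncertainty.
    preds = np.stack([model(x, training=True).numpy() for _ in range(T)], axis=0)
    return preds.mean(axis=0), preds.std(axis=0)
```

The predictive mean and standard deviation returned by mc_dropout_predict summarize the model's belief and its uncertainty for a given input.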

We can model heteroscedastic aleatoric uncertainty simply by changing our loss function. Because this uncertainty is a function of the input data, we can learn to predict it using a deterministic mapping from inputs to model outputs. For example, in regression the model predicts not only a mean value ŷ but also a variance σ². Homoscedastic aleatoric uncertainty can be modeled in a similar way, except that the uncertainty parameter is no longer a model output but a free parameter we optimize.
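Below is a minimal sketch of such a loss in the spirit of Kendall and Gal (2017): the network predicts the log variance rather than the variance itself, which keeps training numerically stable and lets the model attenuate the effect of noisy training points. The two-output head and the function names are illustrative assumptions.

```python
import tensorflow as tf

def heteroscedastic_nll(y_true, y_pred):
    # y_pred packs [mean, log_variance]; predicting log sigma^2 avoids a
    # positivity constraint on the variance.
    mean, log_var = tf.split(y_pred, num_or_size_splits=2, axis=-1)
    return tf.reduce_mean(
        0.5 * tf.exp(-log_var) * tf.square(y_true - mean) + 0.5 * log_var
    )

# Homoscedastic variant: the log-variance is a single trainable scalar rather
# than a model output. (When training with model.fit, this variable must be
# attached to the model so the optimizer updates it.)
log_var = tf.Variable(0.0, trainable=True)

def homoscedastic_nll(y_true, mean_pred):
    return tf.reduce_mean(
        0.5 * tf.exp(-log_var) * tf.square(y_true - mean_pred) + 0.5 * log_var
    )
```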

Conclusion

  • Epistemic uncertainty is important for safety-critical applications, because it is required to understand examples that are different from the training data.
  • Aleatoric uncertainty is important in large-data regimes, where epistemic uncertainty is mostly explained away, and in real-time applications, where aleatoric models can be formed without expensive Monte Carlo sampling.


Summary

Epistemic uncertainty:

  • Uncertainty in the model parameters
  • Formalized as probability distributions over model parameters
  • Can be explained away given enough data
  • Instead of learning specific weight values, the Bayesian approach learns weight distributions, from which we sample to produce an output for a given input.
  • The model produces a different output each time we call it with the same input, since a new set of weights is sampled from the distributions each time to construct the network and produce an output.
  • The less certain the model weights are, the more variability (a wider range) we will see in the outputs for the same input.


Aleatoric uncertainty:

  • Noise inherent in the observations
  • Formalized as probability distributions over model outputs
  • Cannot be reduced even if more data were to be collected
  • Create a probabilistic NN by letting the model output a distribution.
  • For example, model the output as an IndependentNormal distribution with learnable mean and variance parameters.
  • The output is a distribution, and we can use its mean and variance to compute confidence intervals (CI) for the prediction (see the sketch below).
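Here is a minimal sketch of such a probabilistic NN using TensorFlow Probability's IndependentNormal layer, in the spirit of the Keras example in reference 3; the hidden-layer size, the 95% interval, and the helper names are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras
import tensorflow_probability as tfp

def build_probabilistic_model(input_dim):
    inputs = keras.Input(shape=(input_dim,))
    x = keras.layers.Dense(16, activation="relu")(inputs)
    # The Dense layer produces the parameters (mean and scale) that the
    # IndependentNormal layer turns into a distribution over the output.
    params = keras.layers.Dense(tfp.layers.IndependentNormal.params_size(1))(x)
    outputs = tfp.layers.IndependentNormal(1)(params)
    return keras.Model(inputs, outputs)

model = build_probabilistic_model(input_dim=8)
# Train against the negative log-likelihood of the predicted distribution.
model.compile(optimizer="adam", loss=lambda y, dist: -dist.log_prob(y))

# At prediction time the model returns a distribution; its mean and stddev
# give an approximate 95% confidence interval, e.g.:
# dist = model(x_test)
# lower = dist.mean() - 1.96 * dist.stddev()
# upper = dist.mean() + 1.96 * dist.stddev()
```

Training against the negative log-likelihood is what makes the predicted variance meaningful: inputs with noisier targets are pushed toward wider output distributions.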


References:

  1. Kendall, A., & Gal, Y. (2017). What uncertainties do we need in bayesian deep learning for computer vision?. In?Advances in neural information processing systems?(pp. 5574-5584).
  2. Uncertainty: a Tutorial https://blog.evjang.com/2018/12/uncertainty.html
  3. Probabilistic Bayesian Neural Networks: Code https://keras.io/examples/keras_recipes/bayesian_neural_networks/


Thank you
