"All models are wrong, but some are useful" is a famous quote attributed to the British Statistician George Box. In this series, we explore the field of Uncertainty Quantification (aka UQ), a field which seeks to quantify how wrong a model might be. Uncertainty Quantification is an increasingly important field as engineering and science seek to rely more heavily on simulations and machine learning algorithms to make critical decisions. In this series, we introduce the key concepts, give examples of the principle algorithms and describe the academic challenges at the frontiers of Uncertainty Quantification.
In this blog, we explain what Uncertainty Quantification is, and differentiate the principal classes of uncertainty.
What is Uncertainty Quantification?
Uncertainty Quantification is the process of quantifying and analyzing the uncertainty in mathematical models, simulations and data. The primary aim is to assess the reliability of predictions, account for the effects of variability, randomness and misspecification in models, and ultimately assist in decision-making. Uncertainty Quantification is an increasingly important interdisciplinary field which combines statistical, computational and mathematical methods to estimate, propagate and bound the uncertainty in models.
Why Is Uncertainty Quantification Important?
UQ helps us understand how our models and systems will behave in different situations by giving us insight into the sources of uncertainty present in our data and models. This allows us to make better decisions when designing new products or processes because we have a better understanding of how they will react to changes in their environment or input parameters. Additionally, UQ also helps us make predictions about future events based on past data by taking into account current sources of uncertainty so that we can anticipate potential outcomes more accurately.
Two Classifications of Uncertainty
There are two primary classifications of uncertainty: aleatoric and epistemic. Let's consider each in turn.
- Aleatoric uncertainty is a type of uncertainty that originates from the inherent randomness in the system being modelled or measured. It is also referred to as stochastic uncertainty or random uncertainty. Aleatoric uncertainty arises from random variations or events that cannot be predicted with certainty, for example, measurement errors or natural fluctuations in the underlying process.
- Epistemic uncertainty is a type of uncertainty that arises from the limited knowledge, data, or information available about the system being modelled or measured. Unlike aleatoric uncertainty, epistemic uncertainty can be reduced through additional data collection, modelling or analysis. In other words, epistemic uncertainty is a measure of how well we understand a system, rather than a measure of the system's inherent variability. Epistemic uncertainty can be represented as a lack of information about model parameters, missing data, or limited knowledge of the underlying physical processes. It is usually quantified using techniques such as sensitivity analysis, model averaging, or Bayesian inference, and can be reduced by improving the data, models or understanding of the system.
Whilst aleatoric and epistemic uncertainty are referred to as two distinct classes, in practice they often come together. In such cases, we refer to this combined uncertainty as hybrid uncertainty.
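To make the distinction concrete, here is a minimal sketch (the toy problem, the bootstrap ensemble and all parameter values are illustrative assumptions, not a prescribed method). We fit an ensemble of polynomial models to noisy data: the spread between ensemble members reflects epistemic uncertainty, which shrinks as more data is collected, while the known measurement noise is the irreducible aleatoric component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy system: y = sin(x) plus irreducible measurement noise (aleatoric).
noise_std = 0.1
x_train = rng.uniform(0, 3, size=20)
y_train = np.sin(x_train) + rng.normal(0, noise_std, size=x_train.shape)

# Epistemic uncertainty sketch: fit an ensemble of cubic polynomials on
# bootstrap resamples of the data; their disagreement reflects our
# limited data and knowledge of the true function.
x_test = np.linspace(0, 3, 7)
preds = []
for _ in range(200):
    idx = rng.integers(0, len(x_train), size=len(x_train))
    coeffs = np.polyfit(x_train[idx], y_train[idx], deg=3)
    preds.append(np.polyval(coeffs, x_test))
preds = np.array(preds)

epistemic_std = preds.std(axis=0)                 # reducible: shrinks with more data
aleatoric_std = np.full_like(x_test, noise_std)   # irreducible noise level

# Assuming the two sources are independent, they combine in quadrature.
total_std = np.sqrt(epistemic_std**2 + aleatoric_std**2)
print(total_std.round(3))
```

Note that collecting more training points would drive `epistemic_std` towards zero, but `total_std` can never fall below the aleatoric floor of 0.1.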
What are the primary sources of Uncertainty?
The primary sources of uncertainty depend on the application, and they often arise in combination rather than alone. Here are six primary sources of uncertainty which have been widely studied.
- Measurement errors: Random or systematic errors in measuring data.
- Modelling limitations / Model Misspecification: Inaccuracies or simplifications in mathematical models or simulations.
- Data scarcity or incompleteness: Limited or missing data can affect the accuracy of a model; the outputs of the system have not been fully observed, so the model must interpolate or extrapolate away from the data.
- Parameter Uncertainty: Even for a given model (which is wrong), the best parameterisation of that model is uncertain.
- Natural variability: Random or unpredictable variability in the underlying physical processes
- Computational errors: Numerical errors or inaccuracies in the computation of models or simulations; these might arise from finite precision or from discretisation error (i.e. element or time-step size).
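Several of these sources can be propagated through a model with simple Monte Carlo sampling. The sketch below is purely illustrative (the cantilever-beam formula, the input distributions and all numbers are assumptions chosen for the example): we draw samples for an imperfectly measured load and an imperfectly known material stiffness, push them through the model, and read off the resulting spread in the output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative model: tip deflection of a cantilever beam,
# delta = F * L^3 / (3 * E * I), with fixed geometry.
def deflection(F, E):
    L = 2.0    # beam length (m)
    I = 8e-6   # second moment of area (m^4)
    return F * L**3 / (3 * E * I)

n = 100_000
# Measurement error: the applied load F is only known to within ~5%.
F_samples = rng.normal(1000.0, 50.0, size=n)     # N
# Parameter uncertainty: Young's modulus E is only roughly known.
E_samples = rng.normal(200e9, 10e9, size=n)      # Pa

# Propagate both input uncertainties through the model at once.
delta = deflection(F_samples, E_samples)

print(f"mean deflection: {delta.mean() * 1e3:.2f} mm")
print(f"std deviation:   {delta.std() * 1e3:.2f} mm")
```

The output standard deviation summarises how input uncertainty translates into output uncertainty; more sophisticated UQ methods refine this basic propagation idea.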
Uncertainty quantification (UQ) is an important tool for engineers, scientists, and data professionals when it comes to understanding how their models and systems will behave in different situations. By identifying and quantifying sources of uncertainty such as aleatoric, epistemic and hybrid uncertainties, UQ helps us make better decisions based on our data insights while also allowing for more accurate predictions about future events based on past data points. Investing time into understanding this process can help you get the most out of your research projects!
Want to master these Machine Learning tools (and more) alongside the digiLab team?
Want to get an edge over other job applicants? Check out our online machine learning course "AI in the Wild" run by digiLab Academy and learn the fundamentals of Uncertainty Quantification.
digiLab is developing twinLab, a machine learning platform which provides easy access to state-of-the-art uncertainty quantification methods.