
by The digiLab Team

Updated 6 February 2023

Understanding Uncertainty Quantification: Propagation of Uncertainty

Gaining insight into the uncertainty associated with ML models

Are you looking to get the most accurate information from your models, and deploy them with confidence? Uncertainty quantification (UQ) is a set of essential computational tools that can help make sure your models are both accurate and reliable. And while UQ can be a complex subject, it doesn't have to be overwhelming! In this blog post, we will break down how Uncertainty Quantification works by discussing the concept of propagation of uncertainty — as well as looking at some practical applications "in the wild!" using two simple uncertainty propagation methods: Monte Carlo Simulation and Polynomial Chaos.

What is Uncertainty Quantification (UQ)?


Before we look at Uncertainty Propagation, let's briefly refresh our understanding of UQ. UQ is the process of quantifying and analyzing the uncertainty in mathematical models, simulations and data. The primary aim is to assess the reliability of predictions, account for the effects of variability, randomness and misspecification in models, and ultimately assist in decision-making.

In a previous article (Understanding Uncertainty Quantification: The Different Types) we looked at the different classifications of uncertainty, as well as their typical sources.

The key steps to uncertainty propagation

The principle of uncertainty propagation refers to the process of estimating and analyzing the effects of uncertainty in a system or model. The goal is to quantify how uncertainties in input variables (parameters) propagate through a system or model and affect the distribution of the final output.

The principle of uncertainty propagation is based on the following key steps:

  1. Identify sources of uncertainty: Identify all sources of uncertainty, such as measurement errors, model limitations, data scarcity, and parameter variability.

  2. Model the uncertainties: This involves representing the uncertainties using probability distributions or other mathematical models.

  3. Propagate the uncertainties: Use mathematical techniques, such as Monte Carlo simulation, Taylor series approximation, or polynomial chaos, to propagate the uncertainties through the model and estimate their impact on the output.

  4. Quantify the uncertainties: Use statistical measures, such as variance, confidence intervals, or sensitivity indices, to quantify the magnitude and impact of the uncertainties on the output.

  5. Evaluate the results: Interpret the results of the uncertainty propagation analysis and use them to make informed decisions or make improvements to the model.

Uncertainty propagation is an important step in the overall process of uncertainty quantification, as it helps to understand the sources and effects of uncertainty in a system and to make informed decisions based on the results.
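
To make step 3 concrete, here is a minimal sketch of the Taylor series approach mentioned above, using a first-order approximation. It assumes the inputs are independent and that the model is close to linear near the input means. The function `model`, the helper `taylor_propagate`, and the input values are hypothetical, chosen purely for illustration.

```python
def model(x, y):
    # Hypothetical model, used only for illustration
    return x * y + 2.0 * x


def taylor_propagate(f, means, stds, h=1e-6):
    """First-order Taylor propagation, assuming independent inputs."""
    gradients = []
    for i in range(len(means)):
        up = list(means)
        down = list(means)
        up[i] += h
        down[i] -= h
        # Central finite difference for the partial derivative df/dx_i
        gradients.append((f(*up) - f(*down)) / (2.0 * h))
    mean_out = f(*means)  # first-order estimate of the output mean
    var_out = sum((g * s) ** 2 for g, s in zip(gradients, stds))
    return mean_out, var_out


mean_out, var_out = taylor_propagate(model, means=[1.0, 3.0], stds=[0.1, 0.2])
print(mean_out, var_out)
```

This needs only a handful of model evaluations, but the estimate degrades as the model becomes more nonlinear or the input uncertainty grows, which is one reason to reach for the sampling and expansion methods discussed next.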

Two approaches to Uncertainty Propagation


Let's look at two widely used methods for uncertainty propagation: Monte Carlo Simulation and Polynomial Chaos. We will then compare the two approaches, since the strengths of one tend to be the weaknesses of the other.

Approach 1: Monte Carlo Simulation

Monte Carlo simulation is a statistical method for propagating uncertainty in a model or system by generating random samples of input variables and simulating the behaviour of the system many times. The results of the simulations are then analyzed to estimate the distribution of the output and quantify the uncertainty.

The Monte Carlo simulation recipe is as follows:

  1. Model definition: Define the mathematical model or system that is to be simulated, including the inputs, parameters, and output variables.

  2. Sampling input variables: Generate random samples of the input variables based on their probability distributions or other models of uncertainty.

  3. Running simulations: For each set of random inputs, run a simulation of the model to obtain an output value.

  4. Collecting outputs: Collect the outputs from the simulations and store them in a data set.

  5. Analyzing results: Analyze the data set of outputs to estimate the distribution of the output and quantify the uncertainty. This can be done using statistical measures, such as mean, variance, confidence intervals, or histograms.

Monte Carlo simulation is a powerful tool for propagating uncertainty and for solving problems that cannot be solved analytically. It is widely used in engineering, finance, economics, and many other fields, and has applications in risk assessment, decision-making, and optimization.

Here we provide a brief example of how you could implement it for a model with two inputs, which we assume are independently normally distributed.

import numpy as np

# my_model is assumed to be your own function of two inputs, defined elsewhere

# Define the input variables and their distribution
input_mean = [0., 0.]
input_cov = [[1., 0.], [0., 1.]]  # zero off-diagonal terms: independent inputs
nSamples = 1000

# Generate random samples of the input variables, shape (nSamples, 2)
theta = np.random.multivariate_normal(input_mean, input_cov, nSamples)

# Run the Monte Carlo Simulation: one model evaluation per sample
f = [my_model(th[0], th[1]) for th in theta]

# Compute the statistics of the output distribution
mean = np.mean(f)
variance = np.var(f)
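
The final step of the recipe also mentions confidence intervals. The sketch below runs the whole recipe end to end and adds a 95% percentile interval plus the standard error of the mean. It uses only the Python standard library so that it runs without extra dependencies, and `my_model` here is a hypothetical stand-in for a real model.

```python
import random
import statistics


def my_model(theta_1, theta_2):
    # Hypothetical stand-in for the real model
    return theta_1 + 2.0 * theta_2


rng = random.Random(42)  # fixed seed so the run is repeatable
n_samples = 10_000

# Steps 2-4: sample two independent standard normal inputs, run the model, collect outputs
outputs = [
    my_model(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    for _ in range(n_samples)
]

# Step 5: summarise the output distribution
mean = statistics.fmean(outputs)
variance = statistics.variance(outputs)
std_error = (variance / n_samples) ** 0.5  # sampling error of the mean estimate

# Empirical 95% interval taken from the sorted outputs
ordered = sorted(outputs)
lower = ordered[int(0.025 * n_samples)]
upper = ordered[int(0.975 * n_samples) - 1]

print(mean, variance, std_error, (lower, upper))
```

For this particular stand-in model the exact answers are known (mean 0, variance 5), which makes it a handy sanity check before swapping in a real model.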

There are three key advantages of the Monte Carlo method.

  1. Flexibility: Monte Carlo simulation can be applied to a wide range of models and problems, regardless of the underlying distribution of the input variables or the complexity of the model.

  2. Ease of Use: Monte Carlo simulation is relatively simple to implement.

  3. Robustness: Monte Carlo simulation remains applicable when the model is nonlinear or highly complex, and it does not require the input distributions to take any particular form. Its convergence rate does not depend on the number of uncertain inputs, so it copes well with very high-dimensional problems.

Having said this, whilst Monte Carlo methods are widely used and feed into other important computational algorithms, there are a number of key disadvantages.

  1. Computational Intensity: Monte Carlo simulation can be computationally intensive, especially for large numbers of samples or for complex models. This can make it impractical for real-time applications or for large-scale problems.

  2. Slow Convergence: Monte Carlo simulation is based on random sampling, which introduces sampling error into the results. This error can be reduced by increasing the number of samples, but it only shrinks in proportion to 1/sqrt(N), where N is the number of samples, so halving the error requires four times as many model runs. Good estimates of uncertainty therefore require many samples.

  3. Model Validation: Monte Carlo simulation requires a well-defined model, and any inaccuracies or errors in the model will be propagated through the simulation and impact the results. This makes it important to validate the model before using it in a Monte Carlo simulation, and to carefully consider the sources of uncertainty and how they should be incorporated into the simulation. Failing to do so can result in misleading or inaccurate results.
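
The slow convergence point can be checked numerically. The sketch below is an illustration under simple assumptions: a hypothetical one-input model whose exact mean is known, and a repeated Monte Carlo estimate at two sample sizes. If the error behaves like 1/sqrt(N), using 16 times more samples should cut the error by roughly a factor of 4.

```python
import random
import statistics


def my_model(theta):
    # Hypothetical model: for a standard normal input its exact mean is 1
    return theta ** 2


def mc_mean(n_samples, rng):
    # One Monte Carlo estimate of the output mean
    return statistics.fmean([my_model(rng.gauss(0.0, 1.0)) for _ in range(n_samples)])


def rmse_of_estimate(n_samples, n_repeats, rng, exact=1.0):
    # Root-mean-square error of the estimate over many independent repeats
    squared_errors = [(mc_mean(n_samples, rng) - exact) ** 2 for _ in range(n_repeats)]
    return statistics.fmean(squared_errors) ** 0.5


rng = random.Random(0)
rmse_small = rmse_of_estimate(100, 200, rng)
rmse_large = rmse_of_estimate(1600, 200, rng)

print(rmse_small, rmse_large, rmse_small / rmse_large)
```

The printed ratio should land near 4 rather than 16, which is the practical cost of the method: each extra digit of accuracy needs about 100 times more model evaluations.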

Approach 2: Polynomial Chaos

Polynomial chaos or Polynomial Chaos Expansion is a deterministic method that uses polynomials to represent the uncertain input variables, and then solves the model or system using these polynomials as inputs.

The steps involved in polynomial chaos are:

  1. Model definition: Define the mathematical model or system that is to be analyzed, including the inputs, parameters, and output variables.

  2. Input representation: Represent the uncertain input variables as a polynomial expansion. The polynomials are chosen based on the distribution of the inputs, such as Gaussian, uniform, or beta.

  3. Galerkin projection: Project the model or system onto the polynomial expansion, resulting in a set of deterministic equations that can be solved using standard numerical methods.

  4. Solving the system: Solve the deterministic system of equations to obtain the solution.

  5. Quantifying uncertainty: Use the coefficients of the polynomial expansion to quantify the uncertainty in the output. This can be done using statistical measures, such as mean, variance, confidence intervals, or sensitivity indices.

Polynomial chaos is a computationally efficient method for quantifying uncertainty and can provide fast and accurate results. It has applications in engineering, finance, economics, and many other fields, and is particularly useful for problems where the uncertainty is represented by a known distribution and where the model or system is expensive to solve.

Here is an example of how polynomial chaos can be implemented in Python using the chaospy library:

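import chaospy as cp

# my_model is assumed to be your own function of two inputs, defined elsewhere

# Define the input distributions and join them (independent inputs)
theta_1 = cp.Uniform(-1., 1.)
theta_2 = cp.Normal(0., 1.)
inputs = cp.J(theta_1, theta_2)

# Define the Polynomial Expansion (order 2); older chaospy releases call this orth_ttr
pe = cp.generate_expansion(2, inputs)

# Evaluate the model at Gaussian quadrature nodes (non-intrusive projection)
nodes, weights = cp.generate_quadrature(3, inputs, rule="gaussian")
evaluations = [my_model(node[0], node[1]) for node in nodes.T]

# Propagate the uncertainty using polynomial chaos
surrogate = cp.fit_quadrature(pe, nodes, weights, evaluations)
mean = cp.E(surrogate, inputs)
variance = cp.Var(surrogate, inputs)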
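
To show what happens under the hood, here is a from-scratch sketch for a single standard normal input, written with the standard library only. It is a simplified, non-intrusive variant: the coefficients are found by projecting the model onto Hermite polynomials with a 3-point Gauss-Hermite quadrature rule, rather than by the Galerkin projection described in the steps above. The model `my_model` is hypothetical and deliberately quadratic, so that a degree-2 expansion represents it exactly.

```python
import math


def my_model(theta):
    # Hypothetical model; quadratic, so a degree-2 expansion is exact
    return 1.0 + 2.0 * theta + 0.5 * theta ** 2


# Probabilists' Hermite polynomials He_0, He_1, He_2 (orthogonal for a standard normal input)
basis = [
    lambda t: 1.0,
    lambda t: t,
    lambda t: t * t - 1.0,
]
norms = [1.0, 1.0, 2.0]  # E[He_k^2] = k!

# 3-point Gauss-Hermite rule for the standard normal weight (exact up to degree 5)
nodes = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
weights = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]

# Project the model onto each basis polynomial: c_k = E[f * He_k] / E[He_k^2]
coeffs = []
for he_k, norm in zip(basis, norms):
    projection = sum(w * my_model(x) * he_k(x) for x, w in zip(nodes, weights))
    coeffs.append(projection / norm)

# The statistics follow directly from the coefficients
mean = coeffs[0]
variance = sum(c * c * n for c, n in zip(coeffs[1:], norms[1:]))

print(coeffs, mean, variance)
```

Only three model evaluations were needed here, compared with the thousands of runs a Monte Carlo estimate of the same variance would use. That saving relies on the model being smooth and well approximated by a low-order polynomial.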

So as with all methods, what are the disadvantages?

  1. Model Assumptions: Polynomial chaos assumes that the model can be approximated by a polynomial expansion, which may not be accurate or appropriate for all models. This can result in over- or under-estimating the uncertainty in the model and in poor predictions.

  2. Computational Cost: Polynomial chaos can be computationally intensive, especially for high-dimensional models or for models with large numbers of uncertain inputs. This can make it impractical for large-scale or real-time applications.

  3. Model Complexity: Polynomial chaos can struggle to accurately capture the complexity and nonlinear behaviour of some models, which can lead to errors in the predictions and poor uncertainty quantification.
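
The computational cost point can be made concrete by counting the terms in the expansion. Assuming a total-order truncation (all polynomial terms up to a given total degree are kept), the number of coefficients to compute is the binomial coefficient C(d + p, p) for d inputs and order p. The helper name `n_pce_terms` is ours, for illustration only.

```python
import math


def n_pce_terms(n_inputs, order):
    # Number of basis polynomials in a total-order expansion
    return math.comb(n_inputs + order, order)


# Term counts for an order-3 expansion as the number of uncertain inputs grows
counts = {d: n_pce_terms(d, 3) for d in (2, 5, 10, 20)}
print(counts)  # {2: 10, 5: 56, 10: 286, 20: 1771}
```

Each coefficient needs model evaluations to estimate, so the cost climbs quickly with the number of inputs. This is the regime where Monte Carlo, whose convergence rate does not depend on dimension, becomes the more attractive choice.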

Conclusion


Uncertainty Quantification is a vital area as we seek to better understand the variables and uncertainties when using and deploying models. An important part of this is the concept of uncertainty propagation. This is a set of methods which allow us to feed input uncertainty into the model and observe the distribution of possible outputs. Here we introduce two widely used methods, called Monte Carlo Simulation and Polynomial Chaos. These are both powerful tools that can help you get a handle on propagating uncertainty through your models. Why not try them out on a simple example today?

digiLab is developing twinLab, a machine learning platform that provides easy access to state-of-the-art uncertainty quantification methods.

The digiLab Team
We are a 30+ strong team of ML Experts, Software Developers, Solution Engineers, and Product Experts. As a spinout from the University of Exeter, we build on years of cutting-edge academic research. At our core is a commitment to helping engineering and infrastructure companies become data-driven.
