Gaussian Error Models#

Standard Gaussian Error Models with Binary Design#

This module implements the most basic form of a Gaussian error model with support for binary design. The role of a binary design is to activate/deactivate entries (dimensions) of the random variable (probability space) modeled by the Gaussian error model implemented here.

The Gaussian error model here is defined as follows:

\[x \sim \mathcal{N}(\mu, \mathbf{D} \mathbf{\Sigma} \mathbf{D} ),\]

where \(x\) is a normally distributed random vector with mean \(\mu\) and covariance matrix \(\mathbf{\Sigma}\). The matrix \(\mathbf{D}\) is a diagonal (square) matrix with a binary design vector \(\mathbf{\zeta}\) on its diagonal. An entry in \(\mathbf{\zeta}\) set to 1 corresponds to an active entry, and an entry set to 0 corresponds to an inactive (degenerate) entry that is eliminated from calculations.

class GaussianErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None)[source]#

Bases: ErrorModelConfigs

Configuration settings for Gaussian error models.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionlity in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • size (int | None) – dimension of the error model space. Detected from mean if None.

  • mean (float | ndarray) – mean of the error model

  • variance (float | ndarray | spmatrix) – variance/covariance of the error model

  • design (None | bool | Sequence[bool] | ndarray) –

    an experimental design to define active/inactive entries of the random variable (mean, variance/covariance matrix).

    • If the design is None, it is set to all ones; that is everything is observed (default)

    • If the design is a binary vector ( or int dtype attributes with 0/1 entries) the mean, the covariance, and all random vectors are projected onto the space identified by the 1/True entries.

  • sparse (bool) – convert covariance to scipy.csc_matrix used if size > 1

  • random_seed (int | None) – random seed used when the Gaussian model is initiated. This is useful for reproductivity.

size: int | None#
mean: float | ndarray#
variance: float | ndarray | spmatrix#
design: None | bool | Sequence[bool] | ndarray#
sparse: bool#
random_seed: int | None#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None)#
class GaussianErrorModel(configs=None)[source]#

Bases: ErrorModel, RandomNumberGenerationMixin

A simple Numpy-based (or scipy sparse) Gaussian error model. In addition to standard functionality, here we provide a specific attribute ‘design’ which enables modifying the space of the error model by projection or reweighting.

Note

This version does not support non-binary designs. For non-bianry (relaxed) design, consider PointwiseWeightedGaussianErrorModel or PrePostWeightedGaussianErrorModel.

Note

If mean, or variance is scalar, the size is inferred from size or from the argument with larger size, as long as there is no contradiction, otherwise an AssertionError is raised.

Note

This model creates lazy objects under the hood. These are lazy in the sense that they are constructed/calculated only when they are needed, and are kept in memory unless the design changes. These objects are accessible through the following properties:

  1. stdev: this refers to a lazy evaluation of the lower Cholesky factor of the covariance matrix.

  2. stdev_inv: this refers to a lazy evaluation of the inverse of the lower Cholesky factor of the covariance matrix.

  3. stdev_logdet: this refers to a lazy evaluation of the logarithm of the the determinant (logdet) of the lower Cholesky factor of the covariance matrix (used for normalizing the PDF valued/gradient when/if needed).

Note

We use the fact that the logdet of the covariance matrix is obtained from the the Cholesky factorization as follows

\[logdet (C) = log (det(L L^T)) = log (det(L) det(L^T)) = logdet(L) + logdet (L^T) ,\]

Thus, \(0.5 logdet(C) = logdet(L)\), a fact used in calculating the (normalized) PDF, etc.

Parameters:

configs (GaussianErrorModelConfigs | dict | None) – an object containing configurations of the error model

Raises:
__init__(configs=None)[source]#

Initialize the random number generator

standardize_configurations(configs, mean_dtype=<class 'float'>, variance_dtype=<class 'float'>, design_dtype=<class 'bool'>)[source]#

Check the configurations of the error model (size, mean, design, and variance), and make sure they are consistent.

Note

Here are the rules:

  1. If the size is passed, the mean and the design must be one-dimensional arrays/vectors of this size (if either is scalar, they are casted to vectors), and the variance’s size/shape is asserted accordingly.

  2. If the size is not passed, it is inferred from mean and/or variance

Raises:

PyOEDConfigsValidationError – if the validation of the configurations fail

validate_configurations(configs, raise_for_invalid=False)[source]#

Validate the passed configurations for the Gaussian error model

Parameters:
  • configs (dict | GaussianErrorModelConfigs) – configurations to validate. If a configs dataclass is passed validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

  • raise_for_invalid (bool) – raise an exception if the configurations are invalid

Returns:

True if the configurations are valid, otherwise False

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as nessesary/possible/valid

Raises:

TypeError – if any of the passed keys in kwargs is invalid/unrecognized

generate_white_noise(ignore_design=False, truncate=False)[source]#
Generate a standard normal random vector of size size with values truncated

at -/+3 if truncate is set to True

Parameters:
  • ignore_design (bool) – the size of the white noise vector is in the full (ignore the design) if True, otherwise the size matches the dimension of the active subspace.

  • truncate (bool) – if True, truncate the samples at -/+3, that is any sample point/entry above 3 is set to 3, and any value below -3 is set to -3.

Returns:

numpy array (even if the dimension is 1) containing white noise

generate_noise()[source]#

Generate a random noise vector sampled from the underlying Gaussian distribution

sample()[source]#

Sample a random vector from the underlying Gaussian distribution

pdf(x, normalize=False, log=False)[source]#

Evaluate the value of the density function (normalized or upto a fixed scaling constant) at the passed state/vector.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

  • log (bool) – if True, evaluate the gradient of the logarithm of the PDF

Returns:

the value of the density function evaluated at the passed state/vector.

log_density(x, normalize=False)[source]#

Evaluate the logarithm of the density function at the passed state x.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

Returns:

the logarithm of the density function evaluated at the passed state.

pdf_gradient(x, normalize=False, log=False, in_place=False)[source]#

Evaluate the gradient of the density function at the passed state/vector.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

  • log (bool) – if True, evaluate the gradient of the logarithm of the PDF

Returns:

the value of the density function evaluated at the passed state/vector.

log_density_gradient(x, normalize=False, in_place=False)[source]#

Evaluate the gradient of the logarithm of the density function at the passed state x.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

Note

The derivative of the log of the normalization factor vanishes; thus, normalize takes no effect here. Just added for unification

Returns:

the gradient of the logarithm of the density function evaluated at the passed state.

add_noise(x, in_place=False)[source]#

Add random noise to the passed vector x

Parameters:
  • x – vector in a space of dimension equal to size of the underlying error model

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Raises:

TypeError – if x is of the wrong shape or type

covariance_matvec(x, in_place=False)[source]#

Evaluate and return the product of covariance matrix by a vector x. The product is determined by applying the cholesky decomposition of the covariance matrix twice (dot product) to the passed state x

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of the covariance matrix by x

unweighted_covariance_matrix()[source]#

Construct a numpy array representation of the covariance matrix discarding any design effect

unweighted_precision_matrix()[source]#

Construct a numpy array representation of the precision matrix discarding any design effect

covariance_inv_matvec(x, in_place=False)[source]#

Evaluate and return the product of inverse of the error covariance matrix by a vector x. The inverse of the lower Cholesky factor is used

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of inverse of the error covariance matrix by x

covariance_sqrt_matvec(x, lower=True, in_place=False)[source]#
Evaluate and return the product of Square root (Lower Cholesky) covariance matrix

by a vector x. The product is determined by applying the cholesky decomposition of the covariance matrix once (dot product) to the passed state x This metho is tricky. See remarks below to understand how it works.

Note

This function multiples the Cholesky factor (lower or upper/lower-transpose) by a vector. The covariance matrix is weighted based on the underlying design. When the design has some zero entries (rows/columns of the covariance are eliminated), the dimension of the model reduces. Let us assume the full space has size \(n\) and the design is all ones, then the covariance matrix is \(C:=LL^T\) where \(L\) is the lower Cholesky factor of size \(n\times n\). Thus, applying the lower Cholesky factor (matvec-product) takes a vector in the full space (size \(n\)) and returns a vector of the same size. However, when the design has zeros the output of the lower Cholesky matvec product is a vector of size equal to the number of active entries. Similar analogy is followed for multiplying the Upper Cholesky factor \(L^T\). In all cases, when lower is True, the input must reside in the full observation space, otherwise, the Cholesky factor is no longer valid and must be recalculated in the reduced space for the projected covariance.

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x` could be scalar or number array, however, it must match the distribution size.

  • lower (bool) – use the lower Cholesky as the square root

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of the sqrt-covariance matrix by x:

  • when lower==True, the output is in the space of active sensors. The input can be either in the full space or the space of active sensors if in_place==False, however it must be in the reduced space if in_place==True otherwise a ValueError is raised.

  • when lower==False, the output is in the full space. The input must be in the space of active sensors and in_place must be set to False otherwise a ValueError is raised.

covariance_sqrt_inv_matvec(x, lower=True, in_place=False)[source]#

Evaluate and return the product of inverse of the square root of the error covariance matrix by a vector x. A linear system is solved using the cholesky decomposition of the covariance matrix

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • lower (bool) – use the lower Cholesky as the square root

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of inverse of the square root of the error covariance matrix by x

covariance_matrix()[source]#

Construct a numpy array representation (if the dimension is more than 1) of the underlying covariance matrix. (projected onto the observation space)

precision_matrix()[source]#

Construct a numpy array representation (if the dimension is more than 1) of the precision matrix (inverse of the underlying covariance matrix).

covariance_diagonal()[source]#

Return a copy the diagonal of the covariance matrix, i.e., the variances

Returns:

1D numpy array holding the diagonal of the underlying covariance matrix. If the size of the underlying distribution is 1, this returns a scalar (variance)

variance()[source]#

Same as covariance_diagonal()

covariance_trace(method='exact', sample_size=50, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01)[source]#

Evaluate/Approximate the trace of the covariance matrix

Note

This is a wrapper around the utility function pyoed.utility.math.matrix_trace that is called with the method covariance_matvec() as the operator to compute the trace of.

Parameters:
  • method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a Hutchinson style trace estimator, as described in pyoed.utility.trace_estimator(), is employed. All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed (None | int) – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the trace of the covariance matrix. of the underlying model.

Return type:

float

covariance_logdet(method='exact', sample_size=50, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01)[source]#

Evaluate/approximate the logarithm of the determinant of the covariance matrix of the passed error model

Note

This is a wrapper around the utility function pyoed.utility.math.matrix_logdet that is called with the method covariance_matvec() as the operator to compute the trace of.

Parameters:
  • method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a chebyshev-approximation type-method is employed from pyoed.utility.logdet_estimator(). All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed (None | int) – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the logdet of the covariance matrix of the underlying model.

Return type:

float

update_design(design)[source]#

If the design matches existing one, don’t do anything, otherwise, update it

Parameters:

design – the design to be used update model’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

update_mean(mean)[source]#

Update the mean of the error model (in the full space) regardless the design dimensionalty

Parameters:

mean – the mean to be used update model’s configurations

Raises:

PyOEDConfigsValidationError – if the passed mean is of invalid type/shape/size

update_covariance_matrix(variance)[source]#

Update the covariance matrix (and the square root) in the full space, regardless the design

Parameters:

variance – covariance matrix (or its square root) to be used in the error model

apply_weighting_matrix(variance, design=None)[source]#

Apply the weighting matrix (binary activation/deactivation) of the elements in variance. Is expected to be of dimension (design.size, design.size ). Rows/columns in variance corresponding to 0 in design are zero’d out. If variance has dimension equal to active entries only in design, nothing is done, and variance is returned as is.

Parameters:
  • variance – the variance covariance matrix

  • design – an experimental design used to calculate the weights. This is only added for flexibility. If None, the internal/current deesign is used.

Returns:

variance with inactive entries in design or registered design set to ‘0’

property sparse#

True if the covariance data structure is constructed using scipy.spmatrix)

property size#

Dimension of the underlying probability distribution

property design#

Get a copy of the design vector

property extended_design#

Return a vector indicating active/inactive entries (with proper replication for multiple prognostic variables) of the observation vector/range Right now, the model is agnostic to any prognostic (physics) variables and is left to other modules to handle such periodicity if exists. Thus, this function is identical to the design()

property active#

entries (indexes) corresponding to active/inactive dimensions

property active_size#

Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design)

property mean#

Get a copy of the distribution mean

property stdev#

Construct and return the lower Cholesky factor (if not available) in the projected space (after applying the design) of the covariance matrix

property stdev_inv#

Construct and return the inverse of the lower Cholesky factor (if not available) of the covariance matrix in the projected space (after applying the design)

property stdev_logdet#

Evaluate the log-det (logarithm of the determinant) of the lower Cholesky factor (if not available) in the projected space (after applying the design) of the covariance matrix

Note

We use the fact that the determinant is the product of the diagonal elements of the lower Cholesky factor. Thus, the log-det is the sum of log of diagonal elements.

Gaussian Error Models with Pre-Post-Weighted Design#

This module implements weighted versions of Gaussian error model with support for binary as well as relaxed (non-binary) design. The role of a design is to activate/deactivate entries (dimensions) if it is binary, however, if it is relaxed, the design applies some weights to the entries of the covariance (and the precision) matrix based on the formulation approach taken.

The approach taken here is pre-post multiplication. This means either the covariance matrix (or its inverse) is multiplied (pre -> from left, and post -> from right) by a diagonal matrix with the design vector (relaxed) on its main diagonal.

Note

This is reduces to the functionality provided by GaussianErrorModel only when the design is binary, and when the covariance is pre-post multiplied.

Warning

The functionality here is provided for testing widely used approaches in the literature of optimal experimental design. We have shown, however, that pre-post multiplication of either the covariance matrix or its inverse by relaxed design is generally wrong if the covariance matrix is non-diagonal, that is if there are correlations in the error model. To apply proper weighting for relaxed (non-binary) designs, use the functionality provided by PointwiseWeightedGaussianErrorModel instead of using PrePostWeightedGaussianErrorModel. For details, on why pre-post multiplication is incorrect, see:

  1. Ahmed Attia, and Emil Constantinescu. “Optimal experimental design for inverse problems in the presence of observation correlations.” SIAM Journal on Scientific Computing 44, no. 4 (2022): A2808-A2842.

class PrePostWeightedGaussianErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None, design_weighting_scheme='covariance')[source]#

Bases: GaussianErrorModelConfigs

Configuration settings for PrePostWeightedGaussianErrorModel. These are exactly the same as the configurations of the parent class

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionlity in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • size (int | None) – dimension of the error model space. Detected from mean if None.

  • mean (float | ndarray) – mean of the error model

  • variance (float | ndarray | spmatrix) – variance/covariance of the error model

  • design (None | bool | Sequence[bool] | ndarray) –

    an experimental design to define active/inactive entries of the random variable (mean, variance/covariance matrix).

    • If the design is None, it is set to all ones; that is everything is observed (default)

    • If the design is a binary vector ( or int dtype attributes with 0/1 entries) the mean, the covariance, and all random vectors are projected onto the space identified by the 1/True entries.

  • sparse (bool) – convert covariance to scipy.csc_matrix used if size > 1

  • random_seed (int | None) – random seed used when the Gaussian model is initiated. This is useful for reproductivity.

  • design_weighting_scheme (str) –

    a string describing the weighting scheme; you can choose from the following options:

    • ’covariance’: the covariance matrix is pre- and post-multiplied by the design matrix defined as a diagonal matrix with the design on its diagonal.

    • ’precision’: the precision (inverse of the covariance) matrix is pre- and post-multiplied by the design matrix defined as a diagonal matrix with the design on its diagonal.

design_weighting_scheme: str#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None, design_weighting_scheme='covariance')#
class PrePostWeightedGaussianErrorModel(configs=None)[source]#

Bases: GaussianErrorModel

A simple Numpy-based (or scipy sparse) Gaussian error model with pre- and post-multiplicative design weighting. Specifically, the design (binary or relaxed) takes place by pre- and post-multiplication of the covariance or the precision matrix (based on the design_weighting_scheme in the configurations) by a diagonal matrix with the design set to its diagonal (the design matrix here).

As a clarifying example, consider the A-optimal design for linear inverse problems. In this case, the posterior covariance \(\Gamma_{\rm post}\) takes on the following forms:

  1. design_weighting_scheme==’covariance’: pre-post multiplication of the covariance matrix). The covariance of the Gaussian posterior is:

    \[\Gamma_{\rm post} = \left( \mathbf{F}^{T} \left( diag(\mathbf{d}) \Gamma_{\rm noise} diag(\mathbf{d}) \right)^{-1} \mathbf{F} + \Gamma_{\rm prior}^{-1} \right)^{-1} \,,\]
  2. design_weighting_scheme==’precision’: pre-post imultiplication of the precision matrix). The covariance of the Gaussian posterior is:

    \[\Gamma_{\rm post} = \left( \mathbf{F}^{T} diag(\mathbf{d}) \Gamma_{\rm noise}^{-1} diag(\mathbf{d}) \mathbf{F} + \Gamma_{\rm prior}^{-1} \right)^{-1} \,,\]

where \(\mathbf{F}\) is the forward model (e.g., parameter-to-observable map), \(\mathbf(d)\) is the relaxed/binary design vector, \(diag(\mathbf{d})\) is a diagonal matrix with \(\mathbf{d}\) on its main diagonal, \(\Gamma_{\rm prior}\) is the covariance matrix of the Gaussian prior, and \(\Gamma_{\rm noise}\) is the covariance matrix of the Gaussian observation noise/error model.

Thus, in this formulation/implementation, the effect of the design takes place on the covariance/precision matrix.

Note

This implementation is not accurate for correlated covariances as it creates discontinuity at the corner (binary) points. Though, we add it for comparison as it has been utilized in the past in the OED literature, and the implementation is simpler and less computationally expensive.

Parameters:

configs (PrePostWeightedGaussianErrorModelConfigs | dict | None) – an object containing configurations of the error model

Raises:
__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=False)[source]#

Validate the passed configurations for the Gaussian error model

Parameters:
  • configs (dict | PrePostWeightedGaussianErrorModelConfigs) – configurations to validate. If a configs dataclass is passed validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

  • raise_for_invalid (bool) – raise an exception if the configurations are invalid

Returns:

True if the configurations are valid, otherwise False

Raises:
standardize_configurations(configs, mean_dtype=<class 'float'>, variance_dtype=<class 'float'>, design_dtype=<class 'float'>)[source]#

Check the configurations of the error model (size, mean, design, and variance), and make sure they are consistent.

Note

Here are the rules:

  1. If the size is passed, the mean and the design must be one-dimensional arrays/vectors of this size (if either is scalar, they are casted to vectors), and the variance’s size/shape is asserted accordingly.

  2. If the size is not passed, it is inferred from mean and/or variance

Raises:

PyOEDConfigsValidationError – if the validation of the configurations fail

update_design(design)[source]#

If the design matches existing one, don’t do anything, otherwise, update it

Parameters:

design – the design to be used update model’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as nessesary/possible/valid

Raises:

TypeError – if any of the passed keys in kwargs is invalid/unrecognized

update_design_weighting_scheme(design_weighting_scheme)[source]#

Update the design weighting scheme.

Parameters:

design_weighting_scheme (str) – name of the design weighting scheme; expected ‘covariance’ or ‘precision’

Raises:

PyOEDConfigsValidationError – if the scheme is invalid

apply_weighting_matrix(variance, design=None)[source]#

Apply the design weighting matrix by pre- and post-multiplying the passed variance matrix with the passed design or the registered one.

Parameters:
  • variance – the variance covariance matrix

  • design – an experimental design used to calculate the weights. This is only added for flexibility. If None, the internal/current deesign is used.

Returns:

variance pointwise multiplied by the underlying weighting matix

property design_weighting_scheme#

Name of the design weighting scheme

property stdev#

Construct and return the lower Cholesky factor (if not available) in the projected space (after applying the design) of the covariance matrix

Note

The update of the covariance Cholesky factor based on the weighting scheme is as follows:

  1. design_weighting_scheme is ‘covariance’:

    \[C \gets D C D^T; \qquad C \gets D (L L^T) D^T = (DL) (DL)^T \,.\]

    Thus, the covariance matrix is projected onto the active subspace, and the lower Cholesky factor is calculated there, and then weights are applied.

  2. design_weighting_scheme is ‘precision’:

    \[C^{-1} \gets D C^{-1} D; C \gets ( D (L L^T)^{-1} D )^{-1}; \qquad\]

    where the inverse is replaced with pseudo inverse when the design includes inactive entries. Here, the Lower Cholesky factor is calculated for the full covariance matrix (by ignoring the design and including inactive entries), and then the lower factor is projected onto the active subspace and is then weighted by the inverse of the active design entries (weights).

Gaussian Error Models with Pointwise-Weighted Design#

This module implements weighted versions of Gaussian error model with support for binary as well as relaxed (non-binary) design. The role of a design is to activate/deactivate entries (dimensions) if it is binary, however, if it is relaxed, the design applies some weights to the entries of the covariance (and the precision) matrix based on the formulation approach taken.

The approach taken here is pre-post multiplication. This means either the covariance matrix (or its inverse) is multiplied (pre -> from left, and post -> from right) by a diagonal matrix with the design vector (relaxed) on its main diagonal.

Note

This is the right approach to applying relaxed design for sensor placement in inverse problems. For details of the approach, see:

  1. Ahmed Attia, and Emil Constantinescu. “Optimal experimental design for inverse problems in the presence of observation correlations.” SIAM Journal on Scientific Computing 44, no. 4 (2022): A2808-A2842.

class PointwiseWeightedGaussianErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None, design_weighting_scheme='covariance', pointwise_weighting_kernel=<function PointwiseWeightedGaussianErrorModelConfigs.<lambda>>)[source]#

Bases: PrePostWeightedGaussianErrorModelConfigs

Configuration settings for PointwiseWeightedGaussianErrorModel. These are exactly the same as the configurations of the parent class

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionlity in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • size (int | None) – dimension of the error model space. Detected from mean if None.

  • mean (float | ndarray) – mean of the error model

  • variance (float | ndarray | spmatrix) – variance/covariance of the error model

  • design (None | bool | Sequence[bool] | ndarray) –

    an experimental design to define active/inactive entries of the random variable (mean, variance/covariance matrix).

    • If the design is None, it is set to all ones; that is everything is observed (default)

    • If the design is a binary vector ( or int dtype attributes with 0/1 entries) the mean, the covariance, and all random vectors are projected onto the space identified by the 1/True entries.

  • sparse (bool) – convert covariance to scipy.csc_matrix used if size > 1

  • random_seed (int | None) – random seed used when the Gaussian model is initiated. This is useful for reproductivity.

  • design_weighting_scheme (str) –

    a string describing the weighting scheme; you can choose from the following options:

    • ’covariance’: the covariance matrix is pointwise-multiplied by the weighting matrix constructed (elementwise) by useing the passed pointwise_weighting_kernel.

    • ’precision’: the precision (inverse of the covariance) matrix is

      pointwise-multiplied by the weighting matrix constructed (elementwise) by useing the passed pointwise_weighting_kernel.

  • pointwise_weighting_kernel (Callable[[ndarray, int, int], float]) – a callable (function) that accpts three arguments (design, i, j) and evaluates a weighting value; used only when design_weighting_scheme is ‘pointwise-covariance’ or ‘pointwise-precision’

pointwise_weighting_kernel: Callable[[ndarray, int, int], float]#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, mean=0.0, variance=1.0, design=True, sparse=True, random_seed=None, design_weighting_scheme='covariance', pointwise_weighting_kernel=<function PointwiseWeightedGaussianErrorModelConfigs.<lambda>>)#
class PointwiseWeightedGaussianErrorModel(configs=None)[source]#

Bases: PrePostWeightedGaussianErrorModel

A simple Numpy-based (or scipy sparse) Gaussian error model with pre- and post-multiplicative design weighting. Specifically, the design (binary or relaxed) takes place by pre- and post-multiplication of the covariance or the precision matrix (based on the design_weighting_scheme in the configurations) by a diagonal matrix with the design set to its diagonal (the design matrix here).

As a clarifying example, consider the A-optimal design for linear inverse problems. In this case, the posterior covariance \(\Gamma_{\rm post}\) takes on the following forms:

  1. design_weighting_scheme==’covariance’: pre-post multiplication of the covariance matrix). The covariance of the Gaussian posterior is:

    \[\Gamma_{\rm post} = \left( \mathbf{F}^{T} \left( \Gamma_{\rm noise} \cdot \mathbf{W} \right)^{\dager} \mathbf{F} + \Gamma_{\rm prior}^{-1} \right)^{-1} \,,\]
  2. design_weighting_scheme==’precision’: pre-post imultiplication of the precision matrix). The covariance of the Gaussian posterior is:

    \[\Gamma_{\rm post} = \left( \mathbf{F}^{T} \left( \mathbf{W} \cdot \Gamma_{\rm noise}^{-1} \right) \mathbf{F} + \Gamma_{\rm prior}^{-1} \right)^{-1} \,,\]

where \(\mathbf{F}\) is the forward model (e.g., parameter-to-observable map), \(\mathbf(W)\) is the weighting matrix constructed elementwise by using the pointwise_weighting_kernel in the configurations. \(\Gamma_{\rm prior}\) is the covariance matrix of the Gaussian prior, and \(\Gamma_{\rm noise}\) is the covariance matrix of the Gaussian observation noise/error model.

Thus, in this formulation/implementation, the effect of the design takes place on the covariance/precision matrix.

Warning

The value “precision” of the design weighting scheme here is not mathematically correct, and is only added for testing and validation.

Parameters:

configs (PointwiseWeightedGaussianErrorModelConfigs | dict | None) – an object containing configurations of the error model

Raises:
__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=False)[source]#

Validate the passed configurations for the Gaussian error model

Parameters:
  • configs (dict | PointwiseWeightedGaussianErrorModelConfigs) – configurations to validate. If a configs dataclass is passed validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

  • raise_for_invalid (bool) – raise an exception if the configurations are invalid

Returns:

True if the configurations are valid, otherwise False

Raises:
build_weighting_matrix(design=None, full_space=False)[source]#

Use the registered weighting kernel to build the weighting matrix. If no design is passed, the current design is used.

Parameters:
  • design – an experimental design used to calculate the weights. This is only added for flexibility. If None, the internal/current deesign is used.

  • full_space – if True, the weighting matrix will include weights corresponding to non-zero entries only. If False, the weights are calculated for active entries only.

apply_weighting_matrix(variance, design=None)[source]#

Apply (poinwise) the weighting matrix W created by calling build_weighting_matrix() to the passed variance/covariance matrix variance. The weighting is carried out in the full or projected (based on design/active) space based on the dimension of variance.

Parameters:
  • variance – the variance covariance matrix

  • design – an experimental design used to calculate the weights. This is only added for flexibility. If None, the internal/current deesign is used.

Returns:

variance pointwise multiplied by the underlying weighting matix

property pointwise_weighting_kernel#

Name of the design weighting scheme

property stdev#

Construct and return the lower Cholesky factor (if not available) in the projected space (after applying the design) of the covariance matrix

Note

The update of the covariance Cholesky factor based on the weighting scheme is as follows:

  1. design_weighting_scheme is ‘covariance’:

    \[C \gets W \codt C \,.\]

    Thus, the covariance matrix is projected onto the active subspace, and the lower Cholesky factor is calculated there, and then weights are applied.

  2. design_weighting_scheme is ‘precision’:

    \[C \gets \left( W \codt C^{-1} \right)^{\dagger} \,.\]

Gaussian Error Models for Fenics (Dolfin) Developments#

This module implements a Gaussian error model for support of fenics (dolfin) models.

class DolfinGaussianErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', Vh=None, mean=0.0, variance=1.0, random_seed=None)[source]#

Bases: ErrorModelConfigs

Configuration settings for DolfinGaussianErrorModel.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionlity in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • Vh (None | Any) – finite element space to which the distribution is assocciated

  • mean (float | ndarray) – mean of the error model

  • variance (float | ndarray | spmatrix) – variance/covariance of the error model

  • random_seed (int | None) – random seed used when the Gaussian model is initiated. This is useful for reproductivity.

Vh: None | Any#
mean: float | ndarray#
variance: float | ndarray | spmatrix#
random_seed: int | None#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', Vh=None, mean=0.0, variance=1.0, random_seed=None)#
class DolfinGaussianErrorModel(configs=None)[source]#

Bases: ErrorModel, RandomNumberGenerationMixin

This class provides an implementation of Gaussian Models suitable for Dolfin-based formulation of forward and inverse problems. The Gaussian distribution is on the form \(\mathcal{N}(\mathbf{x}, \mathbf{B})\), with \(\mathbf{x}\) being the mean, and \(\mathbf{B}\) being the variance/covariance matrix/operator \(\mathbf{B}\). The underlying finite element space is assumed to be \(\mathtt{R}\).

Parameters:

configs (DolfinGaussianErrorModelConfigs | dict | None) – the configurations object:

Raises:
  • ImportError: if dolfin or ufl are not available/installed

  • TypeError: if the passed mean or variance types are not supported

  • ValueError: if no valid FEM space (DOF) Vh is found in configs

__init__(configs=None)[source]#

Initialize the random number generator

update_mean(mean)[source]#

Update the mean of the distribution

update_covariance_matrix(cov)[source]#

Define the Variational forms of the covariance, precision, and factors of the covariance matrix, then associate the passed covariance matrix

update_design(design)[source]#

Update the design vector (which entries are active)

create_vector(return_np=False, ignore_design=False)[source]#

Create a vector compatible with the range/domain of the underlying covariance matrix

generate_white_noise(ignore_design=False, truncate=False, return_np=False)[source]#
Generate a standard normal random vector of size size with values truncated

at -/+3 if truncate is set to True

Parameters:
  • ignore_design (bool) – the size of the white noise vector is in the full (ignore the design) if True, otherwise the size matches the dimension of the active subspace.

  • truncate (bool) – if True, truncate the samples at -/+3, that is any sample point/entry above 3 is set to 3, and any value below -3 is set to -3.

  • return_np (bool) – if True, convert the result into a numpy array (from fenics vector)

Returns:

numpy array (even if the dimension is 1) containing white noise

add_noise(x, in_place=False)[source]#

Add random noise to the passed vector x

Parameters:
  • x – vector in a space of dimension equal to size of the underlying error model

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Raises:

TypeError if x is of the wrong shape or type

generate_noise(return_np=False, ignore_design=False)[source]#

Generate a random noise vector sampled from the underlying Gaussian distribution

sample(return_np=False, ignore_design=False)[source]#

Sample a random vector from the underlying Gaussian distribution

pdf(x, normalize=False, log=False)[source]#

Evaluate the value of the density function (normalized or upto a fixed scaling constant) at the passed state/vector.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

  • log (bool) – if True, evaluate the gradient of the logarithm of the PDF

Returns:

the value of the density function evaluated at the passed state/vector.

pdf_gradient(x, normalize=False, log=False)[source]#

Evaluate the gradient of the density function at the passed state/vector.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

  • log (bool) – if True, evaluate the gradient of the logarithm of the PDF

Returns:

the value of the density function evaluated at the passed state/vector.

log_density(x, normalize=False)[source]#

Evaluate the logarithm of the density function at the passed state x.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

Returns:

the logarithm of the density function evaluated at the passed state.

log_density_gradient(x, normalize=False)[source]#

Evaluate the gradient of the logarithm of the density function at the passed state x.

Parameters:
  • x – realization of the random variable

  • normalize (bool) – scale the PDF by the normalization factor

Returns:

the gradient of the logarithm of the density function evaluated at the passed state.

covariance_sqrt_matvec(x, lower=True, return_np=False, in_place=False, ignore_design=False)[source]#

Evaluate and return the product of Square root (Lower Cholesky) covariance matrix by a vector x. The product is determined by applying the cholesky decomposition of the covariance matrix once (dot product) to the passed state x

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x` could be scalar or number array, however, it must match the distribution size.

  • lower (bool) – use the lower Cholesky as the square root

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of the sqrt-covariance matrix by x

covariance_matvec(x, return_np=False, in_place=False, ignore_design=False)[source]#

Evaluate and return the product of covariance matrix by a vector x. The product is determined by applying the cholesky decomposition of the covariance matrix twice (dot product) to the passed state x Specifically, we assume the covariance matrix is on the form \(\mathbf{L} \mathbf{S} \mathbf{S}^T \mathbf{L}^T\), where \(\mathbf{L}\) is the design matrix (fat-short matrix), and \(\mathbf{S}\) is the lower Cholesky factor of the covariance matrix.

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of the covariance matrix by x

covariance_diagonal(ignore_design=False)[source]#

Return a copy the diagonal of the covariance matrix, i.e., the variances

Returns:

1D numpy array holding the diagonal of the underlying covariance matrix. If the size of the underlying distribution is 1, this returns a scalar (variance)

variance(ignore_design=False)[source]#

Same as covariance_diagonal()

covariance_inv_matvec(x, in_place=False, return_np=False, ignore_design=False)[source]#

Evaluate and return the product of inverse of the error covariance matrix by a vector x. A linear system is solved using the cholesky decomposition of the covariance matrix

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of inverse of the error covariance matrix by x

Raises:

ValueError if solving the linear system using LGMRES fails

covariance_sqrt_inv_matvec(x, lower=True, return_np=False, in_place=False, ignore_design=False)[source]#

Evaluate and return the product of inverse of the square root of the error covariance matrix by a vector x. A linear system is solved using the cholesky decomposition of the covariance matrix

Parameters:
  • x – vector in a space of dimension equal to the size of the underlying error model. x could be scalar or number array, however, it must match the distribution size.

  • lower (bool) – use the lower Cholesky as the square root

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance (based on the distribution size) is created and returned. This is ignored of course if x is a non-iterable, i.e., a scalar.

Returns:

the product of inverse of the square root of the error covariance matrix by x

Raises:

ValueError if solving the linear system using LGMRES fails

covariance_matrix(ignore_design=False)[source]#

Construct a numpy array representation (if the dimension is more than 1) of the underlying covariance matrix. (projected onto the observation space)

precision_matrix(ignore_design=False)[source]#

Construct a numpy array representation (if the dimension is more than 1) of the precision matrix (inverse of the underlying covariance matrix).

covariance_trace(ignore_design=False, method='exact', sample_size=50, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', accuracy=0.01)[source]#

Evaluate/Approximate the trace of the covariance matrix

Note

This is a wrapper around the utility function pyoed.utility.math.matrix_trace that is called with the method covariance_matvec() as the operator to compute the trace of.

Parameters:
  • ignore_design (bool) – the size of the white noise vector is in the full (ignore the design) if True, otherwise the size matches the dimension of the active subspace.

  • method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a Hutchinson style trace estimator, as described in pyoed.utility.trace_estimator(), is employed. All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the trace of the covariance matrix. of the underlying model.

Return type:

float

covariance_logdet(ignore_design=False, method='exact', sample_size=50, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', accuracy=0.01)[source]#

Evaluate/approximate the logarithm of the determinant of the covariance matrix of the passed error model

Note

This is a wrapper around the utility function pyoed.utility.math.matrix_logdet that is called with the method covariance_matvec() as the operator to compute the trace of.

Parameters:
  • ignore_design (bool) – the size of the white noise vector is in the full (ignore the design) if True, otherwise the size matches the dimension of the active subspace.

  • method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a chebyshev-approximation type-method is employed from pyoed.utility.logdet_estimator(). All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the logdet of the covariance matrix of the underlying model.

Return type:

float

kl_divergence(other)[source]#
property size#

Dimension of the underlying probability distribution

property design#

Get a copy of the design vector

property extended_design#

Return a vector indicating active/inactive entries (with proper replication for multiple prognostic variables) of the observation vector/range

property active#

flags corresponding to active/inactive dimensions

active_size()[source]#

Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design)

property mean#

Get a copy of the distribution mean

Helper Functions#

This module provides access to several flavors of Gaussian Error Models with various formulations of the design effect.

kl_divergence(N0, N1, method='exact', sample_size=50, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01)[source]#

Calculate the KL divergence between two Gaussian distributions as follows,

\[D_{\text{KL}}\left({\mathcal {N}}_{0}\parallel {\mathcal {N}}_{1}\right) ={\frac {1}{2}}\left(\operatorname {tr} \left(\Sigma_{1}^{-1}\Sigma_{0}\right) +\left(\mu_{1}-\mu_{0}\right)^{\mathsf{T}}\Sigma_{1}^{-1}\left(\mu_{1}-\mu_{0}\right) -k+\ln \left({\frac {\det \Sigma_{1}}{\det \Sigma_{0}}}\right)\right)\]
create_GaussianErrorModel(size, mean=0.0, variance=1.0, sparse=False, random_seed=None)[source]#
create_PrePostWeightedGaussianErrorModel(size, mean=0.0, variance=1.0, sparse=False, random_seed=None)[source]#
create_PointwiseWeightedGaussianErrorModel(size, mean=0.0, variance=1.0, sparse=False, random_seed=None)[source]#