Math Utility Functions#

This module provides access to functions to perform various linear algebra, calculus etc. operations.

validate_inverse_problem_derivatives(problem, eval_at, gradient=None, validate_gradient=True, validate_Hessian=True, data_misfit_only=False, method='fd', fd_eps=None, use_noise=True, create_plots=False, plots_format='pdf', random_seed=1011, output_dir=None)[source]#

Given a problem (filtering/smoothing instance), use finite differences to validate the gradient and/or the Hessian of the objective function associated with the passed problem; the problem must provide the following methods:

  1. objective_function_value() to calculate the objective function value and the gradient at a given state/parameter;

  2. objective_function_gradient() to calculate the first-order derivative (gradient) of the objective function at a given state/parameter eval_at;

  3. Hessian_matvec() to evaluate the product of the second-order derivative (Hessian) of the objective function at a given state/parameter eval_at;

Parameters:
  • problem – problem (filtering/smoothing)

  • eval_at – state/parameter to evaluate derivatives at

  • validate_gradient (bool) – apply validation for first order-derivative

  • validate_Hessian (bool) – apply validation for first order-derivative

  • data_misfit_only (bool) – ignore any prior term in the objective function if True

  • method (str) –

    the method used to validate derivatives. Supported methods are

    • ’fd’: standard (right-sided) finite-difference validation

    • ’fd-central’: double-sided finite-difference validation

  • fd_eps

    epsilon used for finite difference validation; automatically chosen if None

    • if use_noise is True, a range from 1e-12 to 1 is sampled

    • if use_noise is False, 1e-7 is used

  • use_noise (bool) – use white noise (multiplied by derivatives) to validate gradients and create log-log plots (if create_plots is True)

  • output_dir – path to folder/location to save plots (if create_plots ` is `True)

  • random_seed – random seed to feed random number generator for reproducibility

Returns:

relative errors w.r.t FD approximation; size of returned items vary based the used approach (use noise vs. not) - grad_err: relative error of the gradient comapred to finite differences - Hessian_err: relative error of Hessian matvec effect comapred to finite differences

Raises:
  • ValueError is raised if both validate_gradient and validate_Hessian are set to False

  • ValueError is raised if method is not supported/recognized

  • TypeError is raised if the type of fd_eps is not supported

inverse_problem_linearity_test(inverse_problem)[source]#

Check if the passed inverse problem is linear not by testing the forward model and the observation operator. The inverse problem is tested for linearity by testing linearity of:

  1. The observation operator: each observation operator must provide Jacobian_T_matvec() method which requires valid (not None) value of eval_at argument. If it is possible to evaluate this method without eval_at then the observation operator is linear, otherwise it throws a ValueError which indicates the operator is nonlinear.

  2. The simulation model: similar to the observation operator, but the model linearity is test through one of Jacobian_T_matvec() or solve_adjoint() based one which one is available.

If none of the methods above is available where expected, this method throws a TypeError

Returns:

a flag indicating whether the underlying inverse/OED problem is linear.

Raises:

TypeError if the methods to be tested are not available

validate_function_gradient(fun, state, gradient, fd_eps=1e-08, bounds=None, verbose=False, fd_central=False, fd_complex=False)[source]#

Validate the gradient of an objective function fun using finite differences. Complex finite differences are supported, and utilized if fd_complex is True

Parameters:
  • fun (Callable) – The objective function to validate its gradient

  • state – The state/parameter at which to validate the gradient

  • gradient – The gradient to validate

  • fd_eps (float) – The finite difference perturbation. Default is 1e-8.

  • bounds – either None, or a list of 2-sized iterables on the form [(lb, ub), (lb, ub), ….] defining lower bounds lb and upper bounds ub of each entry of hte state. The length of bounds must be equal to the size of state.

  • verbose (bool) – If True, print validation statistics.

  • fd_central (bool) – If True, use central finite differences. Default is False, and therefore standard (right-sided) finite differences are used.

  • fd_complex (bool) – If True, use complex finite differences. Default is False.

Warning

Complex finite differences are not yet implemented!

Raises:
  • TypeError – If fun is not callable.

  • AssertionError – If state and gradient are not of equal sizes.

  • AssertionError – If fd_eps is not a real number.

  • AssertionError – If fd_eps is zero.

  • NotImplementedError – If fd_complex is True, as complex finite differences are not implemented yet.

  • ValueError – If the finite difference throws an exception for some reason.

Returns:

A tuple containing a numpy array of the finite difference gradient of fun at state and its corresponding gradient error.

Return type:

tuple[np.ndarray, np.ndarray]

finite_differences_gradient(fun, state, fd_eps=1e-08, bounds=None, verbose=False, fd_central=False, fd_complex=False)[source]#

Evaluate the gradient of an objective function fun using finite differences. Complex finite differences are supported, and utilized if fd_complex is True

Parameters:
  • fun (Callable) – The objective function to validate its gradient

  • state – The state/parameter at which to validate the gradient

  • fd_eps (float) – The finite difference perturbation. Default is 1e-8.

  • bounds – either None, or a list of 2-sized iterables on the form [(lb, ub), (lb, ub), ….] defining lower bounds lb and upper bounds ub of each entry of hte state. The length of bounds must be equal to the size of state.

  • verbose (bool) – If True, print validation statistics.

  • fd_central (bool) – If True, use central finite differences. Default is False, and therefore standard (right-sided) finite differences are used.

  • fd_complex (bool) – If True, use complex finite differences. Default is False.

Warning

Complex finite differences are not yet implemented!

Raises:
  • TypeError – If fun is not callable.

  • AssertionError – If fd_eps is not a real number.

  • AssertionError – If fd_eps is zero.

  • NotImplementedError – If fd_complex is True, as complex finite differences are not implemented yet.

  • TypeError – If bounds are passed (not None) but have wrong shape/format

  • ValueError – If the finite difference throws an exception for some reason.

Returns:

A numpy array of the finite difference gradient of fun at state

Return type:

np.ndarray

asarray(A, shape=None, dtype=None)[source]#

Return a copy of the passed array-like object as a numpy array. This requires the input to by scalar, numpy array, scipy sparse matrix, or a callable/linear operator.

Parameters:
  • A (float | np.ndarray | sp.spmatrix | abc.Callable[np.ndarray]) – The array-like object to be converted to a numpy array.

  • shape – The shape of the array to be returned. This is only used if A is a callable/linear operator.

Raises:
  • TypeError – If A is not a scalar, numpy array, scipy sparse matrix, or a collable/linear operator.

  • ValueError – If A is a callable/linear operator, and shape is not given.

Return type:

np.ndarray

matrix_trace(A, size=None, method='exact', randomization_vectors=None, return_randomization_vectors=False, sample_size=100, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01)[source]#

Evaluate/approximate the trace of a given matrix or operator

Parameters:
  • A (ndarray | spmatrix | Callable[[ndarray], ndarray]) – Linear operator to compute trace of.

  • size (int | None) – size of the domain of A; required if A is callable.

  • method (str) – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a Hutchinson style trace estimator, as described in pyoed.utility.trace_estimator(), is employed. All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors (list[ndarray] | None) – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the trace of A. If return_randomization_vectors is True it also, additionally, returns the samples drawn during the randomized method.

Return type:

float | tuple[float, list[ndarray]]

matrix_logdet(A, size=None, method='exact', randomization_vectors=None, return_randomization_vectors=False, sample_size=100, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01, sigma_bounds=[None, None], n_deg=14)[source]#

Evaluate/approximate the log-det of a given matrix or operator

Parameters:
  • A – Linear operator to compute logdet of.

  • size – size of the domain of A; required if A is callable.

  • method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a chebyshev-approximation type-method is employed from pyoed.utility.logdet_estimator(). All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.

  • randomization_vectors (list[ndarray] | None) – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.

  • sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.

  • min_sample_size (int) – Minimum number of random samples.

  • max_sample_size (int) – Maximum number of random samples.

  • distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.

  • random_seed – The random seed to use for the distribution samples.

  • accuracy (float) – convergence criterion.

Returns:

exact/approximate value of the logdet of the covariance matrix of the underlying model. If return_randomization_vectors is True, instead returns tuple of the form (logdet, randomization_vectors)

Return type:

float | tuple[float, list[ndarray]]

matrix_diagonal(A, size=None)[source]#

Return a copy the diagonal of a given matrix, i.e., the variances for a covariance matrix.

Parameters:
  • A (np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – Matrix to compute diagonal of.

  • size (int) – shape of A; required if A is callable. Default is None.

Raises:

ValueError – if A is callable and size is not passed.

Returns:

np.ndarray holding the diagonal of the underlying matrix. If the size of the underlying distribution is 1, this returns a scalar (variance)

Return type:

np.ndarray | float

matrix_vecmult(A, x, in_place=False)[source]#

Evaluate the product of a matrix by a vector x.

Parameters:
  • A (np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – Operator to apply to x.

  • sp.spmatrix (float | np.ndarray x |) – vector to act on.

  • in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance is created and returned. Default is False.

Note

in_place is ignored if x is a non-iterable, i.e., a scalar.

Returns:

Ax

Return type:

np.ndarray

factorize_spsd_matrix(A, shape=None, lower=True)[source]#

Genereate Cholesky decomposition of the passed symmetric positive semidefinite matrix.

Parameters:
  • A (float | np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – matrix to factorize.

  • shape (tuple[int]) – shape of the matrix A. Necessary if type(A)==abc.Callable[np.ndarray]. Default is None.

  • lower (bool) – whether to return the lower triangular portion of the Cholesky factorization. Default is True.

Returns:

Square root of A if scalar, otherwise, lower triangular portion of the matrix Cholesky decomposition. Return type is same as A type unless a callable is passed. In this case, a scipy linear operator is returned.

Raises:

TypeError – if A is not of the specified types.

truncate(x, lb=0, ub=1)[source]#

Return a numpy array representation of the passed object x with values truncated based on the lower bound lb and the upper bound ub. All values in x below lb are replaced with lb and all values above ub are replaced with ub

Parameters:
  • x (float | np.ndarray) – object to truncate.

  • lb (float) – lower bound. Default is 0.

  • ub (float) – upper bound. Default is 1.

Returns:

np.ndarray representation of x with truncated values.

remainder(x, y, num_decimals=100)[source]#

Return the remainder of dividing x by y, all rounded to the passed number of decimals (if any)

Parameters:
  • x (float | np.ndarray) – dividend.

  • y (float | np.ndarray) – divisor.

  • num_decimals (int) – number of decimals to round to. Default is 100.

Warning

If x and y are arrays, they must be conformable. That is, they must either be of the same size or one of them must be of size 1. If x and y are arrays of different sizes, the function will raise a TypeError.

Return type:

np.ndarray

ncr(n, r)[source]#

Calculate combinations efficiently \(nCr = \frac{n}{ r! \times (n-r)! }\).

Parameters:
  • n (int) – non-negative integer

  • r (int) – non-negative integer

Returns:

\(nCr\)

Return type:

int

enumerate_binary(size, num_active=None)[source]#

Return a list of numpy arrays of all possible binary arrays of given size.

Parameters:
  • size (int) – the size of binary arrays to enumerate.

  • num_active (int | None) – number of active entries in the binary arrays. If None, all possible binary arrays of size size are returned.

sample_binary_states(size, num_samples, num_active=None, random_seed_or_rng=None)[source]#

Uniformly sample random binary states of given size and number of active entries.

Parameters:
  • size (int) – dimension of the binary state.

  • num_samples (int) – number of binary states to sample.

  • num_active (int | None) – number of active entries in the binary states. If None, then no restriction is placed on the number of active entries.

  • random_seed_or_rng (int | Generator | None) – random seed or random number generator to use for sampling. Default is None.

Returns:

a 2D numpy array of shape (num_samples, size) containing the sampled binary states.

Return type:

ndarray

index_to_binary_state(k, size, dtype=<class 'bool'>)[source]#

Return the binary state=:math:(v_1, v_2, dots) of dimension=size, with index k, where the index k is defined by the unique mapping:

\[k = 1 + \sum_{i=1}^{size} v_i 2^{i-1}\]
Parameters:
  • k (int) – index of the binary state.

  • size (int) – dimension of the binary state.

  • dtype – data type of the returned numpy array. Default is bool.

Return type:

np.ndarray

index_from_binary_state(state)[source]#

Reverse of “index_to_binary_state” Return the index k corresponding to the passed state (of dimension=size), where the index k is defined by the unique mapping:

\[k = 1 + \sum_{i=1}^{size} v_i 2^{i-1}\]
Parameters:

state (np.ndarray) – binary state of dimension=size.

Raises:

ValueError if the passed state contains inf or nan values

Note

Entries of the passed state are interpreted as True/False by conversion to bool

Return type:

int

threshold_relaxed_design(design, method='vanishing', num_active=1, vanishing_threshold=0.001, shuffle_extra_max=False, random_seed=None, candidates=None)[source]#

Given a non-binary state/design, find the winner entries based on the passed method, and convert the passed design/state to a binary state. This is useful to OED applications where a relaxed design is usually rounded-off to a binary design.

If num_active is 2, and method is ‘max’, the maximum two design weights are selected as winners return the number of winners, their indexes in the design, and their coordinates (if coordinates) are passed. The number of winners returned <= num_active based on the availability in the design.

For example, if no sufficient entries win from the design, e.g. very sparse, the maximum number of positive values is picked.

Parameters:
  • design – a one dimensional array-like iterable to be thresholded

  • method (str) –

    The following rules are followed based on the thresholding method:

    • ’vanishing’: any sensor with design value above (or equal) the vanishing_threshold is activated; num_active is ignored.

    • ’median’: any sensor above (or equal) the median value is activated; num_active and vanishing_threshold are ignored.

    • ’max’:

      • If shuffle_extra_max is True, they maximum num_active sensors are returned; if there are multiple entries with equal values, extra sensors are shuffled

      • If shuffle_extra_max is False, design entries are ranked (in decending order), and any entry equal to or exceeds the maximum num_active entry is activated.

  • num_active (int) – number of sensors to activate (if method is ‘max’)

  • vanishing_threshold (float) – weight point over which sensor is activated; used if method is ‘vanishing’

  • shuffle_extra_max (bool) – used only for method ‘max’; see above

  • random_seed (None|int) – used to initialize the random number generator to do shuffling if shuffle_extra_max is True; useful for reproducibility

  • candidates – 2d array with coordinates of the observation gridpoints with each,

Returns:

  • num: number of active sensors/entries

  • indexes: indexes of the design vector that are activated

  • coords: The coordinate entries from candidates that are activated by thresholding; None if candidates is None

group_by_num_active(results_dict, design_size)[source]#

Given a dictionary (results_dict), with keys representing binary index, group the values by the size of the design corresponding to each design key

Parameters:
  • results_dict (dict[int, float]) – dictionary of results

  • design_size (int) – size of the design

Raises:

TypeError – if results_dict is not a dictionary