Math Utility Functions#
This module provides access to functions to perform various linear algebra, calculus etc. operations.
- validate_inverse_problem_derivatives(problem, eval_at, gradient=None, validate_gradient=True, validate_Hessian=True, data_misfit_only=False, method='fd', fd_eps=None, use_noise=True, create_plots=False, plots_format='pdf', random_seed=1011, output_dir=None)[source]#
Given a problem (filtering/smoothing instance), use finite differences to validate the gradient and/or the Hessian of the objective function associated with the passed problem; the problem must provide the following methods:
objective_function_value()
to calculate the objective function value and the gradient at a given state/parameter;objective_function_gradient()
to calculate the first-order derivative (gradient) of the objective function at a given state/parameter eval_at;Hessian_matvec()
to evaluate the product of the second-order derivative (Hessian) of the objective function at a given state/parameter eval_at;
- Parameters:
problem – problem (filtering/smoothing)
eval_at – state/parameter to evaluate derivatives at
validate_gradient (bool) – apply validation for first order-derivative
validate_Hessian (bool) – apply validation for first order-derivative
data_misfit_only (bool) – ignore any prior term in the objective function if True
method (str) –
the method used to validate derivatives. Supported methods are
’fd’: standard (right-sided) finite-difference validation
’fd-central’: double-sided finite-difference validation
fd_eps –
epsilon used for finite difference validation; automatically chosen if None
if use_noise is True, a range from 1e-12 to 1 is sampled
if use_noise is False, 1e-7 is used
use_noise (bool) – use white noise (multiplied by derivatives) to validate gradients and create log-log plots (if create_plots is True)
output_dir – path to folder/location to save plots (if create_plots ` is `True)
random_seed – random seed to feed random number generator for reproducibility
- Returns:
relative errors w.r.t FD approximation; size of returned items vary based the used approach (use noise vs. not) - grad_err: relative error of the gradient comapred to finite differences - Hessian_err: relative error of Hessian matvec effect comapred to finite differences
- Raises:
ValueError is raised if both validate_gradient and validate_Hessian are set to False
ValueError is raised if method is not supported/recognized
TypeError is raised if the type of fd_eps is not supported
- inverse_problem_linearity_test(inverse_problem)[source]#
Check if the passed inverse problem is linear not by testing the forward model and the observation operator. The inverse problem is tested for linearity by testing linearity of:
The observation operator: each observation operator must provide
Jacobian_T_matvec()
method which requires valid (not None) value of eval_at argument. If it is possible to evaluate this method without eval_at then the observation operator is linear, otherwise it throws a ValueError which indicates the operator is nonlinear.The simulation model: similar to the observation operator, but the model linearity is test through one of
Jacobian_T_matvec()
orsolve_adjoint()
based one which one is available.
If none of the methods above is available where expected, this method throws a
TypeError
- Returns:
a flag indicating whether the underlying inverse/OED problem is linear.
- Raises:
TypeError
if the methods to be tested are not available
- validate_function_gradient(fun, state, gradient, fd_eps=1e-08, bounds=None, verbose=False, fd_central=False, fd_complex=False)[source]#
Validate the gradient of an objective function fun using finite differences. Complex finite differences are supported, and utilized if fd_complex is True
- Parameters:
fun (Callable) – The objective function to validate its gradient
state – The state/parameter at which to validate the gradient
gradient – The gradient to validate
fd_eps (float) – The finite difference perturbation. Default is 1e-8.
bounds – either None, or a list of 2-sized iterables on the form [(lb, ub), (lb, ub), ….] defining lower bounds lb and upper bounds ub of each entry of hte state. The length of bounds must be equal to the size of state.
verbose (bool) – If True, print validation statistics.
fd_central (bool) – If True, use central finite differences. Default is False, and therefore standard (right-sided) finite differences are used.
fd_complex (bool) – If True, use complex finite differences. Default is False.
Warning
Complex finite differences are not yet implemented!
- Raises:
TypeError – If fun is not callable.
AssertionError – If state and gradient are not of equal sizes.
AssertionError – If fd_eps is not a real number.
AssertionError – If fd_eps is zero.
NotImplementedError – If fd_complex is True, as complex finite differences are not implemented yet.
ValueError – If the finite difference throws an exception for some reason.
- Returns:
A tuple containing a numpy array of the finite difference gradient of fun at state and its corresponding gradient error.
- Return type:
tuple[np.ndarray, np.ndarray]
- finite_differences_gradient(fun, state, fd_eps=1e-08, bounds=None, verbose=False, fd_central=False, fd_complex=False)[source]#
Evaluate the gradient of an objective function fun using finite differences. Complex finite differences are supported, and utilized if fd_complex is True
- Parameters:
fun (Callable) – The objective function to validate its gradient
state – The state/parameter at which to validate the gradient
fd_eps (float) – The finite difference perturbation. Default is 1e-8.
bounds – either None, or a list of 2-sized iterables on the form [(lb, ub), (lb, ub), ….] defining lower bounds lb and upper bounds ub of each entry of hte state. The length of bounds must be equal to the size of state.
verbose (bool) – If True, print validation statistics.
fd_central (bool) – If True, use central finite differences. Default is False, and therefore standard (right-sided) finite differences are used.
fd_complex (bool) – If True, use complex finite differences. Default is False.
Warning
Complex finite differences are not yet implemented!
- Raises:
TypeError – If fun is not callable.
AssertionError – If fd_eps is not a real number.
AssertionError – If fd_eps is zero.
NotImplementedError – If fd_complex is True, as complex finite differences are not implemented yet.
TypeError – If bounds are passed (not None) but have wrong shape/format
ValueError – If the finite difference throws an exception for some reason.
- Returns:
A numpy array of the finite difference gradient of fun at state
- Return type:
np.ndarray
- asarray(A, shape=None, dtype=None)[source]#
Return a copy of the passed array-like object as a numpy array. This requires the input to by scalar, numpy array, scipy sparse matrix, or a callable/linear operator.
- Parameters:
A (float | np.ndarray | sp.spmatrix | abc.Callable[np.ndarray]) – The array-like object to be converted to a numpy array.
shape – The shape of the array to be returned. This is only used if A is a callable/linear operator.
- Raises:
TypeError – If A is not a scalar, numpy array, scipy sparse matrix, or a collable/linear operator.
ValueError – If A is a callable/linear operator, and shape is not given.
- Return type:
np.ndarray
- matrix_trace(A, size=None, method='exact', randomization_vectors=None, return_randomization_vectors=False, sample_size=100, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01)[source]#
Evaluate/approximate the trace of a given matrix or operator
- Parameters:
A (ndarray | spmatrix | Callable[[ndarray], ndarray]) – Linear operator to compute trace of.
size (int | None) – size of the domain of A; required if A is callable.
method (str) – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a Hutchinson style trace estimator, as described in
pyoed.utility.trace_estimator()
, is employed. All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.randomization_vectors (list[ndarray] | None) – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.
sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.
min_sample_size (int) – Minimum number of random samples.
max_sample_size (int) – Maximum number of random samples.
distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.
random_seed – The random seed to use for the distribution samples.
accuracy (float) – convergence criterion.
- Returns:
exact/approximate value of the trace of A. If return_randomization_vectors is True it also, additionally, returns the samples drawn during the randomized method.
- Return type:
float | tuple[float, list[ndarray]]
- matrix_logdet(A, size=None, method='exact', randomization_vectors=None, return_randomization_vectors=False, sample_size=100, optimize_sample_size=False, min_sample_size=10, max_sample_size=100, distribution='Rademacher', random_seed=None, accuracy=0.01, sigma_bounds=[None, None], n_deg=14)[source]#
Evaluate/approximate the log-det of a given matrix or operator
- Parameters:
A – Linear operator to compute logdet of.
size – size of the domain of A; required if A is callable.
method – Method of evaluation. If “exact”, we simply sum the diagonal of A. If “randomized”, a chebyshev-approximation type-method is employed from
pyoed.utility.logdet_estimator()
. All of the futher keyword arguments are in the case of method=”randomized” and are otherwise ignored if method=”exact”.randomization_vectors (list[ndarray] | None) – The randomization vectors drawn to perform the computation. If not passed, they will be sampled.
sample_size (int) – The number of samples to use. :param optimize_sample_size: if True, sample_size is replaced with an optimized version, where the optimal sample size minimizes the variance of the estimator and is between min_sample_size, and max_sample_size.
min_sample_size (int) – Minimum number of random samples.
max_sample_size (int) – Maximum number of random samples.
distribution (str) – The probability distribution used for random sampling. Both ‘Rademacher’ and ‘Gaussian’ are supported.
random_seed – The random seed to use for the distribution samples.
accuracy (float) – convergence criterion.
- Returns:
exact/approximate value of the logdet of the covariance matrix of the underlying model. If return_randomization_vectors is True, instead returns tuple of the form (logdet, randomization_vectors)
- Return type:
float | tuple[float, list[ndarray]]
- matrix_diagonal(A, size=None)[source]#
Return a copy the diagonal of a given matrix, i.e., the variances for a covariance matrix.
- Parameters:
A (np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – Matrix to compute diagonal of.
size (int) – shape of A; required if A is callable. Default is None.
- Raises:
ValueError – if A is callable and size is not passed.
- Returns:
np.ndarray holding the diagonal of the underlying matrix. If the size of the underlying distribution is 1, this returns a scalar (variance)
- Return type:
np.ndarray | float
- matrix_vecmult(A, x, in_place=False)[source]#
Evaluate the product of a matrix by a vector x.
- Parameters:
A (np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – Operator to apply to x.
sp.spmatrix (float | np.ndarray x |) – vector to act on.
in_place (bool) – overwrite the passed vector x (if True). If set to False, a new instance is created and returned. Default is False.
Note
in_place is ignored if x is a non-iterable, i.e., a scalar.
- Returns:
Ax
- Return type:
np.ndarray
- factorize_spsd_matrix(A, shape=None, lower=True)[source]#
Genereate Cholesky decomposition of the passed symmetric positive semidefinite matrix.
- Parameters:
A (float | np.ndarray | sp.spmatrix | abc.Callable[[np.ndarray],np.ndarray]) – matrix to factorize.
shape (tuple[int]) – shape of the matrix A. Necessary if type(A)==abc.Callable[np.ndarray]. Default is None.
lower (bool) – whether to return the lower triangular portion of the Cholesky factorization. Default is True.
- Returns:
Square root of A if scalar, otherwise, lower triangular portion of the matrix Cholesky decomposition. Return type is same as A type unless a callable is passed. In this case, a scipy linear operator is returned.
- Raises:
TypeError – if A is not of the specified types.
- truncate(x, lb=0, ub=1)[source]#
Return a numpy array representation of the passed object x with values truncated based on the lower bound lb and the upper bound ub. All values in x below lb are replaced with lb and all values above ub are replaced with ub
- Parameters:
x (float | np.ndarray) – object to truncate.
lb (float) – lower bound. Default is 0.
ub (float) – upper bound. Default is 1.
- Returns:
np.ndarray representation of x with truncated values.
- remainder(x, y, num_decimals=100)[source]#
Return the remainder of dividing x by y, all rounded to the passed number of decimals (if any)
- Parameters:
x (float | np.ndarray) – dividend.
y (float | np.ndarray) – divisor.
num_decimals (int) – number of decimals to round to. Default is 100.
Warning
If x and y are arrays, they must be conformable. That is, they must either be of the same size or one of them must be of size 1. If x and y are arrays of different sizes, the function will raise a TypeError.
- Return type:
np.ndarray
- ncr(n, r)[source]#
Calculate combinations efficiently \(nCr = \frac{n}{ r! \times (n-r)! }\).
- Parameters:
n (int) – non-negative integer
r (int) – non-negative integer
- Returns:
\(nCr\)
- Return type:
int
- enumerate_binary(size, num_active=None)[source]#
Return a list of numpy arrays of all possible binary arrays of given size.
- Parameters:
size (int) – the size of binary arrays to enumerate.
num_active (int | None) – number of active entries in the binary arrays. If None, all possible binary arrays of size size are returned.
- sample_binary_states(size, num_samples, num_active=None, random_seed_or_rng=None)[source]#
Uniformly sample random binary states of given size and number of active entries.
- Parameters:
size (int) – dimension of the binary state.
num_samples (int) – number of binary states to sample.
num_active (int | None) – number of active entries in the binary states. If None, then no restriction is placed on the number of active entries.
random_seed_or_rng (int | Generator | None) – random seed or random number generator to use for sampling. Default is None.
- Returns:
a 2D numpy array of shape (num_samples, size) containing the sampled binary states.
- Return type:
ndarray
- index_to_binary_state(k, size, dtype=<class 'bool'>)[source]#
Return the binary state=:math:(v_1, v_2, dots) of dimension=size, with index k, where the index k is defined by the unique mapping:
\[k = 1 + \sum_{i=1}^{size} v_i 2^{i-1}\]- Parameters:
k (int) – index of the binary state.
size (int) – dimension of the binary state.
dtype – data type of the returned numpy array. Default is bool.
- Return type:
np.ndarray
- index_from_binary_state(state)[source]#
Reverse of “index_to_binary_state” Return the index k corresponding to the passed state (of dimension=size), where the index k is defined by the unique mapping:
\[k = 1 + \sum_{i=1}^{size} v_i 2^{i-1}\]- Parameters:
state (np.ndarray) – binary state of dimension=size.
- Raises:
ValueError
if the passed state contains inf or nan values
Note
Entries of the passed state are interpreted as True/False by conversion to bool
- Return type:
int
- threshold_relaxed_design(design, method='vanishing', num_active=1, vanishing_threshold=0.001, shuffle_extra_max=False, random_seed=None, candidates=None)[source]#
Given a non-binary state/design, find the winner entries based on the passed method, and convert the passed design/state to a binary state. This is useful to OED applications where a relaxed design is usually rounded-off to a binary design.
If num_active is 2, and method is ‘max’, the maximum two design weights are selected as winners return the number of winners, their indexes in the design, and their coordinates (if coordinates) are passed. The number of winners returned <= num_active based on the availability in the design.
For example, if no sufficient entries win from the design, e.g. very sparse, the maximum number of positive values is picked.
- Parameters:
design – a one dimensional array-like iterable to be thresholded
method (str) –
The following rules are followed based on the thresholding method:
’vanishing’: any sensor with design value above (or equal) the vanishing_threshold is activated; num_active is ignored.
’median’: any sensor above (or equal) the median value is activated; num_active and vanishing_threshold are ignored.
’max’:
If shuffle_extra_max is True, they maximum num_active sensors are returned; if there are multiple entries with equal values, extra sensors are shuffled
If shuffle_extra_max is False, design entries are ranked (in decending order), and any entry equal to or exceeds the maximum num_active entry is activated.
num_active (int) – number of sensors to activate (if method is ‘max’)
vanishing_threshold (float) – weight point over which sensor is activated; used if method is ‘vanishing’
shuffle_extra_max (bool) – used only for method ‘max’; see above
random_seed (None|int) – used to initialize the random number generator to do shuffling if shuffle_extra_max is True; useful for reproducibility
candidates – 2d array with coordinates of the observation gridpoints with each,
- Returns:
num: number of active sensors/entries
indexes: indexes of the design vector that are activated
coords: The coordinate entries from candidates that are activated by thresholding; None if candidates is None
- group_by_num_active(results_dict, design_size)[source]#
Given a dictionary (results_dict), with keys representing binary index, group the values by the size of the design corresponding to each design key
- Parameters:
results_dict (dict[int, float]) – dictionary of results
design_size (int) – size of the design
- Raises:
TypeError – if results_dict is not a dictionary