Three Dimensional Variation (3DVar) DA#

This module implements several variants of the three-dimensional variational (3D-Var) data assimilation scheme. 3D-Var refers to simulation/prediction in three-dimensional spatial coordinates. The name 3D-Var is popular in the atmospheric weather prediction (NWP) literature. This algorithm is equivalent to standard regularized least-squares inversion. Single observation (for both time independent or time dependent models) is used to update the model parameter

class VanillaThreeDVarConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='3D-Var: Three Dimensional Variational Data Assimilation', model=None, prior=None, observation_error_model=None, observation_operator=None, observations=None, optimizer=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizer'>, optimizer_configs=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizerConfigs'>, invert_for='parameter')[source]#

Bases: VariationalFilterConfigs

Configurations class for the VanillaThreeDVar abstract base class.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • name (str | None) – name of the DA (inverse problem solver) approach/method.

  • model (None | SimulationModel) – the simulation model.

  • prior (None | ErrorModel) – Background/Prior model (e.g., GaussianErrorModel)

  • observation_operator (None | ObservationOperator) – operator to map model state to observation space

  • observation_error_model (None | ErrorModel) – Observation error model (e.g., GaussianErrorModel)

  • observations (None | Any) – Observational data (the data type is very much dependent of the DA method)

  • optimizer (None | Optimizer) –

    the optimization routine (optimizer) to be registered and later used for solving the OED problem. This can be one of the following:

    • None: In this case, no optimizer is registered, and the solve() won’t be functional until an optimization routine is registered.

    • An optimizer instance (object that inherits :py:class`Optimizer`). In this case, the optimizer is registered as is and is updated with the passed configurations if available.

    • The class (subclass of Optimizer) to be used to instantiate the optimizer.

  • optimizer_configs (None | OptimizerConfigs) –

    the configurations of the optimization routine. This can be one of the following:

    • None, in this case configurations are discarded, and whatever default configurations of the selected/passed optimizer are employed.

    • A dict holding full/partial configurations of the selected optimizer. These are either used to instantiate or update the optimizer configurations based on the type of the passed optimizer.

    • A class providing implementations of the configurations (this must be a subclass of OptimizerConfigs.

    • An instance of a subclass of OptimizerConfigs which is to set/udpate optimizer configurations.

    Note

    Not all DA (inverse problem) objects are optimization-based. For example, particle-based (EnKF, PF, etc.) employ a sample to estimate the flow of the distribution through the model dynamics (prior -> posterior). Thus, the optimizer (and configs) in this case (the default) are set to None. For optimization-based methods such as 3DVar, 4DVar, etc., an optimizer must be registered for the inverse problem to be solved.

invert_for: str#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='3D-Var: Three Dimensional Variational Data Assimilation', model=None, prior=None, observation_error_model=None, observation_operator=None, observations=None, optimizer=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizer'>, optimizer_configs=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizerConfigs'>, invert_for='parameter')#
class VanillaThreeDVar(configs=None)[source]#

Bases: VariationalFilter

A class implementing the vanilla 3D-Var DA scheme to invert for model state or parameter, assuming the model equation is on the form \(\mathbf{x}=\mathcal{M} \mathbf{p}\), and \(\mathbf{p},\mathbf{x}\) are the model parameter and state respectively. Standard 3D-Var inversion retrieves model state from observations for time-independent models. This class allows inversion for model state or model parameter respectively This can be decided by upon inistantiation using the configuration key invert_for.

Similar to pyoed.assimilation.smoothing.fourDVar.VanillaFourDVar, this version is very elementary and is developed as guidelines to create more advanced versions, e.g., parameter-retrieval, or extended (parameter-state) retrieval for inifinite dimensional formulations; see remrks below for assumptions about the dynamical model.

This scheme assumes Gaussian observational noise. The model unknown parameter is regularized (to avoid overfitting to noise) by assuming a prior (usually a Gaussian), or by specifying a regularization matrix and following an \(\ell_2\) regularization approach.

You can either pass the 3D-Var elements upon initialization, or later using the proper register methods. However, you can’t use solve() before registering all proper elements (simulation model, prior, observations/data, observation operator, and linear and optimization solvers)

Parameters:

configs (dict) –

a dictionary containing configurations of the 3D-Var scheme. You can see default configurations by calling:

`VanillaThreeDVar.get_default_configurations()`

Default Configurations include:

  • model: simulation model. See the remarks for assumptions made about the dynamical/simulation model instance.

  • invert_for: the objective of solving the inverse problem; this could be ‘state’ or ‘parameter’. In the former case, the dynamical model is not needed, while in the latter, a model must be provided with a method Jacobian_T_matvec() provided by the model instance.

  • prior: Background/Prior model (e.g., GaussianErrorModel)

  • observation_operator: operator to map model state to observation space

  • observation_error_model: Observation error model (e.g., GaussianErrorModel)

  • optimization_routine: string describing the package and the algorithm to use for optimizatio

    Note

    Call the staticmethod VanillaThreeDVar.show_supported_optimizers() for a list of optimizers implemented

  • output: a dictionary controlling screen/file output configurations (control model verbosity):

    • output_dir: Path to folder (will be created if doesn’t exist) to which reports/output, etc., are written.

Remarks:
  • The 3D-Var scheme assumes the dynamical model and the observational operator

    are both specified. In other formulations, these two operators are augmented in a single operator \(\mathcal{F}\) which is called the forward operator (aka parameter-to-observable map). The dunamical model maps the moel parameter to the state by forward solution

  • In this version of 3D-Var, we assume the model state is on the Discretized form \(y = M(x)\), where \(M\) is the discretized forward model that projects the model parameter \(x\) onto the state space, that is \(M\) is a parameter-to-state map. A requirement here is that the model provides an implementation of a method Jacobian_T_matvec which multiplies the tangent linear model (TLM) i.e., the derivative of \(M\) with respect to the parameter , by a given parameter vector.

__init__(configs=None)[source]#
register_optimizer(optimizer=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizer'>, optimizer_configs=None)[source]#

Register (and return) an optimization routine, and make sure the objective function in the optimization routine is set to the objctive function of this objective objective_function_value(). The objective function of the optimizer (and its derivatives) are set to the objective function and derivatives implemented here unless the objective function is passed explicitly in the configurations. This method sets a reference to the optimizer and update the underlying configurations accordingly.

Note

If the optimizer passed is an instance of (derived from) Optimizer, and valid configurations optimizer_configs is passed, the optimizer is updated with the passed configurations by calling optimizer.update_configurations(optimizer_configs).

Warning

If the optimizer passed is an instance of (derived from) Optimizer, the responsibility is on the developer/user to validate the contents of the optimizer.

Parameters:
  • optimizer (str | Optimizer | Type[Optimizer]) –

    the optimization routine (optimizer) to register. This can be one of the following:

    • An optimizer instance (object that inherits :py:class`Optimizer`). In this case, the optimizer is registered as is and is updated with the passed configurations if available.

    • The class (subclass of Optimizer) to be used to instantiate the optimizer. To see available built-in optimization routines, see/call show_supported_optimizers().

    • The name of the optimizer (str). This has to match the name of one of the available optimization routine. To see available built-in optimization routines, see/call show_supported_optimizers().

  • optimizer_configs (None | dict | OptimizerConfigs | Type[OptimizerConfigs]) –

    the configurations of the optimization routine. This can be one of the following:

    • None, in this case configurations are discarded, and whatever default configurations of the selected/passed optimizer are employed.

    • A dict holding full/partial configurations of the selected optimizer. These are either used to instantiate or update the optimizer configurations based on the type of the passed optimizer.

    • A class providing implementations of the configurations (this must be a subclass of OptimizerConfigs.

    • An instance of a subclass of OptimizerConfigs which is to set/udpate optimizer configurations.

Returns:

the registered optimizer.

Raises:

TypeError – if the type of passed optimizer and/or configurations are/is not supported

Return type:

Optimizer

validate_configurations(configs, raise_for_invalid=True)[source]#

Validate passed configurations.

Parameters:

configs (dict | VanillaThreeDVarConfigs) – configurations to validate. If a VanillaThreeDVarConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as nessesary/possible/valid

Raises:

TypeError – if any of the passed keys in kwargs is invalid/unrecognized

Remarks:
  • Generally, we don’t want actual implementations in abstract classes, however, this one is provided as good guidance. Derived classes can rewrite it and/or provide additional updates.

register_model(model=None)[source]#

Register (and return) the simulation model to be registered. This calls InverseProblem.register_model and adds extra assertions/functionality specific for filters.

Raises:
  • TypeError – if the type of passed model is not supported

  • TypeError – if the model does not provide Jacobian_T_matvec method

register_prior(prior=None, initiate_posterior=True)[source]#

Update prior/regularization information. Since, changing the prior implies a different posterior, the argument initiate_posterior is added with defaul value True to make sure the posterio is overwritten/reset

Raises:

TypeError is raised if:

  • the passed prior type is not supported

  • a valid simulation model has not been registered yet

  • the prior is not compatible with the registered simulation model

Note

The prior here is asserted on the model state or model parmeter based on the invert_for flag selected in the configurations dictionary

register_observation_operator(observation_operator)[source]#

The observation operator is a function/map that takes as input a model state, and returns an observation instance.

Parameters:

observation_operator (None | ObservationOperator) – the obsevation operator

Raises:
  • TypeError – if observation_operator is not derived from ObservationOperator

  • TypeError – if the passed operator doesn’t provide a function to apply the tangent linear of the observation operator ‘Jacobian_T_matvec’

  • TypeError – if the model state does not match the observation operator domain size

Note

We are assuming the observation operator is time-independent, however, it provides a method apply and/or __call__ which converts a model state into an equivalent observation

objective_function_value(init_guess, data_misfit_only=False)[source]#

Evaluate the value of the 3D-Var objective function (data misfit + prior/regularization) given the passed init_guess as an estimate of the true model parameter/state

Parameters:
  • init_guess – an estimate of the true model parameter/state

  • data_misfit_only (bool) – discard the prior/regularization term if True. This is added for flexibility

Returns:

value of the 3D-Var objective function.

Raises:
  • TypeError – if init_guess is not a valid model parameter/state bsed on the inverstion objective selected upon instantiation

  • TypeError – if any of the necessary components (prior, observation, error models, and observation operator) is/are not registered

objective_function_gradient(init_guess, data_misfit_only=False)[source]#

Evaluate the gradient of the 3D-Var objective function given the passed init_guess as an estimate of the true model parameter/state

Parameters:
  • init_guess – an estimate of the model initial parameter/state

  • data_misfit_only (bool) – discard the prior/regularization term if True. This is added for flexibility

Returns:

the gradient of the 3D-Var cost function

apply_forward_operator(init_guess, scale_by_noise=True)[source]#

Apply \(F\), the forward operator to the passed parameter/state x. The forward operator \(F\) here, refers to the simulation model followed by the observation operator. The result is a data point (an observation)

Parameters:
  • init_guess – data structure holding the model parameter (for parameter inversion) or state (for state inversion). In the former, a forward model simulation is applied followed by the observation operator. In the latter, only observation operator is applied

  • scale_by_noise (bool) – if True, the observations are scaled by the observation error precision (inverse of the covariance matix)

Returns:

the observations result from forward solving the forward model equations, and applying observation operator (with/without) scaling by observation error precisions

apply_forward_operator_adjoint(obs, eval_at=None, scale_by_noise=True)[source]#

Apply F^*, the adjoint of the forward operator, to the passed observation. Here, F is a composition of the simulation model and the observation operator (e.g., simulation followed by restriction).

Parameters:
  • obs – the vector in observation space to apply the adjoint to

  • scale_by_noise (bool) – if True, the observations are scaled by the observation error precision

  • eval_at – parameter/state to linearize the forward operator around,

Returns:

the adjoint

full_Hessian(eval_at=None, data_misfit_only=False)[source]#

Construct the Hessian by evaluating the tangent linear model This method works by wrapping the forward and the adjoint propagators with the covariance matrices.

Hessian_matvec(vec, eval_at=None, data_misfit_only=False)[source]#

Return the product of the Hessian of 3D-Var objective with a vector ‘vec’

Parameters:
  • vec – a vector (state/parameter) to multiply the 3D-Var objective Hessian by

  • eval_at – parameter/state at which the Hessian is evaluated (passed to the model adjoint and/or observtion operator to linearize around)

  • data_misfit_only (bool) – if True the prior term of the Hessian is discarded, and only the data-misfit term is evaluated

Returns:

product of the inverse of the posterior covariance (of the linearized problem) by a vector

posterior_covariance(map_point=None, return_array=True)[source]#

Create a linear operator or a numpy array representing the posterior covariance (Gaussian/Laplace/approximation around the MAP point estimate). If the map_point is not passed, the registered posterior map point is used (if the inverse problem has been already solved), otherwise, the prior mean is used. We advise to explicitly pass a valid map_point. Note that if the problem is linear, the map_point has no effect.

Parameters:
  • map_point – solution of the 4D-Var inverse problem;

  • return_array (bool) – if True construct the posterior covariance matrix (as a numpy dense array), otherwise, return a scipy LinearOperator with the posterior covariance vector product as the underlying matvec function.

Returns:

posterior covariance matrix/operator

Hessian_inv_matvec(vec, eval_at=None, data_misfit_only=False, precond=None, solver='scipy-minres', solver_options={'callback': None, 'maxiter': None})[source]#

Return the product of the Hessian inverse of 4D-Var objective with a vector ‘state/parameter’

Parameters:
  • vec – a vector (state/parameter) to multiply the 3D-Var objective Hessian inverse by

  • eval_at – parameter/state at which the Hessian is evaluated (passed to the model adjoint and/or observtion operator to linearize around)

  • data_misfit_only (bool) – if True the prior term of the Hessian is discarded, and only the data-misfit term is evaluated

  • precond – if passed, it is aexpected to be a function on the form precond(x, p) that accepts a state x and a parameter p

Returns:

product of the posterior covariance (of the linearized problem) with a state vector

Remarks:

this function is very important and resembles the most epensive part of linear/linearized Bayesian algorithms. Preconditioning, and/or reduced-order approxmiations can be very important for performance here.

Note

This method works by solving a linear system with Hessian_matvec. One could just compute the Hessian (in full) and invert it if the dimension of the problem (state/parameter) is really small.

solve(init_guess=None, skip_map_estimate=False, update_posterior=False)[source]#

Solve the inverse problem, i.e., find the model state/parameter given the registered observation and prior information

Parameters:
  • init_guess – initial guess of the model state/parameter

  • skip_map_estimate (bool) – use the prior mean as a map estimte (ignore observations) and upate posterior. This useful for OED when the model is linear

  • update_posterior (bool) – if True, the posterior mean and covariance operator are updated (given/around the solution of the inverse problem)

Returns:

the analysis state/parameter (MAP estimate of the posterior)

Raises:
  • TypeError if any of the 3D-Var elements (model, observation operator, prior, solver, or data) is missing

  • ValueError if the underlying optimization routine is not supported yet

  • ValueError if the ‘invert_for’ is not recognized; only ‘state’ or ‘parmeter’ are accepted.

  • AttributeError is raised if the passed model doesn not provide a method ‘Jacobian_T_matvec’ to apply the tangent linear model (TLM/Jacobian) transpose of the right-hand-side of the discretized PDE to a model parameter.

show_registered_elements(display=True)[source]#

Compose and (optionally) print out a message containing the elements of the inverse problem and show what’s registered and what’s not

Parameters:

display – if True print out the composed message about registered/unregistered elements

Returns:

the composed message

Return type:

None

check_registered_elements()[source]#

Check if all inversion elements are registered or not

property invert_for#

The objective of solving the inverse problem (this can be decided ONLY upon initistantiation)

property invert_for_parameter#

Flag that is True if the inversion is carried out for model parameter

property supported_optimizers#
class DolfinThreeDVarConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='3D-Var: Three Dimensional Variational Data Assimilation', model=None, prior=None, observation_error_model=None, observation_operator=None, observations=None, optimizer=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizer'>, optimizer_configs=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizerConfigs'>, invert_for='parameter')[source]#

Bases: VanillaThreeDVarConfigs

Configurations class for the VanillaThreeDVar abstract base class.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • name (str | None) – name of the DA (inverse problem solver) approach/method.

  • model (None | SimulationModel) – the simulation model.

  • prior (None | ErrorModel) – Background/Prior model (e.g., GaussianErrorModel)

  • observation_operator (None | ObservationOperator) – operator to map model state to observation space

  • observation_error_model (None | ErrorModel) – Observation error model (e.g., GaussianErrorModel)

  • observations (None | Any) – Observational data (the data type is very much dependent of the DA method)

  • optimizer (None | Optimizer) –

    the optimization routine (optimizer) to be registered and later used for solving the OED problem. This can be one of the following:

    • None: In this case, no optimizer is registered, and the solve() won’t be functional until an optimization routine is registered.

    • An optimizer instance (object that inherits :py:class`Optimizer`). In this case, the optimizer is registered as is and is updated with the passed configurations if available.

    • The class (subclass of Optimizer) to be used to instantiate the optimizer.

  • optimizer_configs (None | OptimizerConfigs) –

    the configurations of the optimization routine. This can be one of the following:

    • None, in this case configurations are discarded, and whatever default configurations of the selected/passed optimizer are employed.

    • A dict holding full/partial configurations of the selected optimizer. These are either used to instantiate or update the optimizer configurations based on the type of the passed optimizer.

    • A class providing implementations of the configurations (this must be a subclass of OptimizerConfigs.

    • An instance of a subclass of OptimizerConfigs which is to set/udpate optimizer configurations.

    Note

    Not all DA (inverse problem) objects are optimization-based. For example, particle-based (EnKF, PF, etc.) employ a sample to estimate the flow of the distribution through the model dynamics (prior -> posterior). Thus, the optimizer (and configs) in this case (the default) are set to None. For optimization-based methods such as 3DVar, 4DVar, etc., an optimizer must be registered for the inverse problem to be solved.

__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='3D-Var: Three Dimensional Variational Data Assimilation', model=None, prior=None, observation_error_model=None, observation_operator=None, observations=None, optimizer=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizer'>, optimizer_configs=<class 'pyoed.optimization.scipy_optimization.ScipyOptimizerConfigs'>, invert_for='parameter')#
class DolfinThreeDVar(configs=None)[source]#

Bases: VanillaThreeDVar

Class for solving 3D-Var inverse problems with simulation/forward model equations on the form \(F(u, m) = 0\), where \(u\) is the model state, and \(m\) is the model parameter, respectively.

This formulation is adopted by many simulation packages and backends, e.g., PETSc, however, this implementation was originally developed for models implemented in Fenics/Dolfin, hence the name. We may consider renaming with the package being expanded.

Similar to VanillaThreeDVar, this class allows inversion for model state or model parameter respectively. This can be decided by upon inistantiation using the configuration key invert_for. This class inherits most functionality from VanillaThreeDVar. The main difference is the evaluation of the gradient of the 3D-Var cost functional, and the application of the forward operator adjoint.

This scheme assumes Gaussian observational noise. The model unknown parameter is regularized (to avoid overfitting to noise) by assuming a prior (usually a Gaussian), or by specifying a regularization matrix and following an \(\ell_2-\) regularization approach.

You can either pass the 3D-Var elements upon initialization, or later using the proper register methods. However, you can’t use solve() before registering all proper elements (simulation model, prior, observations/data, observation operator, and linear and optimization solvers)

Parameters:

configs (dict) –

a dictionary containing configurations of the 3D-Var scheme. You can see default configurations by calling:

`DolfinThreeDVar.get_default_configurations()`

Default Configurations include:

  • model: simulation model. See the remarks for assumptions made about the dynamical/simulation model instance.

  • invert_for: the objective of solving the inverse problem; this could be ‘state’ or ‘parameter’. In the former case, the dynamical model is utilized only to validate states/parameter, however, in the latter the dynamical/simulation model is used to access a method to calculate forward/adjoint sensitivities.

  • prior: Background/Prior model (e.g., GaussianErrorModel)

  • observation_operator: operator to map model state to observation space

  • observation_error_model: Observation error model (e.g., GaussianErrorModel)

  • optimization_routine: string describing the package and the algorithm to use for optimization

    Note

    Call the staticmethod VanillaThreeDVar.show_supported_optimizers() for a list of optimizers implemented

  • output: a dictionary controlling screen/file output configurations (control model verbosity)

    • output_dir: Path to folder (will be created if doesn’t exist) to which reports/output, etc., are written.

Remarks:
  • The 3D-Var scheme assumes the dynamical model and the observational operator

    are both specified. In other formulations, these two operators are augmented in a single operator \(\mathcal{F}\) which is called the forward operator (aka parameter-to-observable map). The dunamical model maps the moel parameter to the state by forward solution

register_model(model=None)[source]#

Register (and return) the simulation model to be registered. This calls InverseProblem.register_model and adds extra assertions/functionality specific for filters.

Raises:
  • TypeError – if the type of passed model is not supported

  • TypeError – if the model does not provide solve_adjoint method

objective_function_gradient(init_guess, data_misfit_only=False)[source]#

Evaluate the gradient of the 3D-Var objective function given the passed init_guess as an estimate of the true model parameter/state

Parameters:
  • init_guess – an estimate of the model initial parameter/state

  • data_misfit_only (bool) – discard the prior/regularization term if True. This is added for flexibility

Returns:

the gradient of the 3D-Var cost function

apply_forward_operator_adjoint(obs, eval_at=None, scale_by_noise=True)[source]#

Apply F^*, the adjoint of the forward operator, to the passed observation. Here, F is a composition of the simulation model and the observation operator (e.g., simulation followed by restriction).

Parameters:
  • obs – the vector in observation space to apply the adjoint to

  • scale_by_noise (bool) – if True, the observations are scaled by the observation error precision

  • eval_at – parameter/state to linearize the forward operator around,

Returns:

the adjoint

Hessian_matvec(vec, eval_at=None, data_misfit_only=False)[source]#

Return the product of the Hessian of 3D-Var objective with a vector ‘vec’

Parameters:
  • vec – a vector (state/parameter) to multiply the 3D-Var objective Hessian by

  • eval_at – parameter/state at which the Hessian is evaluated (passed to the model adjoint and/or observtion operator to linearize around)

  • data_misfit_only (bool) – if True the prior term of the Hessian is discarded, and only the data-misfit term is evaluated

Returns:

product of the inverse of the posterior covariance (of the linearized problem) by a vector

Notes:

the implementation here is incorrect if the model is nonlinear; we need TLM and proper reformulation as done in VanillaThreeDVar