Core (Base Classes) of Models#

Abstract base classes for simulation models.

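This module defines the abstract interfaces that all simulation models in PyOED must implement. The hierarchy is: SimulationModel is the common abstract base class, and TimeIndependentModel and TimeDependentModel both derive from it.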

Each model must provide methods for creating state/parameter vectors, solving the forward problem, and optionally solving the adjoint or providing Jacobian-transpose products for variational methods.

class SimulationModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)[source]#

Bases: PyOEDConfigs

Configurations class for the SimulationModel abstract base class. This class inherits functionality from PyOEDConfigs and only adds new class-level variables which can be updated as needed.

See PyOEDConfigs for details on the inherited functionality. In addition to the inherited fields, SimulationModelConfigs provides the following fields.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • model_name (str | None) – name of the model. Default is None.

  • screen_output_iter (int) – iteration interval for screen output. Default is 1. This should be a positive integer to take effect.

  • file_output_iter (int) – iteration interval for file output. Default is 1. This should be a positive integer to take effect.

model_name: str | None#
screen_output_iter: int#
file_output_iter: int#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)#
class SimulationModel(configs=None)[source]#

Bases: PyOEDObject

Abstract base class (following Python’s abc convention) for implementations of, or wrappers around, simulation models, both time-dependent and time-independent.

Classes inheriting from this base class MUST implement all essential tasks (marked with abstractmethod decorators), which in turn should be provided by the underlying dynamical model.

Note

Each class derived from SimulationModel should have its own __init__ method in which the constructor calls super().__init__(configs=configs) and then adds any additional initialization as needed. The validation self.validate_configurations() is carried out at initialization time by the base class SimulationModel. See, for example, the __init__ method of TimeIndependentModel.
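
The constructor pattern described in this note can be sketched as follows. `SketchBase` and `SketchModel` are hypothetical stand-ins written only for illustration; the real base class does considerably more than store a dictionary.

```python
class SketchBase:
    """Hypothetical stand-in for the base class: stores and validates configs."""

    def __init__(self, configs=None):
        self.configurations = dict(configs or {})
        # Validation runs once here, in the base constructor.
        self.validate_configurations(self.configurations)

    def validate_configurations(self, configs, raise_for_invalid=True):
        return True


class SketchModel(SketchBase):
    def __init__(self, configs=None):
        # First delegate to the base class, which also triggers validation ...
        super().__init__(configs=configs)
        # ... then perform model-specific initialization.
        self._grid = [0.0, 0.5, 1.0]


model = SketchModel({"model_name": "toy"})
```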

Note

The structure is similar to DATeS base class for simulation models.

Parameters:

configs (dict | SimulationModelConfigs | None) – (optional) configurations for the model

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Each simulation model SHOULD implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), nothing else is needed. Alternatively, one can validate only the configurations of the immediate class and call super to validate the configurations associated with the parent class.

If one does not wish to do any validation (we strongly advise against that), simply define a method with this signature in the model class.

Note

The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.

Parameters:

configs (dict | SimulationModelConfigs) – configurations to validate. If a SimulationModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
  • PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.

  • AttributeError – if one or more of the passed configurations do not exist in the model configurations SimulationModelConfigs.
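
A minimal sketch of the layered validation described above, where a subclass checks only its own keys and defers the rest to its parent. All class names here are hypothetical, `ConfigsValidationError` merely stands in for PyOEDConfigsValidationError, and returning a boolean is an assumption made for the example.

```python
class ConfigsValidationError(Exception):
    """Hypothetical stand-in for PyOEDConfigsValidationError."""


class ParentSketch:
    def validate_configurations(self, configs, raise_for_invalid=True):
        # Parent-level check: only inspect keys that are present.
        name = configs.get("model_name")
        valid = name is None or isinstance(name, str)
        if not valid and raise_for_invalid:
            raise ConfigsValidationError("model_name must be a string or None")
        return valid


class ChildSketch(ParentSketch):
    def validate_configurations(self, configs, raise_for_invalid=True):
        # Child-level check on the key this class owns (cheap, no heavy work).
        size = configs.get("size", 1)
        valid = isinstance(size, int) and size > 0
        if not valid and raise_for_invalid:
            raise ConfigsValidationError("size must be a positive integer")
        # Defer the remaining keys to the parent class.
        return valid and super().validate_configurations(configs, raise_for_invalid)


model = ChildSketch()
ok = model.validate_configurations({"size": 3, "model_name": "toy"})
bad = model.validate_configurations({"size": -1}, raise_for_invalid=False)
```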

get_model_grid()[source]#
property model_grid#

Retrieve a copy of the model grid (enumeration of all model grid coordinates as they are ranked in a model state vector) as an array

property state_size#

Return the model state size. This default implementation should be overridden by a more efficient one that does not require building a full state vector.

property parameter_size#

Return the model parameter size. This default implementation should be overridden by a more efficient one that does not require building a full parameter vector.
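
One way to follow this advice is to store the dimensions once and return them directly. The sketch below assumes a hypothetical model, `SketchGridModel`, in which plain Python lists stand in for numpy arrays.

```python
class SketchGridModel:
    """Hypothetical model on a 1-D grid with `nx` points."""

    def __init__(self, nx, num_parameters):
        self._nx = nx
        self._num_parameters = num_parameters

    def state_vector(self, init_val=0.0):
        return [init_val] * self._nx

    def parameter_vector(self, init_val=0.0):
        return [init_val] * self._num_parameters

    @property
    def state_size(self):
        # Cheap override: read the stored dimension instead of allocating a vector.
        return self._nx

    @property
    def parameter_size(self):
        return self._num_parameters


model = SketchGridModel(nx=100, num_parameters=3)
```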

class TimeIndependentModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)[source]#

Bases: SimulationModelConfigs

Configuration dataclass for the TimeIndependentModel abstract base class. This class mirrors SimulationModelConfigs and does not add additional features/attributes.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • model_name (str | None) – name of the model. Default is None.

  • screen_output_iter (int) – iteration interval for screen output. Default is 1. This should be a positive integer to take effect.

  • file_output_iter (int) – iteration interval for file output. Default is 1. This should be a positive integer to take effect.

model_name: str | None#
screen_output_iter: int#
file_output_iter: int#
class TimeIndependentModel(configs)[source]#

Bases: SimulationModel

Base class for time-independent models (such as tomography, ptychography, etc.)

The implementation in classes inheriting this base class MUST carry out all essential tasks (marked with abstractmethod decorators) which in turn should be provided by the dynamical model.

Parameters:

configs (dict | SimulationModelConfigs | None) – (optional) configurations for the model

__init__(configs)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Each simulation model SHOULD implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), nothing else is needed. Alternatively, one can validate only the configurations of the immediate class and call super to validate the configurations associated with the parent class.

If one does not wish to do any validation (we strongly advise against that), simply define a method with this signature in the model class.

Note

The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.

Parameters:

configs (dict | TimeIndependentModelConfigs) – configurations to validate. If a TimeIndependentModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
  • PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.

  • AttributeError – if one or more of the passed configurations do not exist in the model configurations TimeIndependentModelConfigs.

abstractmethod state_vector(init_val=0, **kwargs)[source]#

Create an instance of model state vector.

Parameters:

init_val (float) – (optional) value assigned to entries of the state vector upon initialization.

Returns:

a numpy array representing the model state vector.

Return type:

numpy.ndarray

abstractmethod parameter_vector(init_val=0, **kwargs)[source]#

Create an instance of model parameter vector.

Parameters:

init_val (float) – (optional) value assigned to entries of the parameter vector upon initialization.

Returns:

a numpy array representing the model parameter vector.

Return type:

numpy.ndarray

abstractmethod is_state_vector(state, **kwargs)[source]#

Test whether the passed state vector is valid or not.

Parameters:

state – candidate state vector to validate.

Returns:

True if state is a valid state vector, False otherwise.

Return type:

bool

abstractmethod is_parameter_vector(parameter, **kwargs)[source]#

Test whether the passed parameter vector is valid or not.

Parameters:

parameter – candidate parameter vector to validate.

Returns:

True if parameter is a valid parameter vector, False otherwise.

Return type:

bool

abstractmethod solve_forward(state, verbose=False)[source]#

Apply (solve the forward model) to the given state, and return the result.

Parameters:
  • state – state to which the forward model is applied.

  • verbose (bool) – flag to control screen-verbosity.

Returns:

result (usually an observation vector) resulting from applying the forward model to state.

Return type:

numpy.ndarray

apply(*args, **kwargs)[source]#

Calls self.solve_forward().

solve_adjoint(adjoint)[source]#

Solve the adjoint problem (solve the model backward) for the given adjoint vector, and return the result.

Parameters:

adjoint – adjoint vector to which the adjoint model is applied.

Returns:

the result of applying the adjoint model to adjoint.

Return type:

numpy.ndarray

Jacobian_T_matvec(state, eval_at)[source]#

Evaluate and return the product of the transposed Jacobian of the model’s right-hand side (the tangent linear model, TLM) with a model state.

Parameters:
  • state – state to multiply the Jacobian by.

  • eval_at – state around which the Jacobian is evaluated.

Returns:

the Jacobian-transpose-vector product.

Return type:

numpy.ndarray
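
To make the relation between solve_forward, solve_adjoint, and Jacobian_T_matvec concrete, here is a minimal sketch for a linear model y = A x. `SketchLinearModel` is hypothetical and does not inherit from the PyOED base class; plain lists stand in for numpy arrays. The final lines run the standard dot-product (adjoint) test, which checks that the forward and adjoint maps are consistent.

```python
class SketchLinearModel:
    """Hypothetical linear model y = A x; lists stand in for numpy arrays."""

    def __init__(self, matrix):
        self._A = [list(row) for row in matrix]

    def state_vector(self, init_val=0.0):
        return [float(init_val)] * len(self._A[0])

    def is_state_vector(self, state):
        return len(state) == len(self._A[0])

    def solve_forward(self, state, verbose=False):
        if not self.is_state_vector(state):
            raise TypeError("invalid state vector")
        return [sum(a * x for a, x in zip(row, state)) for row in self._A]

    def solve_adjoint(self, adjoint):
        # For a linear model the adjoint is multiplication by A^T.
        return [sum(row[j] * y for row, y in zip(self._A, adjoint))
                for j in range(len(self._A[0]))]

    def Jacobian_T_matvec(self, state, eval_at=None):
        # The Jacobian of a linear map is A itself, independent of `eval_at`.
        return self.solve_adjoint(state)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


model = SketchLinearModel([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
x, y = [1.0, -1.0, 2.0], [0.5, 4.0]
lhs = dot(model.solve_forward(x), y)   # <A x, y>
rhs = dot(x, model.solve_adjoint(y))   # <x, A^T y>
```

If the two inner products disagree beyond rounding error, the adjoint implementation does not match the forward model.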

class TimeDependentModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1, time_integration=<factory>, num_prognostic_variables=None, space_discretization=<factory>)[source]#

Bases: SimulationModelConfigs

Configuration dataclass for the TimeDependentModel base class.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • model_name (str | None) – name of the model. Default is None.

  • screen_output_iter (int) – iteration interval for screen output. Default is 1. This should be a positive integer to take effect.

  • file_output_iter (int) – iteration interval for file output. Default is 1. This should be a positive integer to take effect.

  • time_integration (dict | None) –

    dictionary holding time integration configurations:

    • ’scheme’: string specifying the time integration scheme (e.g., ‘RK4’, ‘RK45’, ‘BDF’, etc.). Default is None.

    • ’stepsize’: float specifying the time integration stepsize. Default is None.

    • ’adaptive’: bool specifying whether the time integration is adaptive. Default is False.

  • num_prognostic_variables (int | None) – number of prognostic variables in the model. Default is None. Must be a positive integer if not None.

  • space_discretization (dict | None) –

    dictionary holding space discretization configurations. Contains:

    • ’scheme’: string specifying the space discretization scheme (e.g., ‘FD’, ‘FE’, ‘BE’, etc.). Default is None.

time_integration: dict | None#
num_prognostic_variables: int | None#
space_discretization: dict | None#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1, time_integration=<factory>, num_prognostic_variables=None, space_discretization=<factory>)#
class TimeDependentModel(configs=None)[source]#

Bases: SimulationModel

Abstract base class for time-dependent (dynamical) simulation models.

Models inheriting this class represent dynamical systems that evolve in time. They must implement methods for creating state vectors, testing state validity, and integrating the state forward in time over a given time span.

Concrete implementations include Lorenz-63/96, advection-diffusion, and Bateman-Burgers models.

The implementation in classes inheriting this base class MUST carry out all essential tasks (marked with abstractmethod decorators) which in turn should be provided by the dynamical model.

The structure is similar to DATeS base class for simulation models.

Parameters:

configs (dict | TimeDependentModelConfigs | None) – (optional) configurations for the model.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | TimeDependentModelConfigs) – configurations to validate. If a TimeDependentModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
  • PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.

  • AttributeError – if one or more of the passed configurations do not exist in the model configurations TimeDependentModelConfigs.

abstractmethod state_vector(t=None, init_val=0, **kwargs)[source]#

Create an instance of model state vector

Parameters:

t – (optional) time assigned to the state vector (for time dependent models); None otherwise

abstractmethod is_state_vector(state, **kwargs)[source]#

Test whether the passed state vector is valid or not

abstractmethod integrate_state(state, tspan, checkpoints, *argv, **kwargs)[source]#

Simulate/integrate the model starting from the initial state over the passed checkpoints.

If state is assigned a time t, it is replaced with the first entry of checkpoints.

Parameters:
  • state – data structure holding the initial model state

  • tspan – (t0, tf) iterable with two entries specifying the time integration window

  • checkpoints – times at which to store the computed solution, must be sorted and lie within tspan. If None (default), use points selected by the solver [t0, t1, …, tf].

Returns:

a timespan and a trajectory (checkpointed solution):

  • the timespan is an iterable (e.g., a list) holding the timepoints at which the state is propagated;

  • the trajectory is an iterable (e.g., a list) holding the model trajectory, with entries corresponding to the simulated model state at the entries of checkpoints, starting from checkpoints[0] and ending at checkpoints[-1].
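
The contract above (integrate over a window, store the solution at sorted checkpoints, return the times and the trajectory) can be sketched with a forward-Euler loop. This is a standalone function, not a PyOED method; the `rhs` and `stepsize` keyword arguments are hypothetical additions needed to make the example self-contained, and falling back to `[t0, tf]` when no checkpoints are given is a simplification.

```python
def integrate_state(state, tspan, checkpoints=None, *, rhs, stepsize=1e-3):
    """Forward-Euler sketch; `rhs` and `stepsize` are hypothetical extras."""
    t0, tf = tspan
    if checkpoints is None:
        checkpoints = [t0, tf]
    if list(checkpoints) != sorted(checkpoints):
        raise ValueError("checkpoints must be sorted")
    if checkpoints[0] < t0 or checkpoints[-1] > tf:
        raise ValueError("checkpoints must lie within tspan")

    t, x = t0, list(state)
    times, trajectory = [], []
    for target in checkpoints:
        # Step until the next checkpoint, shortening the last step to land on it.
        while target - t > 1e-12:
            h = min(stepsize, target - t)
            x = [xi + h * fi for xi, fi in zip(x, rhs(t, x))]
            t += h
        times.append(target)
        trajectory.append(list(x))
    return times, trajectory


# Exponential decay dx/dt = -x, whose exact solution is exp(-t).
times, traj = integrate_state([1.0], (0.0, 1.0), [0.0, 0.5, 1.0],
                              rhs=lambda t, x: [-xi for xi in x])
```

Forward Euler is used here only because it is short; a real model would typically use the scheme named in its time_integration configuration.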

Jacobian_T_matvec(state, eval_at_t, eval_at)[source]#

Evaluate and return the product of the transposed Jacobian of the model’s right-hand side (the tangent linear model, TLM) with a model state.

Parameters:
  • state – state to multiply the Jacobian by

  • eval_at_t – time at which the Jacobian is evaluated

  • eval_at – state around which the Jacobian is evaluated
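
As an illustration, the sketch below evaluates the Jacobian-transpose product for the Lorenz-63 right-hand side, written as standalone functions rather than as a PyOED model. The parameter values are the classical ones and are an assumption of this example. The last lines apply a dot-product test, comparing w·(J u), with J u approximated by central finite differences, against (Jᵀ w)·u.

```python
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classical Lorenz-63 parameters


def rhs(t, x):
    """Right-hand side of the Lorenz-63 system."""
    return [SIGMA * (x[1] - x[0]),
            x[0] * (RHO - x[2]) - x[1],
            x[0] * x[1] - BETA * x[2]]


def jacobian(x):
    return [[-SIGMA, SIGMA, 0.0],
            [RHO - x[2], -1.0, -x[0]],
            [x[1], x[0], -BETA]]


def Jacobian_T_matvec(state, eval_at_t, eval_at):
    """Return J(eval_at)^T @ state without forming the transpose explicitly."""
    J = jacobian(eval_at)
    return [sum(J[i][j] * state[i] for i in range(3)) for j in range(3)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x0, u, w = [1.0, 2.0, 3.0], [0.3, -0.2, 0.1], [1.0, 0.5, -2.0]
eps = 1e-6
# Central finite-difference approximation of J u around x0.
Ju = [(fp - fm) / (2 * eps)
      for fp, fm in zip(rhs(0.0, [a + eps * b for a, b in zip(x0, u)]),
                        rhs(0.0, [a - eps * b for a, b in zip(x0, u)]))]
lhs, rhs_value = dot(w, Ju), dot(Jacobian_T_matvec(w, 0.0, x0), u)
```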

class ErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=True)[source]#

Bases: PyOEDConfigs

Base configuration class for error models

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • design (None | bool | Sequence[bool] | ndarray) –

    an experimental design to define active/inactive entries of the random variable (mean, variance/covariance matrix).

    Note

    • If the design is None, it is set to all ones; that is, everything is observed (default).

    • If the design is a binary vector (or has an int dtype with 0/1 entries), the mean, the covariance, and all random vectors are projected onto the space identified by the 1/True entries.

design: None | bool | Sequence[bool] | ndarray#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=True)#
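
The projection described in the design note can be sketched in a few lines. The `project` helper below is hypothetical and uses nested lists in place of numpy arrays; it keeps the entries of the mean, and the matching rows and columns of the covariance, where the design is 1/True.

```python
def project(mean, covariance, design):
    """Keep only entries (and rows/columns) where the design is 1/True."""
    if design is None:
        design = [True] * len(mean)  # None means everything is active
    active = [i for i, flag in enumerate(design) if flag]
    projected_mean = [mean[i] for i in active]
    projected_cov = [[covariance[i][j] for j in active] for i in active]
    return projected_mean, projected_cov


mean = [1.0, 2.0, 3.0]
cov = [[1.0, 0.1, 0.0],
       [0.1, 2.0, 0.2],
       [0.0, 0.2, 3.0]]
m, c = project(mean, cov, design=[1, 0, 1])
```
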
class ErrorModel(configs=None)[source]#

Bases: PyOEDObject

Abstract base class for error/noise models.

An error model represents the observation noise distribution used in data assimilation and optimal experimental design. It provides methods for sampling noise, evaluating the probability density function (PDF) and its gradient, and computing log-densities.

Subclasses must implement generate_noise() and sample(). Optionally, subclasses can implement pdf(), pdf_gradient(), log_density(), and log_density_gradient() for use in variational methods.

The error model supports an experimental design vector that activates/deactivates individual observation components, and may support relaxed (non-binary) designs depending on the implementation.

Parameters:

configs – configuration object or dictionary. See ErrorModelConfigs.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Each simulation model MUST implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), nothing else is needed. Alternatively, one can validate only the configurations of the immediate class and call super to validate the configurations associated with the parent class.

If one does not wish to do any validation (we strongly advise against that), simply define a method with this signature in the model class.

Note

The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.

Parameters:

configs (dict | PyOEDConfigs) – configurations to validate. If a PyOEDConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
  • PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.

  • AttributeError – if any (or a group) of the configurations does not exist in the model configurations PyOEDConfigs.

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

Raises:

TypeError – if any of the passed keys in kwargs is invalid/unrecognized

update_design(design)[source]#

Update the experimental design.

Parameters:

design – the design used to update the model’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

pdf(*args, **kwargs)[source]#

Evaluate the value of the density function (normalized or up to a fixed scaling constant) at the passed state/vector.

Parameters:
  • args – positional arguments (implementation-specific, typically a state vector).

  • kwargs – keyword arguments (implementation-specific).

Returns:

the PDF value at the given point.

Return type:

float

pdf_gradient(*args, **kwargs)[source]#

Evaluate the gradient of the density function at the passed state/vector.

Parameters:
  • args – positional arguments (implementation-specific, typically a state vector).

  • kwargs – keyword arguments (implementation-specific).

Returns:

the gradient of the PDF at the given point.

Return type:

numpy.ndarray

log_density(*args, **kwargs)[source]#

Evaluate the logarithm of the density function at the passed state x.

Parameters:
  • args – positional arguments (implementation-specific, typically a state vector).

  • kwargs – keyword arguments (implementation-specific).

Returns:

the log-density value at the given point.

Return type:

float

log_density_gradient(*args, **kwargs)[source]#

Evaluate the gradient of the logarithm of the density function at the passed state.

Parameters:
  • args – positional arguments (implementation-specific, typically a state vector).

  • kwargs – keyword arguments (implementation-specific).

Returns:

the gradient of the log-density at the given point.

Return type:

numpy.ndarray
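
For a concrete picture of how these four methods relate, the sketch below writes them out for a Gaussian with diagonal covariance. This is an illustrative choice, not the interface of any specific PyOED error model; the function signatures (explicit `mean` and `variances` arguments) are assumptions.

```python
import math


def log_density(x, mean, variances):
    """Log of a normalized Gaussian density with diagonal covariance."""
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    log_norm = sum(math.log(2.0 * math.pi * vi) for vi in variances)
    return -0.5 * (quad + log_norm)


def log_density_gradient(x, mean, variances):
    # d/dx log p(x) = -C^{-1} (x - mean) for a Gaussian.
    return [-(xi - mi) / vi for xi, mi, vi in zip(x, mean, variances)]


def pdf(x, mean, variances):
    return math.exp(log_density(x, mean, variances))


def pdf_gradient(x, mean, variances):
    # Chain rule: grad p = p * grad log p.
    p = pdf(x, mean, variances)
    return [p * g for g in log_density_gradient(x, mean, variances)]


x, mean, variances = [0.5, -1.0], [0.0, 0.0], [1.0, 4.0]
grad = log_density_gradient(x, mean, variances)
```

Working with the log-density is usually preferable in variational methods, since the density itself can underflow in high dimensions.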

copy()[source]#

Return a shallow copy of this error model by creating a new instance of this object with the registered configurations. NO DEEP COPY of the objects in the configurations dictionary is made.

Note

One can create a deepcopy method, but this will generally be problematic since the objects in the configurations dictionary might not be serializable, and errors would have to be dealt with as well. This is handled to some extent by the parameters of pyoed.configs.PyOEDConfigs.asdict(), which is used here.

abstractmethod generate_noise()[source]#

Generate a random noise vector sampled from the underlying distribution.

The noise vector is typically zero-mean (pure noise without a drift term). Concrete implementations use their own random number generator and must return a 1-D numpy.ndarray.

The length of the returned vector depends on the projection setting (see project_onto_active_design_space).

Returns:

a randomly sampled noise vector.

Return type:

numpy.ndarray

abstractmethod sample()[source]#

Sample a random vector from the full distribution (mean + noise).

Unlike generate_noise(), which returns additive noise centered at zero, this method samples a complete random vector from the underlying distribution, i.e. it includes any non-zero mean.

The length of the returned vector depends on the projection setting (see project_onto_active_design_space).

Returns:

a randomly sampled vector from the distribution.

Return type:

numpy.ndarray
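
The difference between generate_noise() and sample() can be sketched with a hypothetical Gaussian error model that owns its random number generator. `SketchGaussianError` is not a PyOED class; it uses the standard-library `random` module and lists in place of numpy arrays, and it ignores the design entirely.

```python
import random


class SketchGaussianError:
    """Hypothetical error model with independent Gaussian components."""

    def __init__(self, mean, stdevs, random_seed=None):
        self._mean = list(mean)
        self._stdevs = list(stdevs)
        self._rng = random.Random(random_seed)  # model-owned generator

    def generate_noise(self):
        # Zero-mean noise only.
        return [self._rng.gauss(0.0, s) for s in self._stdevs]

    def sample(self):
        # Full draw from the distribution: mean + noise.
        return [m + e for m, e in zip(self._mean, self.generate_noise())]


model_a = SketchGaussianError([10.0, 20.0], [0.1, 0.2], random_seed=123)
model_b = SketchGaussianError([10.0, 20.0], [0.1, 0.2], random_seed=123)
noise = model_a.generate_noise()
draw = model_b.sample()
```

With identical seeds, `draw` equals the mean plus `noise`, which makes the relationship between the two methods easy to verify.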

property supports_relaxation#

Flag to check whether the model supports relaxation or not

property project_onto_active_design_space#

Whether all operations assume a projection onto the active subspace. When they do, any zero entry of the design vector is omitted.

property design#

Get a COPY of the design vector

abstract property size#

Dimension of the underlying probability distribution

property active#

Flags (boolean 1d array-like) corresponding to active/inactive dimensions according to the experimental design. This vector is of exactly the same size as the observation vector (ignoring the experimental design).

Warning

Because the return type is bool, this vector can be used to slice the vector. To avoid any confusion, however, we decided to add a clearer property, active_indexes, which returns the indexes of the variable/vector corresponding to non-zero design values. Thus, one should always use active_indexes when extracting the active entries of the variable (according to the design) is required.

property active_indexes#

An array of the indexes of the observation vector entries that are active (associated with a non-zero design) according to the experimental design.

abstract property active_size#

Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design)
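
The three design-related quantities above differ only in how they describe the same information. A minimal sketch, using hypothetical standalone functions and lists rather than the actual properties:

```python
def active(design):
    """Boolean flags, one per entry of the full (unprojected) vector."""
    return [bool(d) for d in design]


def active_indexes(design):
    """Integer positions of the non-zero design entries."""
    return [i for i, d in enumerate(design) if d]


def active_size(design):
    return len(active_indexes(design))


design = [1, 0, 0, 1, 1]
observation = [0.1, 0.2, 0.3, 0.4, 0.5]
# Extract the active entries by index, as the warning above recommends.
kept = [observation[i] for i in active_indexes(design)]
```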

property random_seed#

Registered random seed (if available in configurations)

design_manager(design)[source]#

Create and return a context manager that updates the design to the passed value, executes the needed code, and then resets the design to its original value.

Assuming obj is this object and val is the design value under which one wants to run the method obj.run_code(), the following code can be used:

with obj.design_manager(val) as mngr:
    mngr.run_code()

Returns:

a reference to self that enables calling any function under self and then automatically, the design is reset.
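
One plausible way to obtain this behavior is a generator-based context manager from the standard library. The sketch below is an assumption about the general technique, not a description of how PyOED implements design_manager; `SketchDesignHolder` is hypothetical.

```python
from contextlib import contextmanager


class SketchDesignHolder:
    """Hypothetical object with a `design` attribute and a temporary override."""

    def __init__(self, design):
        self.design = list(design)

    @contextmanager
    def design_manager(self, design):
        original = self.design
        self.design = list(design)
        try:
            yield self  # the caller works with the temporarily updated object
        finally:
            self.design = original  # restored even if the body raises

    def active_size(self):
        return sum(1 for d in self.design if d)


obj = SketchDesignHolder([1, 1, 1])
with obj.design_manager([1, 0, 0]) as mngr:
    inside = mngr.active_size()
outside = obj.active_size()
```

The `try`/`finally` pair is what guarantees the reset, so the original design survives exceptions raised inside the `with` block.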

property is_time_dependent#

By default, error models are not time dependent. If a model is time dependent, it must return True; otherwise no time will be passed to its methods.

Warning

This property probably should be replaced with a dynamic property where a test is carried out to assure whether a model is time-dependent or not.

class TimeDependentErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_error_model=None, base_error_model_configs=None)[source]#

Bases: ErrorModelConfigs

Configuration settings for TimeDependentErrorModel.

Note

This implementation only allows time-dependent experimental design, but does not allow time-dependent moments (e.g., mean/variance) since the model is agnostic to those at this point. We can consider setters/getters __setattr__ __getattr__ but this will complicate things at the beginning. This will evolve and decisions will be made!

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • design (dict | None) – a time-dependent experimental design to define active/inactive entries as a function of time. This is a dictionary whose keys are checkpoints (times) and whose values are the experimental designs passed to the underlying base_error_model for evaluation at those times.

  • base_error_model (ErrorModel | Type[ErrorModel] | None) –

    the error model used for all time points:

    • An error model instance (an object that inherits from ErrorModel). In this case, the error model is registered as is and is updated with the passed configurations if available.

    • The class (subclass of ErrorModel) to be used to instantiate the error model.

  • base_error_model_configs (ErrorModelConfigs | Type[ErrorModelConfigs] | dict | None) – the configurations of the base error model. If not passed, default configurations are used.

base_error_model: ErrorModel | Type[ErrorModel] | None#
base_error_model_configs: ErrorModelConfigs | Type[ErrorModelConfigs] | dict | None#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_error_model=None, base_error_model_configs=None)#
class TimeDependentErrorModel(configs=None)[source]#

Bases: ErrorModel

This class provides a general implementation for time-dependent error models.

Note

By time dependence we initially mean that the design changes over time. This can be extended to the case where the moments of the distribution, or even the error models themselves, vary over time. In that case, self.configurations.design can be replaced with something like self._DATA, a dictionary indexed by time indexes with values set to the corresponding error model. This, however, will be somewhat more expensive and will be considered in due course.

Note

The implementation here is agnostic to the specific choice of the underlying error models, and only allows its configurations (specifically the design) to vary over time.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Validate passed configurations.

Parameters:

configs (dict | TimeDependentErrorModelConfigs) – configurations to validate. If a TimeDependentErrorModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as necessary/possible/valid

Raises:

PyOEDConfigsValidationError – if any of the passed keys in kwargs is invalid/unrecognized

update_design(design)[source]#

Update the experimental design in the configurations object. The design must be None, or a dictionary whose keys are valid times (non-negative floats) and whose values are valid experimental designs that can be associated with the underlying base error model. If the base error model is registered, the designs in the passed dictionary are validated against it.

Parameters:

design – the design used to update the error model’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

update_base_error_model(base_error_model, base_error_model_configs=None)[source]#

Update the existing base error model with optional configurations. The model is tested against the configured design.

Parameters:
  • base_error_model (ErrorModel | Type[ErrorModel]) –

    the error model used for all time points:

    • An error model instance (an object that inherits from ErrorModel). In this case, the error model is registered as is and is updated with the passed configurations if available.

    • The class (subclass of ErrorModel) to be used to instantiate the error model.

  • base_error_model_configs (ErrorModelConfigs | Type[ErrorModelConfigs] | None) – the configurations of the base error model. If not passed, default configurations are used.

register_time(t, design, overwrite=False)[source]#

Register a design at a time instance t.

Parameters:
  • t – the time at which the design is registered.

  • design – the design to register at the time t.

  • overwrite – if the time t exists, overwrite the corresponding/current design. If False and the time exists, a ValueError exception is raised

Raises:

ValueError – if the time t is registered and overwrite is False.

check_registered_time(t)[source]#

Check if the passed time is registered or not. If available, return the registered time, otherwise raise ValueError.

Parameters:

t – the time to check if registered or not.

Raises:

ValueError – if the time t is not registered.
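
The registration semantics of register_time and check_registered_time can be sketched with a small time-keyed dictionary. `SketchTimeRegistry` is hypothetical; rounding the keys to a fixed number of digits (the role played by the time_precision property) and the default of 8 digits are assumptions made for this example.

```python
class SketchTimeRegistry:
    """Hypothetical time-keyed store; the rounding precision is an assumed value."""

    def __init__(self, time_precision=8):
        self._precision = time_precision
        self._designs = {}

    def register_time(self, t, design, overwrite=False):
        key = round(float(t), self._precision)
        if key in self._designs and not overwrite:
            raise ValueError(f"time {key} is already registered")
        self._designs[key] = list(design)

    def check_registered_time(self, t):
        key = round(float(t), self._precision)
        if key not in self._designs:
            raise ValueError(f"time {t} is not registered")
        return key

    @property
    def checkpoints(self):
        return sorted(self._designs)


registry = SketchTimeRegistry()
registry.register_time(0.3, [1, 0])
registry.register_time(0.1, [1, 1])
found = registry.check_registered_time(0.1 + 0.2)  # 0.30000000000000004 maps to 0.3
```

Rounding before lookup is what lets a time computed with floating-point error still match the registered checkpoint.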

generate_noise(t)[source]#

Generate a random noise vector sampled from the underlying base error model

sample(t)[source]#

Sample a random vector from the underlying base error model

property size#

Dimension of the probability distribution associated with the underlying base model

property design#

The experimental design dictionary

property checkpoints#

The times associated with the registered experimental design. The checkpoints are ordered in ascending order.

property base_error_model#

Reference to the configured base error model

property time_precision#

The number of digits used for rounding to maintain predefined precision in time comparisons. This depends on the class variable _TIME_EPS which defines the floating-point accuracy.

Note

The sole purpose of this attribute is to apply np.round(t, time_precision) so that we can properly store/lookup keys in the time-dependent objects (design and checkpoints).

property active#

Return a dictionary with keys identical to those of self.design, and values obtained by finding the active entries (as bool) in the designs associated with each of the checkpoints.

property active_size#

Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design) for all time points.

property active_indexes#

A dictionary holding arrays (one for each registered time) of the indexes of the observation vector entries that are active (associated with a non-zero design) according to the experimental design.

property random_seed#

Registered random seed (if available in configurations)

property is_time_dependent#

This model is time-dependent. This property will allow working with these models in the core modules without creating circular imports

property project_onto_active_design_space#

Whether all operations assume a projection onto the active subspace. This operator follows the strategy of the underlying base error model.

Base classes for observation operators.

An observation operator transforms a model state vector into an observation vector. This module provides the abstract interfaces for both time-independent (ObservationOperator) and time-dependent (TimeDependentObservationOperator) observation operators, with built-in support for experimental design (binary sensor activation/deactivation).

class ObservationOperatorConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None)[source]#

Bases: PyOEDConfigs

Base configuration class for observation operators.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • design (None | bool | Sequence[bool] | ndarray[bool]) –

    an experimental design to define active/inactive entries of the observation vector.

    Note

    An experimental design here is always binary (on/off), indicating which entries of the design space are active/inactive. This corresponds to turning on/off some of the observational sensors. Thus, if the design is None, it is set to all ones; that is, everything is observed.

design: None | bool | Sequence[bool] | ndarray[bool]#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None)#
class ObservationOperator(configs=None)[source]#

Bases: PyOEDObject

Abstract base class for observation operators.

An observation operator transforms a model state vector into an observation vector. It encapsulates the mapping \(\mathcal{H}: \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}\) where \(n_x\) is the state dimension and \(n_y\) is the observation dimension.

The operator supports an experimental design (binary vector) that activates/deactivates individual observation components, effectively projecting onto a subspace of the full observation space.

Parameters:

configs (ObservationOperatorConfigs | dict | None) – configuration object or dictionary. See ObservationOperatorConfigs.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Check the passed configurations and make sure they are conformable with each other, and with the current configurations once combined. This guarantees that any key-value pair passed in configs can be properly used.

Parameters:
  • configs (dict) – a dictionary holding key/value configurations

  • raise_for_invalid (bool) – if True, raise TypeError for invalid configurations type/key

Returns:

bool flag indicating whether the passed configurations dictionary is valid or not

Raises:

TypeError – if raise_for_invalid is True and the passed configurations have an invalid type/key

Return type:

bool

observation_vector(init_val=0.0)[source]#

Create an observation vector

Parameters:

init_val (float) – (optional) value assigned to entries of the observation vector upon initialization

Returns:

an observation vector identical to state vectors generated by the underlying model

Note

This method creates an observation vector with size equal to either the full observation vector (ignoring the design) or the number of active entries in the design. This is determined by how PyOED is configured globally through pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE, which is referenced by the underlying method project_onto_active_design_space().

is_observation_vector(observation, ignore_design=False)[source]#

Test whether the passed observation vector is valid or not. The observation is valid if it is a 1d numpy array (when squeezed) with size equal to the underlying observation size. The observation size is determined based on whether PyOED is configured to project observations on the design space (default) or not as given by pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method project_onto_active_design_space(). This is ignored when ignore_design is set to True.

Note

The flag ignore_design is added for two main purposes:
  1. greater flexibility,

  2. validation of observations when data assimilation is required. Specifically, in DA, we always need the full observation vector, as the design can change and we still want to solve the inverse problem with different designs. This can be done either by re-registering the observations with different designs, or by keeping the full observation and only extracting the relevant parts.

Parameters:
  • observation – an observation vector

  • ignore_design – if True the experimental design is ignored while validating the passed observation. In this case, the vector is expected to be of size equal to the dimension of the observation space when the design is all ones.

update_design(design)[source]#

Update the experimental design.

Parameters:

design – the design used to update the observation operator’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

validate_observation_vector(observation, ignore_design=False)[source]#

Validate that observation is a conformable observation vector, raising on failure.

This is a strict variant of is_observation_vector(): it raises a TypeError when the observation is invalid instead of returning False. Used internally before computations that require a valid observation.

Parameters:
  • observation (numpy.ndarray) – the candidate observation array to validate.

  • ignore_design (bool) – when True, the check ignores the experimental design and requires the observation to have size equal to the full observation space (all sensors active). Defaults to not pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE.

Raises:

TypeError – if observation is not a valid 1-D array of the expected size.
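The relationship between the predicate is_observation_vector() and this strict variant can be sketched as follows. This is a simplified stand-alone illustration, not PyOED's implementation: plain lists stand in for 1-D NumPy arrays, and the design and project arguments are hypothetical parameters replacing the operator's state and the global setting.

```python
def is_observation_vector(observation, design, project=True, ignore_design=False):
    """Return True if `observation` has the size expected for `design`."""
    full_size = len(design)
    active_size = sum(bool(d) for d in design)
    # The full size is expected when the design is ignored or no projection is used.
    expected = full_size if (ignore_design or not project) else active_size
    try:
        return len(observation) == expected
    except TypeError:
        # Objects without a length (e.g., scalars) are not valid vectors.
        return False


def validate_observation_vector(observation, design, project=True, ignore_design=False):
    """Strict variant: raise TypeError instead of returning False."""
    if not is_observation_vector(observation, design, project, ignore_design):
        raise TypeError("observation is not a valid observation vector")


design = [True, False, True]
print(is_observation_vector([1.0, 2.0], design))                           # True
print(is_observation_vector([1.0, 2.0, 3.0], design))                      # False
print(is_observation_vector([1.0, 2.0, 3.0], design, ignore_design=True))  # True
```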

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid

Raises:

TypeError – if any of the passed keys in kwargs is invalid/unrecognized

Jacobian_T_matvec(observation, eval_at, **kwargs)[source]#

Compute the adjoint (transpose-Jacobian) action on an observation vector.

Evaluates \(\mathbf{H}(\text{eval\_at})^T \, \text{observation}\), where \(\mathbf{H}\) is the Jacobian (tangent-linear model, TLM) of this observation operator evaluated at eval_at.

This method is required by gradient-based variational assimilation algorithms (e.g., 4D-Var) to backpropagate observation-space residuals into the state space. It is not needed by ensemble-based methods (e.g., EnKF).

Parameters:
  • observation (numpy.ndarray) – a vector in observation space (size observation_size), typically the innovation or adjoint forcing.

  • eval_at (numpy.ndarray) – the model state at which the Jacobian is evaluated (size equal to the state dimension \(n_x\)).

  • kwargs – additional keyword arguments (implementation-specific).

Returns:

the adjoint action vector in state space (size \(n_x\)).

Return type:

numpy.ndarray

Raises:

NotImplementedError – always, unless overridden by a concrete subclass.
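For a linear operator the Jacobian is the operator matrix itself, so the adjoint action does not depend on eval_at. The sketch below illustrates this case with a dense matrix stored as a list of rows; the helper names (matvec, jacobian_T_matvec, dot) are hypothetical and not part of PyOED. It also runs the standard dot-product (adjoint) test that a concrete subclass could use to check its override.

```python
def matvec(H, x):
    """Forward action H @ x for a dense matrix stored as a list of rows."""
    return [sum(h * xj for h, xj in zip(row, x)) for row in H]


def jacobian_T_matvec(H, observation):
    """Adjoint action H^T @ observation (observation space -> state space)."""
    n_y, n_x = len(H), len(H[0])
    return [sum(H[i][j] * observation[i] for i in range(n_y)) for j in range(n_x)]


def dot(a, b):
    """Euclidean inner product of two equally sized vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


# A linear operator observing entries 0 and 2 of a 3-dimensional state.
H = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
x = [1.0, 2.0, 3.0]
y = [10.0, 20.0]

adjoint = jacobian_T_matvec(H, y)
print(adjoint)  # [10.0, 0.0, 20.0]

# Adjoint test: <H x, y> must equal <x, H^T y>.
assert dot(matvec(H, x), y) == dot(x, adjoint)
```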

copy()[source]#

Return a shallow copy of this observation operator by creating a new instance of this object with the registered configurations. NO DEEP COPY of the objects in the configurations dictionary is made.

Note

One can create a deepcopy method, but this will generally be problematic as the objects in the configurations dictionary might not be serializable, and we have to deal with errors as well. This is handled in some way by the parameters of the pyoed.configs.PyOEDConfigs.asdict() used here.

abstractmethod apply(state, **kwargs)[source]#

Apply the observation operator to a model state vector.

Maps the state vector from the model space (\(\mathbb{R}^{n_x}\)) to the observation space (\(\mathbb{R}^{n_y}\)). When the experimental design is active and project_onto_active_design_space is True, the returned vector contains only the components corresponding to active (non-zero design) sensors.

Parameters:
  • state (numpy.ndarray) – the model state vector of size \(n_x\).

  • kwargs – additional keyword arguments (implementation-specific).

Returns:

the observation vector of size observation_size (either active_size or the full \(n_y\) depending on project_onto_active_design_space).

Return type:

numpy.ndarray
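A minimal sketch of how a concrete apply might honor the design is given below. The class SelectionOperator is a toy written for this illustration (it does not inherit from ObservationOperator and observes every state entry, so \(n_y = n_x\)); returning the full vector unchanged when no projection is requested is an assumption.

```python
class SelectionOperator:
    """Toy operator that observes every state entry (identity map)."""

    def __init__(self, design, project_onto_active_design_space=True):
        self.design = [bool(d) for d in design]
        self.project_onto_active_design_space = project_onto_active_design_space

    def apply(self, state):
        if len(state) != len(self.design):
            raise TypeError("state size does not match the operator")
        full = list(state)  # identity observation: y = x
        if self.project_onto_active_design_space:
            # Keep only the entries whose sensor is switched on.
            return [y for y, on in zip(full, self.design) if on]
        return full


state = [5.0, 6.0, 7.0]
print(SelectionOperator([1, 0, 1]).apply(state))         # [5.0, 7.0]
print(SelectionOperator([1, 0, 1], False).apply(state))  # [5.0, 6.0, 7.0]
```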

property supports_relaxation#

Flag to check whether the observation operator supports relaxation or not. This should always be False. It is only added for consistency with the base error model.

property observation_size: int#

Return the observation vector size.

property active#

Flags (boolean 1d array-like) corresponding to active/inactive dimensions according to the experimental design. This vector is of exactly the same size as the observation vector (ignoring the experimental design).

Warning

Because the return type is bool, this vector can be used to slice the vector. To avoid any confusion, however, we decided to add a clearer property active_indexes which returns the indexes of the variable/vector corresponding to non-zero design values. Thus, one should always use active_indexes when extracting the active entries of a variable (according to the design) is required.

property active_indexes#

An array of the indexes of the entries of the observation vector that are active (associated with non-zero design) according to the experimental design.
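The sketch below shows one way the active indexes could be derived from a design and used to extract the active entries of a vector. The helpers normalize_design and active_indexes are hypothetical functions written for this illustration; in particular, broadcasting a single bool to the full size is an assumption about how a scalar design is interpreted.

```python
def normalize_design(design, size):
    """Expand None or a single bool into a full-length boolean design."""
    if design is None:
        return [True] * size    # None means every sensor is active
    if isinstance(design, bool):
        return [design] * size  # assumption: a scalar applies to all entries
    flags = [bool(d) for d in design]
    if len(flags) != size:
        raise ValueError("design size does not match the observation size")
    return flags


def active_indexes(design, size):
    """Indexes of the entries associated with a non-zero design."""
    return [i for i, on in enumerate(normalize_design(design, size)) if on]


observation = [0.5, 1.5, 2.5, 3.5]
idx = active_indexes([1, 0, 1, 1], size=4)
active_part = [observation[i] for i in idx]

print(idx)                            # [0, 2, 3]
print(active_part)                    # [0.5, 2.5, 3.5]
print(active_indexes(None, size=4))   # [0, 1, 2, 3]
```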

property design#

Get a COPY of the design vector

abstract property shape#

The shape of the observation operator discarding any experimental design. An observation operator maps the model state onto the observation space. Thus, the shape of the observation operator here is defined as a tuple of the form \((ny, nx)\) where \(ny\) is the size of the observation vector, and \(nx\) is the size of the state vector.

Note

The shape of the operator is INDEPENDENT from the design and defines the codomain of the operator \(ny\) assuming all entries of the design are active. To get the shape of the operator corresponding to the active entries of the design only use active_shape().

property active_shape#

The shape of the observation operator corresponding to the active entries of the experimental design only. An observation operator maps the model state onto the observation space. Thus, the active shape pairs the number of active entries of the observation vector with the size of the state vector \(nx\).

Note

Unlike shape, this property DEPENDS on the design. To get the shape of the operator assuming all entries of the design are active, use shape.

property project_onto_active_design_space#

Whether all operations assume a projection onto the active subspace. If so, any zero entry of the design vector is omitted.

design_manager(design)[source]#

Create and return a context manager that enables updating the design to the passed value, executing the needed code, and then resetting the design to the original value.

Assuming obj is this object, and val is the design value with which one wants to run the method obj.run_code(), the following code can be used:

with obj.design_manager(val) as mngr:
    mngr.run_code()
Returns:

a reference to self that enables calling any function under self and then automatically, the design is reset.
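The set-then-restore behavior described above can be sketched with contextlib.contextmanager. The class ToyOperator below is a hypothetical stand-in written for this illustration; how PyOED builds its context manager internally is not shown in this documentation.

```python
from contextlib import contextmanager


class ToyOperator:
    """Stand-in object holding a design that can be swapped temporarily."""

    def __init__(self, design):
        self._design = list(design)

    @property
    def design(self):
        return list(self._design)  # return a copy, like the design property

    def update_design(self, design):
        self._design = list(design)

    @contextmanager
    def design_manager(self, design):
        original = self.design
        self.update_design(design)
        try:
            yield self  # the `as` target is the operator itself
        finally:
            self.update_design(original)  # restore even if the body raises


obj = ToyOperator([True, True, True])
with obj.design_manager([True, False, True]) as mngr:
    inside = mngr.design

print(inside)      # [True, False, True]
print(obj.design)  # [True, True, True]
```

Restoring the design in a finally clause means the original value is reinstated even when the code inside the with block fails.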

property is_time_dependent#

By default, observation operators are not time-dependent. A time-dependent operator must return True; otherwise no time will be passed to its methods.

Warning

This property should probably be replaced with a dynamic property where a test is carried out to determine whether a model is time-dependent or not.

class TimeDependentObservationOperatorConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_observation_operator=None, base_observation_operator_configs=None)[source]#

Bases: ObservationOperatorConfigs

Configuration settings for TimeDependentObservationOperator.

Note

This implementation only allows a time-dependent experimental design, but does not allow time-dependent moments (e.g., mean/variance) since the model is agnostic to those at this point. We can consider setters/getters (__setattr__/__getattr__), but this would complicate things at the beginning. This will evolve and decisions will be made!

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • design (dict | None) – a time-dependent experimental design to define active/inactive entries as a function of time. This is a dictionary with keys set to checkpoints; the value at each checkpoint is the experimental design that is passed to the underlying base_observation_operator for evaluation.

  • base_observation_operator (ObservationOperator | Type[ObservationOperator] | None) –

    the observation operator used for all time points:

    • An observation operator instance (object that inherits ObservationOperator). In this case, the observation operator is registered as is and is updated with the passed configurations if available.

    • The class (subclass of ObservationOperator) to be used to instantiate the observation operator.

  • base_observation_operator_configs – the configurations of the base observation operator. If not passed, default configurations are used.

base_observation_operator: ObservationOperator | Type[ObservationOperator] | None#
base_observation_operator_configs: ObservationOperatorConfigs | Type[ObservationOperatorConfigs] | dict | None#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_observation_operator=None, base_observation_operator_configs=None)#
class TimeDependentObservationOperator(configs=None)[source]#

Bases: ObservationOperator

This class provides a general implementation for time-dependent observation operators.

Note

By time-dependence we initially mean that the design changes over time. We can extend that to the case where the moments of the distribution, or even the observation operators themselves, vary over time. In this case, we can replace self.configurations.design with something like self._DATA being a dictionary indexed by time indexes with values set to the corresponding observation operator. This, however, will be a bit more expensive, and will be reconsidered over time.

Note

The implementation here is agnostic to the specific choice of the underlying observation operators, and only allows its configurations (specifically the design) to vary over time.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Validate passed configurations.

Parameters:

configs (dict | TimeDependentObservationOperatorConfigs) – configurations to validate. If a TimeDependentObservationOperatorConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as necessary/possible/valid

Raises:

PyOEDConfigsValidationError – if any of the passed keys in kwargs is invalid/unrecognized

update_design(design)[source]#

Update the experimental design in the configurations object. The design must be None, or a dictionary containing keys as valid times (non-negative floats), and values holding valid experimental designs that can be associated with the underlying base observation operator. If the base observation operator is registered, the designs in the passed dictionary are validated against it.

Parameters:

design – the design used to update the observation operator’s configurations

Raises:

PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size

update_base_observation_operator(base_observation_operator, base_observation_operator_configs=None)[source]#

Update the existing base observation operator with optional configurations. The operator is tested against the configured design.

Parameters:
  • base_observation_operator (ObservationOperator | Type[ObservationOperator]) –

    the observation operator used for all time points:

    • An observation operator instance (object that inherits ObservationOperator). In this case, the observation operator is registered as is and is updated with the passed configurations if available.

    • The class (subclass of ObservationOperator) to be used to instantiate the observation operator.

  • base_observation_operator_configs (ObservationOperatorConfigs | Type[ObservationOperatorConfigs] | None) – the configurations of the base observation operator. If not passed, default configurations are used.

register_time(t, design, overwrite=False)[source]#

Register a design at a time instance t.

Parameters:
  • t – the time at which the design is registered.

  • design – the design to register at the time t.

  • overwrite – if True and the time t is already registered, overwrite the corresponding/current design. If False and the time exists, a ValueError exception is raised.

Raises:

ValueError – if the time t is registered and overwrite is False.

check_registered_time(t)[source]#

Check if the passed time is registered or not. If available, return the registered time, otherwise raise ValueError.

Parameters:

t – the time to check if registered or not.

Raises:

ValueError – if the time t is not registered.
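The registration and lookup behavior of register_time() and check_registered_time() can be sketched with a small registry keyed by rounded times. The class DesignRegistry and its TIME_PRECISION value are hypothetical names introduced for this illustration only.

```python
class DesignRegistry:
    """Toy registry mapping rounded times to designs."""

    TIME_PRECISION = 8  # hypothetical number of digits used to round times

    def __init__(self):
        self._design = {}

    def _key(self, t):
        return round(float(t), self.TIME_PRECISION)

    def register_time(self, t, design, overwrite=False):
        key = self._key(t)
        if key in self._design and not overwrite:
            raise ValueError(f"time {key} is already registered")
        self._design[key] = list(design)

    def check_registered_time(self, t):
        key = self._key(t)
        if key not in self._design:
            raise ValueError(f"time {key} is not registered")
        return key  # the registered (rounded) time

    @property
    def checkpoints(self):
        return sorted(self._design)  # ascending order


reg = DesignRegistry()
reg.register_time(1.0, [True, False])
reg.register_time(0.5, [False, True])
print(reg.checkpoints)                         # [0.5, 1.0]
print(reg.check_registered_time(0.5 + 1e-12))  # 0.5
```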

apply(state, eval_at_t)[source]#

Given a state vector (laid out as defined by the model for which the observation operator is defined), evaluate the observation vector.

Parameters:
  • state – the model state vector to observe.

  • eval_at_t – the time at which the observation operator is evaluated.

Returns:

an observation vector/instance

Raises:

ValueError – if the time eval_at_t is not registered.
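One plausible reading of this method is that it looks up the design registered at eval_at_t and delegates to the base observation operator with that design. The sketch below illustrates that flow under this assumption; apply_at_time and base_apply are hypothetical functions, and the base operator is a toy that observes every state entry.

```python
def base_apply(state, design):
    """Toy base operator: observe every state entry, keep the active ones."""
    return [x for x, on in zip(state, design) if on]


def apply_at_time(state, eval_at_t, designs, base):
    """Evaluate the base operator with the design registered at eval_at_t."""
    if eval_at_t not in designs:
        raise ValueError(f"time {eval_at_t} is not registered")
    return base(state, designs[eval_at_t])


# Time-dependent design: keys are checkpoints, values are binary designs.
designs = {
    0.0: [True, True, False],
    1.0: [False, True, True],
}
state = [1.0, 2.0, 3.0]

print(apply_at_time(state, 0.0, designs, base_apply))  # [1.0, 2.0]
print(apply_at_time(state, 1.0, designs, base_apply))  # [2.0, 3.0]
```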

observation_vector(t, init_val=0.0)[source]#

Create an observation vector by employing the observation operator settings at time instance t.

Note

This method creates an observation vector with size equal to either the full observation vector (ignoring the design) or the number of active entries in the design. This is determined by how PyOED is configured globally through pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE, which is referenced by the underlying method project_onto_active_design_space().

Parameters:
  • t (float) – observation time.

  • init_val (float) – (optional) value assigned to entries of the observation vector upon initialization

Returns:

an observation vector identical to state vectors generated by the underlying model

Raises:

ValueError – if the time t is not registered.

is_observation_vector(observation, t, ignore_design=False)[source]#

Test whether the passed observation vector is valid or not given the settings of the observation operator at time instance t. The observation is valid if it is a 1d numpy array (when squeezed) with size equal to the underlying observation size. The observation size is determined based on whether PyOED is configured to project observations on the design space (default) or not as given by pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method project_onto_active_design_space(). This is ignored when ignore_design is set to True.

Note

The flag ignore_design is added for two main purposes:

  1. greater flexibility,

  2. validation of observations when data assimilation is required. Specifically, in DA, we always need the full observation vector, as the design can change and we still want to solve the inverse problem with different designs. This can be done either by re-registering the observations with different designs, or by keeping the full observation and only extracting the relevant parts.

Parameters:
  • observation – an observation vector

  • t (float) – observation time.

  • ignore_design – if True the experimental design is ignored while validating the passed observation. In this case, the vector is expected to be of size equal to the dimension of the observation space when the design is all ones.

Raises:

ValueError – if the time t is not registered.

Jacobian_T_matvec(observation, eval_at_t, eval_at=None)[source]#

Compute the matrix-free matrix-vector product of the transpose of the Jacobian (tangent-linear model, TLM) of the observation operator with the passed observation. For nonlinear operators, the Jacobian is evaluated at eval_at.

Parameters:
  • observation – observation instance/vector

  • eval_at – state around which observation operator is linearized.

  • eval_at_t – the time at which the observation operator is evaluated.

Returns:

result of multiplying the transpose of derivative of the observation operator by the passed observation

Raises:

TypeError – if the passed observation is invalid (type/shape/etc.)

validate_observation_vector(observation, t, ignore_design=False)[source]#

Check if the observation vector is valid. If not, raise a TypeError. This is mostly for internal validation.

Raises:

ValueError – if the time t is not registered.

property design#

The experimental design dictionary

property checkpoints#

The times associated with the registered experimental design. The checkpoints are ordered in ascending order.

property base_observation_operator#

Reference to the configured base observation operator

property time_precision#

The number of digits used for rounding to maintain predefined precision in time comparisons. This depends on the class variable _TIME_EPS which defines the floating-point accuracy.

Note

The sole purpose of this attribute is to apply np.round(t, time_precision) so that we can properly store/lookup keys in the time-dependent objects (design and checkpoints).

property observation_size: dict#

Return a dictionary holding the sizes of observation vectors for each registered time.

property shape#

Dimension of the observation space associated with the underlying base model

property active#

Return a dictionary with keys identical to self.design, with values obtained by finding active entries (as bool) in the designs associated with each of the checkpoints.

property active_shape#

The shape of the observation operator corresponding to the active entries of the experimental design only. An observation operator maps the model state onto the observation space. Thus, the active shape pairs the number of active entries of the observation vector with the size of the state vector \(nx\).

Note

Unlike shape, this property DEPENDS on the design. Because the design here is registered per time, the number of active entries can differ between checkpoints.

property active_indexes#

A dictionary holding arrays (for each registered time) of the indexes of the entries of the observation vector that are active (associated with non-zero design) according to the experimental design.

property random_seed#

Registered random seed (if available in configurations)

property is_time_dependent#

This model is time-dependent. This property allows working with these models in the core modules without creating circular imports.