Core (Base Classes) of Models#
Abstract base classes for simulation models.
This module defines the abstract interfaces that all simulation models in PyOED must implement. The hierarchy is:
SimulationModel: root abstract class
TimeIndependentModel: for steady-state / static models (e.g., tomography, parameter-to-observable maps)
TimeDependentModel: for dynamical models with time integration (e.g., Lorenz-63, advection-diffusion)
Each model must provide methods for creating state/parameter vectors, solving the forward problem, and optionally solving the adjoint or providing Jacobian-transpose products for variational methods.
- class SimulationModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)[source]#
Bases: PyOEDConfigs
Configurations class for the SimulationModel abstract base class. This class inherits functionality from PyOEDConfigs and only adds new class-level variables which can be updated as needed.
See PyOEDConfigs for more details on the functionality of this class along with a few additional fields. Otherwise, SimulationModelConfigs provides the following fields.
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
model_name (str | None) – name of the model. Default is None.
screen_output_iter (int) – iteration interval for screen output. Default is 1. Note that this should be a positive integer to enforce proper effect.
file_output_iter (int) – iteration interval for file output. Default is 1. Note that this should be a positive integer to take effect.
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)#
- class SimulationModel(configs=None)[source]#
Bases: PyOEDObject
Abstract class (following Python’s abc convention) for implementations (wrappers) of simulation models, both time-dependent and time-independent.
The implementation in classes inheriting this base class MUST carry out all essential tasks (marked with abstractmethod decorators), which in turn should be provided by the dynamical model.
Note
Each class derived from SimulationModel should have its own __init__ method in which the constructor just calls super().__init__(configs=configs) and then adds any additional initialization as needed. The validation self.validate_configurations() is carried out at initialization time by the base class SimulationModel. See, for example, the __init__ method of TimeIndependentModel.
Note
The structure is similar to the DATeS base class for simulation models.
- Parameters:
configs (dict | SimulationModelConfigs | None) – (optional) configurations for the model
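The constructor pattern described in the note above can be sketched as follows. This is a minimal illustration that does not import PyOED: BaseConfigs and BaseModel are hypothetical stand-ins for SimulationModelConfigs and SimulationModel, and TypeError stands in for the library's validation error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseConfigs:
    """Hypothetical stand-in for SimulationModelConfigs."""
    model_name: Optional[str] = None
    verbose: bool = False


class BaseModel:
    """Hypothetical stand-in for SimulationModel."""

    def __init__(self, configs=None):
        if configs is None:
            configs = BaseConfigs()
        elif isinstance(configs, dict):
            configs = BaseConfigs(**configs)
        # The base class validates the configurations at initialization time.
        self.validate_configurations(configs)
        self.configurations = configs

    def validate_configurations(self, configs, raise_for_invalid=True):
        is_valid = isinstance(configs.verbose, bool)
        if not is_valid and raise_for_invalid:
            raise TypeError("verbose must be a bool")
        return is_valid


class ToyModel(BaseModel):
    def __init__(self, configs=None):
        # Delegate configuration handling and validation to the base class ...
        super().__init__(configs=configs)
        # ... then add any model-specific initialization.
        self.num_forward_solves = 0


model = ToyModel(configs={"model_name": "toy"})
```

The derived class never validates on its own in `__init__`; it relies on the base constructor to do so before any model-specific state is created.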
- validate_configurations(configs, raise_for_invalid=True)[source]#
Each simulation model SHOULD implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), then that’s it. However, one can just validate the configurations of the immediate class and call super to validate configurations associated with the parent class.
If one does not wish to do any validation (we strongly advise against that), simply add the signature of this function to the model class.
Note
The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.
- Parameters:
configs (dict | SimulationModelConfigs) – configurations to validate. If a SimulationModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- Raises:
PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.
AttributeError – if any (or a group) of the configurations does not exist in the model configurations ToyLinearTimeIndependentConfigs.
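The layered validation strategy (validate the fields of the immediate class, then call super for the rest) might look like the sketch below. The classes and the `stepsize` field are hypothetical, configurations are passed as plain dictionaries so that only the given keys are checked, and ValueError stands in for PyOEDConfigsValidationError.

```python
class ParentModel:
    def validate_configurations(self, configs, raise_for_invalid=True):
        # Only keys present in the dictionary are validated (partial validation).
        if "screen_output_iter" in configs:
            value = configs["screen_output_iter"]
            if not (isinstance(value, int) and value > 0):
                if raise_for_invalid:
                    raise ValueError("screen_output_iter must be a positive int")
                return False
        return True


class ChildModel(ParentModel):
    def validate_configurations(self, configs, raise_for_invalid=True):
        # Validate the fields introduced by this class ...
        if "stepsize" in configs:
            value = configs["stepsize"]
            if not (isinstance(value, (int, float)) and value > 0):
                if raise_for_invalid:
                    raise ValueError("stepsize must be positive")
                return False
        # ... and let the parent class validate everything else.
        return super().validate_configurations(configs, raise_for_invalid)


child = ChildModel()
all_ok = child.validate_configurations({"stepsize": 0.1, "screen_output_iter": 2})
bad_step = child.validate_configurations({"stepsize": -1.0}, raise_for_invalid=False)
```

Each check is a cheap type/range test, in line with the requirement that validation stay inexpensive.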
- property model_grid#
Retrieve a copy of the model grid (enumeration of all model grid coordinates as they are ranked in a model state vector) as an array
- property state_size#
An example to return the model state size; this should be overridden by a more efficient implementation that does not require building a full state vector.
- property parameter_size#
An example to return the model parameter size; this should be overridden by a more efficient implementation that does not require building a full parameter vector.
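The difference between the default size lookup and an efficient override can be illustrated with a small sketch. GridModel is hypothetical, and the "default" property only mimics the behavior described above (building a full vector and reading its size).

```python
import numpy as np


class GridModel:
    """Hypothetical model whose state lives on a 1-D grid with nx points."""

    def __init__(self, nx):
        self._nx = int(nx)

    def state_vector(self, init_val=0.0):
        return np.full(self._nx, init_val, dtype=float)

    @property
    def state_size_default(self):
        # Mimics the fallback: allocates a full state vector just to get its size.
        return self.state_vector().size

    @property
    def state_size(self):
        # Efficient override: the size is known from the grid, no allocation needed.
        return self._nx


grid_model = GridModel(nx=1000)
```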
- class TimeIndependentModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1)[source]#
Bases: SimulationModelConfigs
Configuration dataclass for the TimeIndependentModel abstract base class. This class mirrors SimulationModelConfigs and does not add additional features/attributes.
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
model_name (str | None) – name of the model. Default is None.
screen_output_iter (int) – iteration interval for screen output. Default is 1. Note that this should be a positive integer to enforce proper effect.
file_output_iter (int) – iteration interval for file output. Default is 1. Note that this should be a positive integer to take effect.
- class TimeIndependentModel(configs)[source]#
Bases: SimulationModel
Base class for time-independent models (such as tomography, ptychography, etc.)
The implementation in classes inheriting this base class MUST carry out all essential tasks (marked with abstractmethod decorators), which in turn should be provided by the underlying model.
- Parameters:
configs ([dict | SimulationModelConfigs | None]) – (optional) configurations for the model
- validate_configurations(configs, raise_for_invalid=True)[source]#
Each simulation model SHOULD implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), then that’s it. However, one can just validate the configurations of the immediate class and call super to validate configurations associated with the parent class.
If one does not wish to do any validation (we strongly advise against that), simply add the signature of this function to the model class.
Note
The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.
- Parameters:
configs (dict | TimeIndependentModelConfigs) – configurations to validate. If a SimulationModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- Raises:
PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.
AttributeError – if any (or a group) of the configurations does not exist in the model configurations ToyLinearTimeIndependentConfigs.
- abstractmethod state_vector(init_val=0, **kwargs)[source]#
Create an instance of model state vector.
- Parameters:
init_val (float) – (optional) value assigned to entries of the state vector upon initialization.
- Returns:
a numpy array representing the model state vector.
- Return type:
- abstractmethod parameter_vector(init_val=0, **kwargs)[source]#
Create an instance of model parameter vector.
- Parameters:
init_val (float) – (optional) value assigned to entries of the parameter vector upon initialization.
- Returns:
a numpy array representing the model parameter vector.
- Return type:
- abstractmethod is_state_vector(state, **kwargs)[source]#
Test whether the passed state vector is valid or not.
- Parameters:
state – candidate state vector to validate.
- Returns:
True if state is a valid state vector, False otherwise.
- Return type:
- abstractmethod is_parameter_vector(parameter, **kwargs)[source]#
Test whether the passed parameter vector is valid or not.
- Parameters:
parameter – candidate parameter vector to validate.
- Returns:
True if parameter is a valid parameter vector, False otherwise.
- Return type:
- abstractmethod solve_forward(state, verbose=False)[source]#
Apply the forward model to the given state (i.e., solve the forward problem), and return the result.
- Parameters:
state – state to which the forward model is applied.
verbose (bool) – flag to control screen-verbosity.
- Returns:
result (usually an observation vector) of applying the forward model to state.
- Return type:
- solve_adjoint(adjoint)[source]#
Apply the adjoint model to the given adjoint (i.e., solve the model backward), and return the result.
- Parameters:
adjoint – adjoint vector to which the adjoint model is applied.
- Returns:
the result of applying the adjoint model to adjoint.
- Return type:
- Jacobian_T_matvec(state, eval_at)[source]#
Evaluate and return the product of the Jacobian (of the right-hand-side) of the model (TLM) transposed, by a model state.
- Parameters:
state – state to multiply the Jacobian by.
eval_at – state around which the Jacobian is evaluated.
- Returns:
the Jacobian-transpose-vector product.
- Return type:
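For a linear model the forward, adjoint, and Jacobian-transpose operations reduce to matrix products, which makes the interface easy to illustrate. The sketch below is a hypothetical toy class that mirrors the method names documented above without inheriting from any PyOED class. It checks the adjoint identity \(\langle A x, y \rangle = \langle x, A^T y \rangle\).

```python
import numpy as np


class ToyLinearModel:
    """Hypothetical linear model F(x) = A x (not a PyOED class)."""

    def __init__(self, A):
        self.A = np.asarray(A, dtype=float)

    def state_vector(self, init_val=0.0):
        return np.full(self.A.shape[1], init_val, dtype=float)

    def is_state_vector(self, state):
        return isinstance(state, np.ndarray) and state.shape == (self.A.shape[1],)

    def solve_forward(self, state):
        return self.A @ state

    def solve_adjoint(self, adjoint):
        return self.A.T @ adjoint

    def Jacobian_T_matvec(self, state, eval_at=None):
        # For a linear model the Jacobian equals A regardless of eval_at.
        return self.A.T @ state


rng = np.random.default_rng(0)
linear_model = ToyLinearModel(rng.standard_normal((3, 5)))
x = rng.standard_normal(5)
y = rng.standard_normal(3)

# Adjoint (dot-product) test: <A x, y> should equal <x, A^T y>.
lhs = linear_model.solve_forward(x) @ y
rhs = x @ linear_model.solve_adjoint(y)
```

The same dot-product test is a common sanity check for hand-written adjoints of nonlinear models, where it is applied to the linearized operator.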
- class TimeDependentModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1, time_integration=<factory>, num_prognostic_variables=None, space_discretization=<factory>)[source]#
Bases: SimulationModelConfigs
Configuration dataclass for the TimeDependentModel base class.
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
model_name (str | None) – name of the model. Default is None.
screen_output_iter (int) – iteration interval for screen output. Default is 1. Note that this should be a positive integer to enforce proper effect.
file_output_iter (int) – iteration interval for file output. Default is 1. Note that this should be a positive integer to take effect.
time_integration (dict | None) –
dictionary holding time integration configurations:
'scheme': string specifying the time integration scheme (e.g., 'RK4', 'RK45', 'BDF', etc.). Default is None.
'stepsize': float specifying the time integration stepsize. Default is None.
'adaptive': bool specifying whether the time integration is adaptive. Default is False.
num_prognostic_variables (int | None) – number of prognostic variables in the model. Default is None. Must be a positive integer if not None.
space_discretization (dict | None) –
dictionary holding space discretization configurations. Contains:
'scheme': string specifying the space discretization scheme (e.g., 'FD', 'FE', 'BE', etc.). Default is None.
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', model_name=None, screen_output_iter=1, file_output_iter=1, time_integration=<factory>, num_prognostic_variables=None, space_discretization=<factory>)#
- class TimeDependentModel(configs=None)[source]#
Bases: SimulationModel
Abstract base class for time-dependent (dynamical) simulation models.
Models inheriting this class represent dynamical systems that evolve in time. They must implement methods for creating state vectors, testing state validity, and integrating the state forward in time over a given time span.
Concrete implementations include Lorenz-63/96, advection-diffusion, and Bateman-Burgers models.
The implementation in classes inheriting this base class MUST carry out all essential tasks (marked with abstractmethod decorators) which in turn should be provided by the dynamical model.
The structure is similar to DATeS base class for simulation models.
- Parameters:
configs (dict | TimeDependentModelConfigs | None) – (optional) configurations for the model.
- validate_configurations(configs, raise_for_invalid=True)[source]#
Validation stage for the passed configs.
- Parameters:
configs (dict | TimeDependentModelConfigs) – configurations to validate. If a TimeDependentModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- Raises:
PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.
AttributeError – if any (or a group) of the configurations does not exist in the model configurations ToyLinearTimeIndependentConfigs.
- abstractmethod state_vector(t=None, init_val=0, **kwargs)[source]#
Create an instance of model state vector
- Parameters:
t – (optional) time assigned to the state vector (for time dependent models); None otherwise
- abstractmethod is_state_vector(state, **kwargs)[source]#
Test whether the passed state vector is valid or not
- abstractmethod integrate_state(state, tspan, checkpoints, *argv, **kwargs)[source]#
Simulate/integrate the model starting from the initial state over the passed checkpoints.
If state is assigned a time t, that time is replaced with the first entry of checkpoints.
- Parameters:
state – data structure holding the initial model state
tspan – (t0, tf) iterable with two entries specifying the time integration window
checkpoints – times at which to store the computed solution; must be sorted and lie within tspan. If None (default), use points selected by the solver [t0, t1, …, tf].
- Returns:
timespan and a trajectory (checkpointed solution):
the timespan is an iterable (e.g., a list) holding the timepoints at which the state is propagated,
the trajectory is an iterable (e.g., a list) holding the model trajectory with entries corresponding to the simulated model state at entries of checkpoints, starting from checkpoints[0] and ending at checkpoints[-1].
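The checkpointed integration contract can be sketched with a fixed-step classical Runge-Kutta (RK4) scheme. This is an illustration under stated assumptions, not the integrator used by any PyOED model: the free function takes the right-hand side `rhs` explicitly, `max_dt` is a hypothetical parameter, and the fallback used when checkpoints is None (11 equally spaced points) is an arbitrary choice.

```python
import numpy as np


def rk4_step(rhs, t, x, dt):
    """Advance x from t to t + dt with the classical fourth-order Runge-Kutta scheme."""
    k1 = rhs(t, x)
    k2 = rhs(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = rhs(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = rhs(t + dt, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate_state(rhs, state, tspan, checkpoints=None, max_dt=0.01):
    """Return (timespan, trajectory) with one stored state per checkpoint."""
    t0, tf = tspan
    if checkpoints is None:
        # Assumption: fall back to 11 equally spaced points on [t0, tf].
        checkpoints = np.linspace(t0, tf, 11)
    checkpoints = np.asarray(checkpoints, dtype=float)
    if np.any(np.diff(checkpoints) <= 0) or checkpoints[0] < t0 or checkpoints[-1] > tf:
        raise ValueError("checkpoints must be sorted and lie within tspan")

    x = np.array(state, dtype=float)
    t = checkpoints[0]  # the initial state is taken to live at checkpoints[0]
    trajectory = [x.copy()]
    for t_next in checkpoints[1:]:
        # Split each checkpoint interval into equal steps no larger than max_dt.
        n_steps = max(1, int(np.ceil((t_next - t) / max_dt)))
        dt = (t_next - t) / n_steps
        for k in range(n_steps):
            x = rk4_step(rhs, t + k * dt, x, dt)
        t = t_next
        trajectory.append(x.copy())
    return checkpoints.tolist(), trajectory


# Linear decay dx/dt = -x, whose exact solution is x(t) = exp(-t).
times, trajectory = integrate_state(
    lambda t, x: -x, [1.0], (0.0, 1.0), checkpoints=[0.0, 0.5, 1.0]
)
```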
- Jacobian_T_matvec(state, eval_at_t, eval_at)[source]#
Evaluate and return the product of the Jacobian (of the right-hand-side) of the model (TLM) transposed, by a model state.
- Parameters:
state – state to multiply the Jacobian by
eval_at_t – time at which the Jacobian is evaluated
eval_at – state around which the Jacobian is evaluated
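As an illustration of a Jacobian-transpose product, the sketch below uses the Lorenz-63 right-hand side with its standard parameter values and an analytically derived Jacobian. The free functions are hypothetical and only mirror the argument order documented above. The result is checked against a central finite-difference directional derivative through the identity \(\langle J v, w \rangle = \langle v, J^T w \rangle\).

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # standard Lorenz-63 parameters


def lorenz63_rhs(t, x):
    """Right-hand side of the Lorenz-63 system."""
    return np.array([
        SIGMA * (x[1] - x[0]),
        x[0] * (RHO - x[2]) - x[1],
        x[0] * x[1] - BETA * x[2],
    ])


def jacobian_T_matvec(state, eval_at_t, eval_at):
    """Product of the transposed Jacobian (evaluated at eval_at) with state."""
    x, y, z = eval_at
    jac = np.array([
        [-SIGMA, SIGMA, 0.0],
        [RHO - z, -1.0, -x],
        [y, x, -BETA],
    ])
    return jac.T @ state


x0 = np.array([1.0, 2.0, 3.0])
v = np.array([0.3, -0.2, 0.5])
w = np.array([-1.0, 0.4, 0.7])

# J v approximated by a central difference of the right-hand side along v.
eps = 1e-5
jac_v_fd = (lorenz63_rhs(0.0, x0 + eps * v) - lorenz63_rhs(0.0, x0 - eps * v)) / (2 * eps)

inner_fd = jac_v_fd @ w
inner_adjoint = v @ jacobian_T_matvec(w, 0.0, x0)
```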
- class ErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=True)[source]#
Bases: PyOEDConfigs
Base configuration class for error models
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
design (None | bool | Sequence[bool] | ndarray) –
an experimental design to define active/inactive entries of the random variable (mean, variance/covariance matrix).
Note
If the design is None, it is set to all ones; that is, everything is observed (default).
If the design is a binary vector (or an integer-typed vector with 0/1 entries), the mean, the covariance, and all random vectors are projected onto the space identified by the 1/True entries.
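The projection described in the note can be sketched with NumPy indexing. The helper `resolve_design` is hypothetical, and treating a scalar design as "apply the same flag to every entry" is an assumption made only for this illustration.

```python
import numpy as np


def resolve_design(design, size):
    """Return a boolean design vector of length size (None means all active)."""
    if design is None:
        return np.ones(size, dtype=bool)
    # Assumption: a scalar flag is broadcast to all entries.
    return np.broadcast_to(np.asarray(design, dtype=bool), (size,)).copy()


mean = np.array([1.0, 2.0, 3.0, 4.0])
cov = np.diag([0.1, 0.2, 0.3, 0.4])

active = resolve_design([1, 0, 1, 1], mean.size)

# Keep only the entries/rows/columns that correspond to 1/True design values.
active_mean = mean[active]
active_cov = cov[np.ix_(active, active)]
```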
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=True)#
- class ErrorModel(configs=None)[source]#
Bases: PyOEDObject
Abstract base class for error/noise models.
An error model represents the observation noise distribution used in data assimilation and optimal experimental design. It provides methods for sampling noise, evaluating the probability density function (PDF) and its gradient, and computing log-densities.
Subclasses must implement generate_noise() and sample(). Optionally, subclasses can implement pdf(), pdf_gradient(), log_density(), and log_density_gradient() for use in variational methods.
The error model supports an experimental design vector that activates/deactivates individual observation components, and may support relaxed (non-binary) designs depending on the implementation.
- Parameters:
configs – configuration object or dictionary. See ErrorModelConfigs.
- validate_configurations(configs, raise_for_invalid=True)[source]#
Each error model MUST implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), then that’s it. However, one can just validate the configurations of the immediate class and call super to validate configurations associated with the parent class.
If one does not wish to do any validation (we strongly advise against that), simply add the signature of this function to the model class.
Note
The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.
- Parameters:
configs (dict | PyOEDConfigs) – configurations to validate. If a PyOEDConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- Raises:
PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.
AttributeError – if any (or a group) of the configurations does not exist in the model configurations PyOEDConfigs.
- update_configurations(**kwargs)[source]#
Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid
- Raises:
TypeError – if any of the passed keys in kwargs is invalid/unrecognized
- update_design(design)[source]#
Update the experimental design.
- Parameters:
design – the design used to update the model’s configurations
- Raises:
PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size
- pdf(*args, **kwargs)[source]#
Evaluate the value of the density function (normalized or up to a fixed scaling constant) at the passed state/vector.
- Parameters:
args – positional arguments (implementation-specific, typically a state vector).
kwargs – keyword arguments (implementation-specific).
- Returns:
the PDF value at the given point.
- Return type:
- pdf_gradient(*args, **kwargs)[source]#
Evaluate the gradient of the density function at the passed state/vector.
- Parameters:
args – positional arguments (implementation-specific, typically a state vector).
kwargs – keyword arguments (implementation-specific).
- Returns:
the gradient of the PDF at the given point.
- Return type:
- log_density(*args, **kwargs)[source]#
Evaluate the logarithm of the density function at the passed state x.
- Parameters:
args – positional arguments (implementation-specific, typically a state vector).
kwargs – keyword arguments (implementation-specific).
- Returns:
the log-density value at the given point.
- Return type:
- log_density_gradient(*args, **kwargs)[source]#
Evaluate the gradient of the logarithm of the density function at the passed state.
- Parameters:
args – positional arguments (implementation-specific, typically a state vector).
kwargs – keyword arguments (implementation-specific).
- Returns:
the gradient of the log-density at the given point.
- Return type:
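For a Gaussian error model the four density-related methods have closed forms. The sketch below uses hypothetical free functions with an assumed mean and covariance, shown only to make the relationship between the methods concrete: the gradient of the log-density is \(-\mathbf{C}^{-1}(x - \mu)\), and the PDF gradient follows from the chain rule as the PDF times the log-density gradient.

```python
import numpy as np

mean = np.array([0.5, -1.0])
cov = np.array([[2.0, 0.3], [0.3, 1.0]])
cov_inv = np.linalg.inv(cov)


def log_density(x, normalize=True):
    """Gaussian log-density, optionally dropping the normalization constant."""
    r = x - mean
    value = -0.5 * r @ cov_inv @ r
    if normalize:
        _, logdet = np.linalg.slogdet(cov)
        value -= 0.5 * (mean.size * np.log(2.0 * np.pi) + logdet)
    return value


def log_density_gradient(x):
    return -cov_inv @ (x - mean)


def pdf(x):
    return np.exp(log_density(x))


def pdf_gradient(x):
    # Chain rule: grad p(x) = p(x) * grad log p(x).
    return pdf(x) * log_density_gradient(x)


x_test = np.array([0.1, 0.2])
eps = 1e-6
# Central finite-difference approximation of the log-density gradient.
fd_gradient = np.array([
    (log_density(x_test + eps * e) - log_density(x_test - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
```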
- copy()[source]#
Return a shallow copy of this error model by creating a new instance of this object with the registered configurations. NO DEEP COPY of the objects in the configurations dictionary is made.
Note
One can create a deepcopy method, but this will be generally problematic as the objects in the configurations dictionary might not be serializable, and we have to deal with errors as well. This is handled in some way by the parameters of the pyoed.configs.PyOEDConfigs.asdict() used here.
- abstractmethod generate_noise()[source]#
Generate a random noise vector sampled from the underlying distribution.
The noise vector is typically zero-mean (pure noise without a drift term). Concrete implementations use their own random number generator and must return a 1-D numpy.ndarray.
The length of the returned vector depends on the projection setting:
active_size when project_onto_active_design_space is True (default).
size when the full observation space is used.
- Returns:
a randomly sampled noise vector.
- Return type:
- abstractmethod sample()[source]#
Sample a random vector from the full distribution (mean + noise).
Unlike generate_noise(), which returns additive noise centered at zero, this method samples a complete random vector from the underlying distribution, i.e. it includes any non-zero mean.
The length of the returned vector depends on the projection setting:
active_size when project_onto_active_design_space is True (default).
size when the full observation space is used.
- Returns:
a randomly sampled vector from the distribution.
- Return type:
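The distinction between generate_noise() and sample(), and the effect of the projection setting on the output length, can be sketched with a toy Gaussian model with independent components. The class below is hypothetical and does not inherit from ErrorModel. How inactive entries are treated when no projection is applied is implementation-specific, so the sketch simply returns the full vector in that case.

```python
import numpy as np


class ToyGaussianErrorModel:
    """Hypothetical Gaussian error model with independent components."""

    def __init__(self, mean, std, design=None, project=True, random_seed=None):
        self.mean = np.asarray(mean, dtype=float)
        self.std = np.asarray(std, dtype=float)
        if design is None:
            design = np.ones(self.mean.size, dtype=bool)
        self.design = np.asarray(design, dtype=bool)
        self.project = project
        self._rng = np.random.default_rng(random_seed)

    @property
    def size(self):
        return self.mean.size

    @property
    def active_size(self):
        return int(np.count_nonzero(self.design))

    def generate_noise(self):
        # Zero-mean noise; projected onto active entries if requested.
        noise = self.std * self._rng.standard_normal(self.size)
        return noise[self.design] if self.project else noise

    def sample(self):
        # Full sample: mean plus noise, with the same projection rule.
        mean = self.mean[self.design] if self.project else self.mean
        return mean + self.generate_noise()


projected = ToyGaussianErrorModel([1.0, 2.0, 3.0], [0.1, 0.1, 0.1], design=[1, 0, 1], random_seed=0)
full = ToyGaussianErrorModel([1.0, 2.0, 3.0], [0.1, 0.1, 0.1], design=[1, 0, 1], project=False, random_seed=0)
```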
- property supports_relaxation#
Flag to check whether the model supports relaxation or not
- property project_onto_active_design_space#
Whether all operations assume a projection onto the active subspace. If True, any zero entry of the design vector is omitted.
- property design#
Get a COPY of the design vector
- abstract property size#
Dimension of the underlying probability distribution
- property active#
Flags (boolean 1d array-like) corresponding to active/inactive dimensions according to the experimental design. This vector is of exactly the same size as the observation vector (ignoring the experimental design).
Warning
Because the return type is bool, this vector can be used to slice the vector. To avoid any confusion, however, we decided to add a clearer property active_indexes which returns the indexes of the variable/vector corresponding to non-zero design values. Thus, one should always resort to using active_indexes in case extracting active entries in the variable (according to the design) is required.
- property active_indexes#
An array of the indexes of the observation vector that are active (associated with non-zero design) according to the experimental design.
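The difference between a boolean mask and an index array can be illustrated with plain NumPy. The variables below are illustrative and are not PyOED attributes. The pitfall that the warning above guards against appears when a 0/1 integer design is used directly for indexing: NumPy then treats the values as positions rather than as a mask.

```python
import numpy as np

design = np.array([1, 0, 0, 1, 1])           # 0/1 integer design
active = design.astype(bool)                 # boolean mask, same length as the observation
active_indexes = np.flatnonzero(design)      # positions of the non-zero design entries

observation = np.array([10.0, 11.0, 12.0, 13.0, 14.0])

by_mask = observation[active]
by_index = observation[active_indexes]

# Pitfall: indexing with the integer design picks positions 1, 0, 0, 1, 1.
by_int_design = observation[design]
```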
- abstract property active_size#
Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design)
- property random_seed#
Registered random seed (if available in configurations)
- design_manager(design)[source]#
Create and return a context manager that enables updating the design to the passed value, executing the needed code, and then resetting the design to the original value.
Assuming obj is this object, and val is the design value with which one wants to run the method obj.run_code(), the following code can be used:
    with obj.design_manager(val) as mngr:
        mngr.run_code()
- Returns:
a reference to self that enables calling any function under self and then automatically, the design is reset.
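One way to obtain this set-run-reset behavior is with contextlib.contextmanager. The sketch below is a hypothetical toy class, and how PyOED implements design_manager internally is not shown in this reference. It restores the original design even if the code inside the with block raises.

```python
from contextlib import contextmanager


class ToyErrorModel:
    """Hypothetical object exposing a design and a design_manager."""

    def __init__(self, design):
        self.design = list(design)

    def update_design(self, design):
        self.design = list(design)

    def active_size(self):
        return sum(bool(d) for d in self.design)

    @contextmanager
    def design_manager(self, design):
        original = self.design
        self.update_design(design)
        try:
            # Hand back a reference to self so methods can be called on it.
            yield self
        finally:
            # Always reset the design, even if the body raised an exception.
            self.update_design(original)


obj = ToyErrorModel(design=[1, 1, 1])
with obj.design_manager([1, 0, 0]) as mngr:
    size_inside = mngr.active_size()
size_after = obj.active_size()
```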
- property is_time_dependent#
By default, error models are not time dependent. A time-dependent model must return True; otherwise, no time will be passed to its methods.
Warning
This property probably should be replaced with a dynamic property where a test is carried out to assure whether a model is time-dependent or not.
- class TimeDependentErrorModelConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_error_model=None, base_error_model_configs=None)[source]#
Bases: ErrorModelConfigs
Configuration settings for TimeDependentErrorModel.
Note
This implementation only allows time-dependent experimental design, but does not allow time-dependent moments (e.g., mean/variance) since the model is agnostic to those at this point. We can consider setters/getters __setattr__ __getattr__ but this will complicate things at the beginning. This will evolve and decisions will be made!
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
design (dict | None) – a time-dependent experimental design to define active/inactive entries as a function of time. This is a dictionary whose keys are checkpoints and whose values are the experimental designs passed to the underlying base_error_model for evaluation at those checkpoints.
base_error_model (ErrorModel | Type[ErrorModel] | None) –
the error model used for all time points:
An error model instance (object that inherits ErrorModel). In this case, the error model is registered as is and is updated with the passed configurations if available.
The class (subclass of ErrorModel) to be used to instantiate the error model.
base_error_model_configs – the configurations of the base error model. If not passed, default configurations are used.
- base_error_model: ErrorModel | Type[ErrorModel] | None#
- base_error_model_configs: ErrorModelConfigs | Type[ErrorModelConfigs] | dict | None#
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_error_model=None, base_error_model_configs=None)#
- class TimeDependentErrorModel(configs=None)[source]#
Bases: ErrorModel
This class provides a general implementation for time-dependent error models.
Note
By time-dependence we initially mean that the design changes over time. We can extend that to the case where the moments of the distribution, or even the error models themselves, vary over time. In this case, we can replace self.configurations.design with something like self._DATA being a dictionary indexed by time indexes with values set to the corresponding error model. This, however, will be a bit more expensive, and will be considered over time.
Note
The implementation here is agnostic to the specific choice of the underlying error models, and only allows its configurations (specifically the design) to vary over time.
- validate_configurations(configs, raise_for_invalid=True)[source]#
Validate passed configurations.
- Parameters:
configs (dict | TimeDependentErrorModelConfigs) – configurations to validate. If a TimeDependentErrorModelConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- update_configurations(**kwargs)[source]#
Take any set of keyword arguments, and lookup each in the configurations, and update as necessary/possible/valid
- Raises:
PyOEDConfigsValidationError – if any of the passed keys in kwargs is invalid/unrecognized
- update_design(design)[source]#
Update the experimental design in the configurations object. The design must be None, or a dictionary whose keys are valid times (non-negative floats) and whose values are valid experimental designs that can be associated with the underlying base error model. If the base error model is registered, the designs in the passed dictionary are validated against it.
- Parameters:
design – the design used to update the error model’s configurations
- Raises:
PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size
- update_base_error_model(base_error_model, base_error_model_configs=None)[source]#
Update the existing base error model with optional configurations. The model is tested against configured design.
- Parameters:
base_error_model (ErrorModel | Type[ErrorModel]) –
the error model used for all time points:
An error model instance (object that inherits ErrorModel). In this case, the error model is registered as is and is updated with the passed configurations if available.
The class (subclass of ErrorModel) to be used to instantiate the error model.
base_error_model_configs (ErrorModelConfigs | Type[ErrorModelConfigs] | None) – the configurations of the base error model. If not passed, default configurations are used.
- register_time(t, design, overwrite=False)[source]#
Register a design at a time instance t.
- Parameters:
t – the time at which the design is registered.
design – the design to register at the time t.
overwrite – if the time t exists, overwrite the corresponding/current design. If False and the time exists, a ValueError exception is raised
- Raises:
ValueError – if the time t is registered and overwrite is False.
- check_registered_time(t)[source]#
Check if the passed time is registered or not. If available, return the registered time, otherwise raise ValueError.
- Parameters:
t – the time to check if registered or not.
- Raises:
ValueError – if the time t is not registered.
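The registration semantics of register_time() and check_registered_time() can be sketched with a small dictionary-backed registry. The class below is hypothetical. Rounding the keys to a fixed number of digits (here 8, an assumed value) mirrors the role of time_precision, so that times which differ only by floating-point round-off map to the same key.

```python
class ToyDesignRegistry:
    """Hypothetical registry mapping rounded times to experimental designs."""

    def __init__(self, time_precision=8):
        self.time_precision = time_precision
        self._designs = {}

    def _key(self, t):
        # Round so that nearly identical times share one dictionary key.
        return round(float(t), self.time_precision)

    def register_time(self, t, design, overwrite=False):
        key = self._key(t)
        if key in self._designs and not overwrite:
            raise ValueError(f"time {t} is already registered")
        self._designs[key] = list(design)

    def check_registered_time(self, t):
        key = self._key(t)
        if key not in self._designs:
            raise ValueError(f"time {t} is not registered")
        return key

    @property
    def checkpoints(self):
        return sorted(self._designs)


registry = ToyDesignRegistry()
registry.register_time(0.3, [1, 0, 1])
registry.register_time(0.1, [1, 1, 1])

# 0.1 + 0.2 is not exactly 0.3 in floating point, but it rounds to the same key.
found = registry.check_registered_time(0.1 + 0.2)
```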
- generate_noise(t)[source]#
Generate a random noise vector sampled from the underlying base error model
- property size#
Dimension of the probability distribution associated with the underlying base model
- property design#
The experimental design dictionary
- property checkpoints#
The times associated with the registered experimental design. The checkpoints are ordered in ascending order.
- property base_error_model#
Reference to the configured base error model
- property time_precision#
The number of digits used for rounding to maintain predefined precision in time comparisons. This depends on the class variable _TIME_EPS which defines the floating-point accuracy.
Note
The sole purpose of this attribute is to apply np.round(t, time_precision) so that we can properly store/lookup keys in the time-dependent objects (design and checkpoints).
- property active#
Return a dictionary with keys identical to self.design, with values obtained by finding active entries (as bool) in the designs associated with each of the checkpoints.
- property active_size#
Dimension of the probability distribution projected onto the design space (only active/nonzero entries of the design) for all time points.
- property active_indexes#
A dictionary holding arrays (for each registered time) of the indexes of the observation vector that are active (associated with non-zero design) according to the experimental design.
- property random_seed#
Registered random seed (if available in configurations)
- property is_time_dependent#
This model is time-dependent. This property will allow working with these models in the core modules without creating circular imports
- property project_onto_active_design_space#
Whether all operations assume a projection onto the active subspace. This property follows the strategy of the underlying base error model.
Base classes for observation operators.
An observation operator transforms a model state vector into an observation vector.
This module provides the abstract interfaces for both time-independent (ObservationOperator) and time-dependent (TimeDependentObservationOperator) observation operators, with built-in support for experimental design (binary sensor activation/deactivation).
- class ObservationOperatorConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None)[source]#
Bases: PyOEDConfigs
Base configuration class for observation operators.
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
design (None | bool | Sequence[bool] | ndarray[bool]) –
an experimental design to define active/inactive entries of the observation vector.
Note
An experimental design here is always binary (on/off) indicating which entries of the design space are active/inactive. This corresponds to turning on/off some of the observational sensors. Thus, if the design is None, it is set to all ones; that is everything is observed
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None)#
- class ObservationOperator(configs=None)[source]#
Bases:
PyOEDObject
Abstract base class for observation operators.
An observation operator transforms a model state vector into an observation vector. It encapsulates the mapping \(\mathcal{H}: \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}\) where \(n_x\) is the state dimension and \(n_y\) is the observation dimension.
The operator supports an experimental design (binary vector) that activates/deactivates individual observation components, effectively projecting onto a subspace of the full observation space.
- Parameters:
configs (ObservationOperatorConfigs | dict | None) – configuration object or dictionary. See
ObservationOperatorConfigs.
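As a rough illustration of the mapping \(\mathcal{H}\) and of the design-induced projection described above, the following sketch uses a toy linear operator. The class `LinearObservationSketch` and its `project` flag are hypothetical stand-ins, not PyOED classes; returning the full vector when projection is off is an assumption.

```python
import numpy as np


class LinearObservationSketch:
    """Toy linear operator y = H x with a binary design (not a PyOED class)."""

    def __init__(self, H, design=None, project=True):
        self.H = np.asarray(H, dtype=float)  # shape (n_y, n_x)
        n_y = self.H.shape[0]
        if design is None:
            design = np.ones(n_y, dtype=bool)  # all sensors active
        self.design = np.asarray(design, dtype=bool)
        self.project = project  # stands in for the global projection setting

    def apply(self, state):
        y = self.H @ np.asarray(state, dtype=float)
        # Keep only the active components when projection is enabled.
        return y[self.design] if self.project else y


H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
op = LinearObservationSketch(H, design=[True, False, True])
y_active = op.apply([1.0, 2.0, 3.0])
```

With the second sensor switched off, `y_active` has two entries instead of three.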
- validate_configurations(configs, raise_for_invalid=True)[source]#
Check the passed configurations and make sure they are conformable with each other, and with the current configurations once combined. This guarantees that any key-value pair passed in configs can be properly used.
- Parameters:
- Returns:
bool flag indicating whether the passed configurations dictionary is valid or not
- Raises:
see the parameter raise_for_invalid
- Return type:
- observation_vector(init_val=0.0)[source]#
Create an observation vector
- Parameters:
init_val (float) – (optional) value assigned to entries of the observation vector upon initialization
- Returns:
an observation vector identical to state vectors generated by the underlying model
Note
This method creates an observation vector with the size equal to either the full observation vector (ignoring the design) or with only number of entries equal to the number of active entries in the design. This is determined based on how PyOED is configured globally through
pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method project_onto_active_design_space().
- is_observation_vector(observation, ignore_design=False)[source]#
Test whether the passed observation vector is valid or not. The observation is valid if it is a 1d numpy array (when squeezed) with size equal to the underlying observation size. The observation size is determined based on whether PyOED is configured to project observations on the design space (default) or not as given by pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method
project_onto_active_design_space(). This is ignored when ignore_design is set to True.
Note
- The flag ignore_design is added for two main purposes:
greater flexibility,
validation of observations when data assimilation is required. Specifically, in DA, we always need the full observation vector, as the design can change and we still want to solve the inverse problem with different designs. This can be done either by re-registering the observations with different designs, or by keeping the full observation and only extracting the relevant parts.
- Parameters:
observation – an observation vector
ignore_design – if True the experimental design is ignored while validating the passed observation. In this case, the vector is expected to be of size equal to the dimension of the observation space when the design is all ones.
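The size rule described above can be sketched as a standalone function. The name `is_observation_vector_sketch` and the explicit `project` argument (standing in for the global setting) are assumptions made for this illustration; the actual method reads these from the operator and from PyOED's settings.

```python
import numpy as np


def is_observation_vector_sketch(observation, design, project=True,
                                 ignore_design=False):
    """Return True if `observation` has the expected 1-D size (illustrative)."""
    obs = np.atleast_1d(np.squeeze(np.asarray(observation)))
    if obs.ndim != 1:
        return False
    design = np.asarray(design, dtype=bool)
    if project and not ignore_design:
        expected = int(design.sum())   # active entries only
    else:
        expected = design.size         # full observation space
    return obs.size == expected


design = [True, False, True]
ok_active = is_observation_vector_sketch(np.zeros(2), design)
ok_full = is_observation_vector_sketch(np.zeros(3), design, ignore_design=True)
bad = is_observation_vector_sketch(np.zeros(3), design)
```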
- update_design(design)[source]#
Update the experimental design.
- Parameters:
design – the design used to update the operator’s configurations
- Raises:
PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size
- validate_observation_vector(observation, ignore_design=False)[source]#
Validate that observation is a conformable observation vector, raising on failure.
This is a strict variant of is_observation_vector(): it raises a TypeError when the observation is invalid instead of returning False. Used internally before computations that require a valid observation.
- Parameters:
observation (numpy.ndarray) – the candidate observation array to validate.
ignore_design (bool) – when True, the check ignores the experimental design and requires the observation to have size equal to the full observation space (all sensors active). Defaults to not pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE.
- Raises:
TypeError – if observation is not a valid 1-D array of the expected size.
- update_configurations(**kwargs)[source]#
Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid
- Raises:
TypeError – if any of the passed keys in kwargs is invalid/unrecognized
- Jacobian_T_matvec(observation, eval_at, **kwargs)[source]#
Compute the adjoint (transpose-Jacobian) action on an observation vector.
Evaluates \(\mathbf{H}(\text{eval\_at})^T \, \text{observation}\), where \(\mathbf{H}\) is the Jacobian (tangent-linear model, TLM) of this observation operator evaluated at eval_at.
This method is required by gradient-based variational assimilation algorithms (e.g., 4D-Var) to backpropagate observation-space residuals into the state space. It is not needed by ensemble-based methods (e.g., EnKF).
- Parameters:
observation (numpy.ndarray) – a vector in observation space (size observation_size), typically the innovation or adjoint forcing.
eval_at (numpy.ndarray) – the model state at which the Jacobian is evaluated (size equal to the state dimension \(n_x\)).
kwargs – additional keyword arguments (implementation-specific).
- Returns:
the adjoint action vector in state space (size \(n_x\)).
- Return type:
- Raises:
NotImplementedError – always, unless overridden by a concrete subclass.
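For a linear operator the Jacobian is the matrix itself, so a concrete subclass could implement the adjoint action as a plain transpose product. The sketch below is a standalone illustration (the matrix `H` and the function `jacobian_T_matvec` are made up for the example) and includes the usual dot-product test \(\langle \mathbf{H}x, y\rangle = \langle x, \mathbf{H}^T y\rangle\).

```python
import numpy as np

# Jacobian of a toy linear observation operator, shape (n_y, n_x) = (2, 3).
H = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])


def jacobian_T_matvec(observation, eval_at=None):
    """Return H^T @ observation; eval_at is unused because H is constant."""
    return H.T @ np.asarray(observation, dtype=float)


x = np.array([1.0, -1.0, 2.0])   # a state-space vector
y = np.array([0.5, 2.0])         # an observation-space vector

# Adjoint (dot-product) test: both inner products must agree.
lhs = float((H @ x) @ y)
rhs = float(x @ jacobian_T_matvec(y, eval_at=x))
```

For a nonlinear operator, `eval_at` would select the linearization point and the same test applies to the tangent-linear and adjoint pair.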
- copy()[source]#
Return a shallow copy of this observation operator by creating a new instance of this object with the registered configurations. NO DEEP COPY of the objects in the configurations dictionary is made.
Note
One can create a deepcopy method, but this will be generally problematic as the objects in the configurations dictionary might not be serializable, and we have to deal with errors as well. This is handled in some way by the parameters of the pyoed.configs.PyOEDConfigs.asdict() used here.
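The practical consequence of a shallow copy is that the new instance shares the objects stored in its configurations with the original. A minimal sketch, assuming a plain dictionary of configurations (the class `ShallowCopySketch` is hypothetical and does not reproduce PyOED's configuration machinery):

```python
class ShallowCopySketch:
    """Illustrates shallow-copy semantics only (not a PyOED class)."""

    def __init__(self, configs):
        # A new dictionary is created, but its values are the same objects.
        self.configs = dict(configs)

    def copy(self):
        return type(self)(self.configs)


payload = [1, 2, 3]  # a mutable object held in the configurations
original = ShallowCopySketch({"payload": payload, "verbose": False})
clone = original.copy()

clone.configs["verbose"] = True       # rebinding a key affects the clone only
clone.configs["payload"].append(4)    # mutating a shared object affects both
```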
- abstractmethod apply(state, **kwargs)[source]#
Apply the observation operator to a model state vector.
Maps the state vector from the model space (\(\mathbb{R}^{n_x}\)) to the observation space (\(\mathbb{R}^{n_y}\)). When the experimental design is active and project_onto_active_design_space is True, the returned vector contains only the components corresponding to active (non-zero design) sensors.
- Parameters:
state (numpy.ndarray) – the model state vector of size \(n_x\).
kwargs – additional keyword arguments (implementation-specific).
- Returns:
the observation vector of size observation_size (either active_size or the full \(n_y\) depending on project_onto_active_design_space).
- Return type:
- property supports_relaxation#
Flag to check whether the observation operator supports relaxation or not. This should always be False. It is only added for consistency with the base error model.
- property active#
Flags (boolean 1d array-like) corresponding to active/inactive dimensions according to the experimental design. This vector is of exactly the same size as the observation vector (ignoring the experimental design).
Warning
Because the return type is bool, this vector can be used to slice the vector. To avoid any confusion, however, we decided to add a clearer property active_indexes which returns the indexes of the variable/vector corresponding to non-zero design values. Thus, one should always resort to using active_indexes when extracting active entries of the variable (according to the design) is required.
- property active_indexes#
An array of the indexes of the observation vector that are active (associated with a non-zero design) according to the experimental design.
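The distinction between a boolean mask and explicit indexes matters in NumPy because a 0/1 integer array used as an index performs fancy indexing rather than masking. The sketch below uses plain NumPy arrays (not PyOED objects) to show why explicit indexes are the safer choice.

```python
import numpy as np

design = np.array([1, 0, 1, 1, 0])           # binary design stored as integers
active = design.astype(bool)                 # boolean mask, same length as y
active_indexes = np.flatnonzero(design)      # positions of non-zero entries

y = np.array([10.0, 11.0, 12.0, 13.0, 14.0])

by_mask = y[active]             # boolean masking: keeps entries 0, 2, 3
by_index = y[active_indexes]    # integer indexing: same entries
wrong = y[design]               # 0/1 integers are read as positions 1, 0, 1, 1, 0
```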
- property design#
Get a COPY of the design vector
- abstract property shape#
The shape of the observation operator discarding any experimental design. An observation operator maps the model state onto the observation space. Thus, the shape of the observation operator here is defined as a tuple of the form \((ny, nx)\) where \(ny\) is the size of the observation vector, and \(nx\) is the size of the state vector.
Note
The shape of the operator is INDEPENDENT from the design and defines the codomain of the operator \(ny\) assuming all entries of the design are active. To get the shape of the operator corresponding to the active entries of the design only use
active_shape().
- property active_shape#
The shape of the observation operator restricted to the active entries of the experimental design. An observation operator maps the model state onto the observation space. Thus, the active shape is a tuple of the form \((ny_a, nx)\) where \(ny_a\) is the number of active (non-zero design) entries of the observation vector, and \(nx\) is the size of the state vector.
Note
Unlike shape, which is INDEPENDENT from the design and assumes all entries of the design are active, the active shape changes with the design.
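As a small illustration of how the design-independent shape relates to the shape over active entries, the following sketch assumes that the active shape simply counts the non-zero design entries. The helper `shapes_sketch` is hypothetical.

```python
import numpy as np


def shapes_sketch(n_x, design):
    """Return (full_shape, active_shape) for a design over n_y sensors."""
    design = np.asarray(design, dtype=bool)
    full_shape = (design.size, n_x)            # all design entries counted
    active_shape = (int(design.sum()), n_x)    # assumption: active entries only
    return full_shape, active_shape


full_shape, active_shape = shapes_sketch(n_x=3, design=[1, 0, 0, 1])
```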
- property project_onto_active_design_space#
Whether all operations assume a projection onto the active subspace. When projection is enabled, any zero entry of the design vector is omitted.
- design_manager(design)[source]#
Create and return a context manager that enables updating the design to the passed value, executing the needed code, and then resetting the design to the original value.
Assuming obj is this object, and val is the design value with which one wants to run the method obj.run_code(), the following code can be used:
with obj.design_manager(val) as mngr:
    mngr.run_code()
- Returns:
a reference to self that enables calling any function under self and then automatically, the design is reset.
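One common way to build such a temporary-override manager is with `contextlib.contextmanager`. The sketch below shows the pattern on a toy class; `DesignHolderSketch` and `active_size` are hypothetical, and this is not necessarily how PyOED implements `design_manager`.

```python
from contextlib import contextmanager


class DesignHolderSketch:
    """Toy object with a design that can be overridden temporarily."""

    def __init__(self, design):
        self.design = list(design)

    def update_design(self, design):
        self.design = list(design)

    @contextmanager
    def design_manager(self, design):
        original = self.design
        self.update_design(design)
        try:
            yield self                         # caller works with the new design
        finally:
            self.update_design(original)       # always restore, even on errors

    def active_size(self):
        return sum(bool(d) for d in self.design)


obj = DesignHolderSketch([1, 1, 1])
with obj.design_manager([1, 0, 0]) as mngr:
    inside = mngr.active_size()
after = obj.active_size()
```

The `try`/`finally` ensures the original design is restored even if the code inside the `with` block raises.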
- property is_time_dependent#
By default, observation operators are not time dependent. If an observation operator is time dependent, this property must return True; otherwise, no time will be passed to its methods.
Warning
This property probably should be replaced with a dynamic property where a test is carried out to assure whether a model is time-dependent or not.
- class TimeDependentObservationOperatorConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_observation_operator=None, base_observation_operator_configs=None)[source]#
Bases:
ObservationOperatorConfigs
Configuration settings for TimeDependentObservationOperator.
Note
This implementation only allows time-dependent experimental design, but does not allow time-dependent moments (e.g., mean/variance) since the model is agnostic to those at this point. We can consider setters/getters __setattr__ __getattr__ but this will complicate things at the beginning. This will evolve and decisions will be made!
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
design (dict | None) – a time-dependent experimental design to define active/inactive entries as a function of time. This is a dictionary with keys set to checkpoints (times); the value at each key is the experimental design passed to the underlying base_observation_operator for evaluation at that time.
base_observation_operator (ObservationOperator | Type[ObservationOperator] | None) –
the observation operator used for all time points:
An observation operator instance (object that inherits ObservationOperator). In this case, the observation operator is registered as is and is updated with the passed configurations if available.
The class (subclass of ObservationOperator) to be used to instantiate the observation operator.
base_observation_operator_configs – the configurations of the base observation operator. If not passed, default configurations are used.
- base_observation_operator: ObservationOperator | Type[ObservationOperator] | None#
- base_observation_operator_configs: ObservationOperatorConfigs | Type[ObservationOperatorConfigs] | dict | None#
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', design=None, base_observation_operator=None, base_observation_operator_configs=None)#
- class TimeDependentObservationOperator(configs=None)[source]#
Bases:
ObservationOperator
This class provides a general implementation for time-dependent observation operators.
Note
By time-dependence we initially mean that the design changes over time. We can extend that to the case where the moments of the distribution, or even the observation operators themselves, vary over time. In this case, we can replace self.configurations.design with something like self._DATA being a dictionary indexed by time indexes with values set to the corresponding observation operator. This, however, will be a bit more expensive, and will be reconsidered over time.
Note
The implementation here is agnostic to the specific choice of the underlying observation operators, and only allows its configurations (specifically the design) to vary over time.
- validate_configurations(configs, raise_for_invalid=True)[source]#
Validate passed configurations.
- Parameters:
configs (dict | TimeDependentObservationOperatorConfigs) – configurations to validate. If a
TimeDependentObservationOperatorConfigsobject is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.
- update_configurations(**kwargs)[source]#
Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid
- Raises:
PyOEDConfigsValidationError – if any of the passed keys in kwargs is invalid/unrecognized
- update_design(design)[source]#
Update the experimental design in the configurations object. The design must be None, or a dictionary containing keys as valid times (non-negative floats), and values holding valid experimental designs that can be associated with the underlying base observation operator. If the base observation operator is registered, the designs in the passed dictionary are validated against it.
- Parameters:
design – the design used to update the observation operator’s configurations
- Raises:
PyOEDConfigsValidationError – if the passed design is of invalid type/shape/size
- update_base_observation_operator(base_observation_operator, base_observation_operator_configs=None)[source]#
Update the existing base observation operator with optional configurations. The operator is tested against the configured design.
- Parameters:
base_observation_operator (ObservationOperator | Type[ObservationOperator]) –
the observation operator used for all time points:
An observation operator instance (object that inherits ObservationOperator). In this case, the observation operator is registered as is and is updated with the passed configurations if available.
The class (subclass of ObservationOperator) to be used to instantiate the observation operator.
base_observation_operator_configs (ObservationOperatorConfigs | Type[ObservationOperatorConfigs] | None) – the configurations of the base observation operator. If not passed, default configurations are used.
- register_time(t, design, overwrite=False)[source]#
Register a design at a time instance t.
- Parameters:
t – the time at which the design is registered.
design – the design to register at the time t.
overwrite – if the time t exists, overwrite the corresponding/current design. If False and the time exists, a ValueError exception is raised
- Raises:
ValueError – if the time t is registered and overwrite is False.
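The registration semantics (reject a duplicate time unless `overwrite` is set, and raise when looking up an unregistered time) can be sketched with a plain dictionary keyed by rounded times. The class `TimeDesignRegistrySketch` and its fixed `precision` are assumptions for illustration only.

```python
class TimeDesignRegistrySketch:
    """Toy registry of designs keyed by time (not a PyOED class)."""

    def __init__(self, precision=8):
        self.precision = precision   # digits kept when rounding time keys
        self._designs = {}

    def register_time(self, t, design, overwrite=False):
        key = round(float(t), self.precision)
        if key in self._designs and not overwrite:
            raise ValueError(f"time {key} is already registered")
        self._designs[key] = list(design)

    def check_registered_time(self, t):
        key = round(float(t), self.precision)
        if key not in self._designs:
            raise ValueError(f"time {key} is not registered")
        return key

    @property
    def checkpoints(self):
        return sorted(self._designs)   # ascending registered times


registry = TimeDesignRegistrySketch()
registry.register_time(0.5, [1, 0])
registry.register_time(0.1, [1, 1])
try:
    registry.register_time(0.5, [0, 1])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
registry.register_time(0.5, [0, 1], overwrite=True)
```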
- check_registered_time(t)[source]#
Check if the passed time is registered or not. If available, return the registered time, otherwise raise ValueError.
- Parameters:
t – the time to check if registered or not.
- Raises:
ValueError – if the time t is not registered.
- apply(state, eval_at_t)[source]#
Given a state vector (laid out as defined by the model for which observation operator is defined), evaluate the observation vector.
- Parameters:
state – the model state vector to observe.
eval_at_t – the time at which the observation operator is evaluated.
- Returns:
an observation vector/instance
- Raises:
ValueError – if the time eval_at_t is not registered.
- observation_vector(t, init_val=0.0)[source]#
Create an observation vector by employing the observation operator settings at time instance t.
Note
This method creates an observation vector with the size equal to either the full observation vector (ignoring the design) or with only number of entries equal to the number of active entries in the design. This is determined based on how PyOED is configured globally through
pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method project_onto_active_design_space().
- Parameters:
- Returns:
an observation vector identical to state vectors generated by the underlying model
- Raises:
ValueError – if the time t is not registered.
- is_observation_vector(observation, t, ignore_design=False)[source]#
Test whether the passed observation vector is valid or not given the settings of the observation operator at time instance t. The observation is valid if it is a 1d numpy array (when squeezed) with size equal to the underlying observation size. The observation size is determined based on whether PyOED is configured to project observations on the design space (default) or not as given by pyoed.SETTINGS.PROJECT_ONTO_ACTIVE_DESIGN_SPACE which is referenced by the underlying method
project_onto_active_design_space(). This is ignored when ignore_design is set to True.
Note
The flag ignore_design is added for two main purposes:
greater flexibility,
validation of observations when data assimilation is required. Specifically, in DA, we always need the full observation vector, as the design can change and we still want to solve the inverse problem with different designs. This can be done either by re-registering the observations with different designs, or by keeping the full observation and only extracting the relevant parts.
- Parameters:
observation – an observation vector
t (float) – observation time.
ignore_design – if True the experimental design is ignored while validating the passed observation. In this case, the vector is expected to be of size equal to the dimension of the observation space when the design is all ones.
- Raises:
ValueError – if the time t is not registered.
- Jacobian_T_matvec(observation, eval_at_t, eval_at=None)[source]#
Matrix-free product of the transpose of the Jacobian (tangent-linear model, TLM) of the observation operator with the passed observation. For nonlinear operators, the Jacobian is evaluated at eval_at.
- Parameters:
observation – observation instance/vector
eval_at – state around which observation operator is linearized.
eval_at_t – the time at which the observation operator is evaluated.
- Returns:
result of multiplying the transpose of derivative of the observation operator by the passed observation
- Raises:
TypeError – if the passed observation is invalid (type/shape/etc.)
- validate_observation_vector(observation, t, ignore_design=False)[source]#
Check if the observation vector is valid. If not, raise a TypeError. This is mostly for internal validation.
- Raises:
ValueError – if the time t is not registered.
- property design#
The experimental design dictionary
- property checkpoints#
The times associated with the registered experimental design. The checkpoints are ordered in ascending order.
- property base_observation_operator#
Reference to the configured base observation operator
- property time_precision#
The number of digits used for rounding to maintain predefined precision in time comparisons. This depends on the class variable _TIME_EPS which defines the floating-point accuracy.
Note
The sole purpose of this attribute is to apply np.round(t, time_precision) so that we can properly store/lookup keys in the time-dependent objects (design and checkpoints).
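Rounding matters here because floating-point times that should be equal often differ in the last bits, which breaks dictionary lookups. The sketch below assumes `_TIME_EPS = 1e-8` and derives the number of digits as \(-\log_{10}\) of it; both the value and the derivation are assumptions, not taken from PyOED.

```python
import math

import numpy as np

_TIME_EPS = 1e-8  # assumed value of the class-level tolerance


def time_precision(eps=_TIME_EPS):
    """Number of decimal digits implied by eps (assumed relation)."""
    return int(round(-math.log10(eps)))


designs = {}
t_stored = 0.1 + 0.2   # differs from 0.3 in the last bits
designs[np.round(t_stored, time_precision())] = [1, 0, 1]

# Looking up with the "clean" time succeeds because both keys round to 0.3.
lookup = designs.get(np.round(0.3, time_precision()))
raw_lookup = designs.get(0.1 + 0.2)   # unrounded key is not in the dictionary
```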
- property observation_size: dict#
Return a dictionary holding sizes of observation vectors for each registered time.
- property shape#
Dimension of the observation space associated with the underlying base model
- property active#
Return a dictionary with keys identical to self.design, with values obtained by finding active entries (as bool) in the designs associated with each of the checkpoints.
- property active_shape#
The shape of the observation operator discarding any experimental design. An observation operator maps the model state onto the observation space. Thus, the shape of the observation operator here is defined as a tuple of the form \((ny, nx)\) where \(ny\) is the size of the observation vector, and \(nx\) is the size of the state vector.
Note
The shape of the operator is INDEPENDENT from the design and defines the codomain of the operator \(ny\) assuming all entries of the design are active. To get the shape of the operator corresponding to the active entries of the design only use
active_shape().
- property active_indexes#
A dictionary holding, for each registered time, an array of the indexes of the observation vector that are active (associated with a non-zero design) according to the experimental design.
- property random_seed#
Registered random seed (if available in configurations)
- property is_time_dependent#
This model is time-dependent. This property will allow working with these models in the core modules without creating circular imports