Probabilistic (Stochastic) Binary Optimization#
A module providing access to multiple binary optimization routines.
- BinaryReinforceOptimizerConfigs: Configuration dataclass for the BinaryReinforceOptimizer.
- BinaryReinforceOptimizer: Implementation of the REINFORCE algorithm for binary optimization (maximization or minimization).
- BinaryReinforceOptimizerResults: Container for results/data generated by the BinaryReinforceOptimizer.
- BinarySAAOptimizer: Stochastic Average Approximation (SAA) algorithm for binary optimization.
- class BinaryReinforceOptimizerConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, name='BinaryReinforceOptimizer: Binary Optimization via Stochastic Learning', screen_output_iter=1, file_output_iter=100, fun=None, initial_policy=None, maximize=True, def_stepsize=0.1, decay_step=False, Newton_step=False, stochastic_gradient_sample_size=32, baseline='optimal', optimal_baseline_batch_size=32, optimal_sample_size=5, objective_value_tracker=None, maxiter=10000, pgtol=1e-12, monitor_iterations=True, random_seed=None, antithetic=True, uniform_random_sample_size=0, num_resets=0)[source]#
Bases: OptimizerConfigs
Configuration dataclass for the BinaryReinforceOptimizer.
- Parameters:
verbose (bool) – a boolean flag to control verbosity of the object.
debug (bool) – a boolean flag that enables adding extra functionality in a debug mode
output_dir (str | Path) – the base directory where the output files will be saved.
name (str) – name of the optimizer.
screen_output_iter (int) – iteration interval for screen output. Default is 1. Note that this should be a positive integer to enforce proper effect.
file_output_iter (int) – iteration interval for file output. Default is 100. Note that this should be a positive integer to enforce proper effect.
size (int | None) – dimension of the target/solution variable. Must be a positive integer
fun (Callable | None) – the objective function to be optimized. For a given binary state/design of size size, it returns a scalar.
initial_policy (ndarray | Iterable | None) – the initial probabilistic policy parameter (activation/success probability of the Bernoulli distribution)
maximize (bool) – if True the objective is to be maximized, otherwise minimization is assumed
def_stepsize (float) – default step size (learning rate) in the direction (or the opposite) of the gradient. Default is 0.1.
decay_step (bool) – if True reduce the learning rate (step size) as the algorithm proceeds
Newton_step (bool) – if True use second-order derivative to define the update direction
stochastic_gradient_sample_size (int) – size of the sample to generate (from current policy at each iteration) to estimate (using MC) the policy gradient
baseline (str | None) –
name/type of the baseline used to reduce variability of the stochastic estimate of the gradient to use (if any); possible values are None, ‘none’, ‘optimal’, ‘heuristic’. These are explained as follows:
'none' | None: no baseline is used.
'heuristic': evaluate an empirical baseline \((J(0)+J(1))/2\) with \(J\) being the passed fun.
'optimal': a constant baseline is found by approximating the optimal baseline, where optimality is defined as minimum variability of the estimator.
optimal_baseline_batch_size (int) – size of the batch/sample to use for optimal baseline estimation; used if baseline=='optimal'.
optimal_sample_size (int) – size of the sample to return from the final/optimal policy upon termination of the optimizer
objective_value_tracker (dict | None) –
a dictionary that holds values of the objective function for sampled binary vectors. This can be helpful if the optimizer was run previously and objective values have been saved. This is used as the initializer of the tracker used while solving the optimization problem.
Note
The objective_value_tracker in the configurations object is not updated as the solver proceeds. It is only used as the initializer of the internal tracker. The results returned from the optimizer contain another copy of the objective_value_tracker with updated sampled binary vectors and corresponding objectives.
uniform_random_sample_size (int) – size of a random sample generated uniformly (0.5 probability) for performance evaluation.
num_resets (None | int) – number of resets during optimization
- fun: Callable | None#
- initial_policy: ndarray | Iterable | None#
- maximize: bool#
- def_stepsize: float#
- decay_step: bool#
- Newton_step: bool#
- stochastic_gradient_sample_size: int#
- baseline: str | None#
- optimal_baseline_batch_size: int#
- optimal_sample_size: int#
- objective_value_tracker: dict | None#
- maxiter: int#
- pgtol: float#
- monitor_iterations: bool#
- random_seed: int | None#
- antithetic: bool#
- uniform_random_sample_size: int#
- num_resets: None | int#
- __init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, name='BinaryReinforceOptimizer: Binary Optimization via Stochastic Learning', screen_output_iter=1, file_output_iter=100, fun=None, initial_policy=None, maximize=True, def_stepsize=0.1, decay_step=False, Newton_step=False, stochastic_gradient_sample_size=32, baseline='optimal', optimal_baseline_batch_size=32, optimal_sample_size=5, objective_value_tracker=None, maxiter=10000, pgtol=1e-12, monitor_iterations=True, random_seed=None, antithetic=True, uniform_random_sample_size=0, num_resets=0)#
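For orientation, a minimal sketch of assembling these configurations (the objective below is a hypothetical example; field names follow the signature above):

    import numpy as np

    # Hypothetical objective: reward binary designs that match a fixed target pattern.
    target = np.array([1, 0, 1, 1, 0], dtype=bool)

    def objective(design):
        return -float(np.sum(np.asarray(design, dtype=bool) ^ target))

    configs = BinaryReinforceOptimizerConfigs(
        size=5,                              # dimension of the binary design
        fun=objective,                       # scalar objective of a binary design
        maximize=True,                       # maximize E[J(d)]
        initial_policy=np.full(5, 0.5),      # start from an uninformative policy
        stochastic_gradient_sample_size=32,  # MC sample size per iteration
        baseline="optimal",                  # variance-reduction baseline
        maxiter=2000,
        random_seed=123,
    )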
- class BinaryReinforceOptimizerResults(*, optimal_policy=<factory>, optimal_policy_sample=<factory>, optimal_policy_sample_objval=<factory>, optimal_design=<factory>, optimal_design_objval=None, optimization_trajectory=<factory>, optimization_trajectory_stoch_objval=<factory>, optimization_trajectory_best_sample=<factory>, optimization_trajectory_best_sample_objval=<factory>, best_along_the_route=<factory>, converged=False, maxiter_reached=False, message=<property object>, objective_value_tracker=<factory>, sampled_design_indexes=<factory>, num_resets=None, reset_at_iterations=<factory>, uniform_random_sample_indexes=None, uniform_random_sample_objval=None, configurations=None)[source]#
Bases: OptimizerResults
Container for results/data generated by the BinaryReinforceOptimizer.
- Parameters:
optimal_policy (ndarray) – the parameters of the optimal policy (probability of activation of each entry of the design)
optimal_policy_sample (ndarray) – design sample generated by sampling the optimal policy (Bernoulli samples with probabilities set to the optimal policy parameters optimal_policy)
optimal_policy_sample_objval (ndarray) – value of the objective function \(J\) evaluated at each design in optimal_policy_sample
optimal_design (ndarray) – the optimal design (binary vector) sampled from the optimal_policy.
optimal_design_objval (float | None) – the value of the objective function evaluated at the above optimal design.
optimization_trajectory (ndarray) – trajectory (consecutive parameters) of the policy parameters generated along the optimization iterations; populated only if monitor_iterations configuration is set to True.
optimization_trajectory_stoch_objval (ndarray) – value (MC estimate) of the stochastic objective function, that is \(E[J]\) along the optimization trajectory; populated only if monitor_iterations configuration is set to True.
optimization_trajectory_best_sample (ndarray) – the best sample point at each iteration. This is the best sampled design at each iteration. For maximization, it is the sampled design with maximum objective value (among the sample)
optimization_trajectory_best_sample_objval (ndarray) – the objective value corresponding to each design in optimization_trajectory_best_sample.
converged (bool) – True if the algorithm converged, False otherwise.
message (str) – a string message describing the final status of the algorithm
objective_value_tracker (dict | None) – a dictionary containing the value of the objective function evaluated through the course of the algorithm. The dictionary is indexed by the integer indexes of the generated binary states.
sampled_design_indexes (list[list[int]]) – a list (of lists) containing all sampled design indexes for each iteration of the algorithm. You can generate the binary designs themselves by calling pyoed.utility.math.index_to_binary_state() on the indices.
best_along_the_route (dict) – a dictionary with two entries/keys (str):
'objval': this key/value entry holds the best objective value seen during the exploration (before collecting the sample from the optimal policy).
'designs': a list of binary designs corresponding to the objective value 'objval'.
num_resets (int | None) – number of resets during optimization (if set in the configurations). If given, it must be a positive integer indicating the number of times the optimizer is reset to initial_policy after convergence. If either 0 or None is given, num_resets is set to 0 and no resets are carried out.
reset_at_iterations (list) – a list holding the iterations after which the solver was reset (if a valid value of num_resets is set in the configurations).
configurations (None | dict) – a dictionary holding problem configurations. This can be obtained by calling asdict() on the configurations object used by the optimizer, e.g., BinaryReinforceOptimizerConfigs.
- optimal_policy: ndarray#
- optimal_policy_sample: ndarray#
- optimal_policy_sample_objval: ndarray#
- optimal_design: ndarray#
- optimal_design_objval: float | None#
- optimization_trajectory: ndarray#
- optimization_trajectory_stoch_objval: ndarray#
- optimization_trajectory_best_sample_objval: ndarray#
- best_along_the_route: dict#
- converged: bool#
- maxiter_reached: bool#
- objective_value_tracker: dict | None#
- sampled_design_indexes: list[list[int]]#
- num_resets: int | None#
- reset_at_iterations: list#
- optimization_trajectory_best_sample: ndarray#
- uniform_random_sample_indexes: None | list[int]#
- uniform_random_sample_objval: None | list[float]#
- configurations: None | dict#
- property x: ndarray#
The optimal solution of the optimization problem.
- property success: bool#
Whether or not the optimizer exited successfully.
- message: str#
- property fun: float#
Value of objective function at the optimal solution x.
- property jac: ndarray#
Value of the Jacobian of the objective function at the optimal solution x (if available).
- property hess: object#
Value of the Hessian of the objective function at the optimal solution x (if available).
- property hess_inv: object#
Value of the inverse of Hessian of the objective function at the optimal solution x (if available).
- property nfev: int#
Number of evaluations of the objective function.
- property njev: int#
Number of evaluations of the Jacobian of the objective function.
- property nhev: int#
Number of evaluations of the Hessian of the objective function.
- property nit: int#
Number of iterations performed by the optimizer.
- __init__(*, optimal_policy=<factory>, optimal_policy_sample=<factory>, optimal_policy_sample_objval=<factory>, optimal_design=<factory>, optimal_design_objval=None, optimization_trajectory=<factory>, optimization_trajectory_stoch_objval=<factory>, optimization_trajectory_best_sample=<factory>, optimization_trajectory_best_sample_objval=<factory>, best_along_the_route=<factory>, converged=False, maxiter_reached=False, message=<property object>, objective_value_tracker=<factory>, sampled_design_indexes=<factory>, num_resets=None, reset_at_iterations=<factory>, uniform_random_sample_indexes=None, uniform_random_sample_objval=None, configurations=None)#
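A short, hypothetical illustration of inspecting a results object returned by the optimizer (attribute and property names as listed above; values are placeholders):

    # `results` is assumed to be a BinaryReinforceOptimizerResults instance
    # returned by BinaryReinforceOptimizer.solve().
    print("converged:", results.converged)
    print("optimal solution x:", results.x)    # see the x property above
    print("objective at x:", results.fun)      # see the fun property above
    print("iterations:", results.nit)
    print("best objective seen:", results.best_along_the_route["objval"])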
- class BinaryReinforceOptimizer(configs=None)[source]#
Bases: Optimizer, RandomNumberGenerationMixin
Implementation of the REINFORCE algorithm for binary optimization (maximization or minimization). The algorithm assumes a probabilistic policy following a multivariate Bernoulli distribution. Thus, the policy is parameterized by the probability of success (policy_parameters/theta).
Details of the stochastic learning approach to binary optimization can be found in [1].
This is an implementation of the REINFORCE algorithm for stochastic binary optimization. The objective is to solve one of the following optimization problems \(\max_{d} E[J(d)]\) or \(\min_{d} E[J(d)]\) based on the value of maximize. The optimization is assumed to be unconstrained and any soft constraints are already absorbed into fun passed in the configurations dictionary.
- Parameters:
configs (BinaryReinforceOptimizerConfigs | dict | None) – Configurations of the BinaryReinforceOptimizer, either as an instance of BinaryReinforceOptimizerConfigs, a dictionary, or None. The default is None, in which case the default configurations provided by BinaryReinforceOptimizerConfigs are used.
References:
Ahmed Attia, Sven Leyffer, and Todd S. Munson. “Stochastic Learning Approach for Binary Optimization: Application to Bayesian Optimal Design of Experiments.” SIAM Journal on Scientific Computing 44, no. 2 (2022): B395-B427.
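A minimal end-to-end sketch of using the optimizer (the import path and the toy objective are illustrative assumptions, not prescriptions):

    import numpy as np
    # Assumed import path; adjust to where the classes live in your installation.
    from pyoed.optimization.binary_optimization import (
        BinaryReinforceOptimizer,
        BinaryReinforceOptimizerConfigs,
    )

    # Toy objective: favor designs with roughly three active entries.
    def objective(design):
        s = float(np.asarray(design, dtype=float).sum())
        return s - 0.5 * max(0.0, s - 3.0) ** 2

    configs = BinaryReinforceOptimizerConfigs(size=8, fun=objective, maximize=True)
    optimizer = BinaryReinforceOptimizer(configs)
    results = optimizer.solve()
    print(results.optimal_design, results.optimal_design_objval)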
- validate_configurations(configs, raise_for_invalid=True)[source]#
Check the passed configurations and make sure they are conformable with each other, and with the current configurations once combined. This guarantees that any key-value pair passed in configs can be properly used.
Note
Here only the locally-defined configurations in BinaryReinforceOptimizerConfigs are validated. Finally, the validators of the super classes are called.
- Parameters:
configs (dict | BinaryReinforceOptimizerConfigs) – full or partial (subset) configurations to be validated
raise_for_invalid (bool) – if True, raise TypeError for an invalid configuration type/key. Default is True.
- Returns:
flag indicating whether the passed configurations dictionary is valid or not
- Raises:
AttributeError – if any (or a group) of the configurations does not exist in the model configurations BinaryReinforceOptimizerConfigs.
PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.
- Return type:
bool
- update_configurations(**kwargs)[source]#
Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.
- Raises:
PyOEDConfigsValidationError – if invalid configurations passed
- reset_objective_value_tracker()[source]#
Reset the objective value tracker and set it to the tracker in the configurations.
- clear_objective_value_tracker()[source]#
Remove all entries from the objective value tracker; set it to empty dictionary.
- update_random_number_generators(random_seed)[source]#
Update the underlying random number generator(s) given the passed random seed. This updates two random number generators: the one owned by this object and the generator associated with the underlying Bernoulli distribution.
- objective_function_value(design)[source]#
A function to prevent redundant computations of the objective (utility function) by looking up the tracker before evaluation.
- Parameters:
design – binary state/design variable to evaluate objective value at
- Returns:
value of the objective/utility function at the passed design
- Raises:
TypeError – if the passed design is not of the right size
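The tracker acts as a memoization cache keyed by the integer index of each binary design; a rough sketch of the idea (index_from_design is a hypothetical stand-in for the package's indexing utility):

    import numpy as np

    def index_from_design(design):
        # Hypothetical helper: encode a binary design as a unique integer index.
        bits = np.asarray(design, dtype=int)
        return int("".join(map(str, bits)), 2)

    def tracked_objective_value(design, fun, tracker):
        """Look up the tracker before evaluating the (possibly expensive) objective."""
        key = index_from_design(design)
        if key not in tracker:
            tracker[key] = fun(design)
        return tracker[key]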
- stochastic_policy_objective_value(theta, fun=None, sample=None, sample_size=32, exact=False)[source]#
Evaluate the value (exact expectation or approximate using MC sampling) of the stochastic objective function reformulation of the binary optimization problem.
Here, we define the stochastic objective function \(E[J(x)]\) where \(J\) is the deterministic objective fun (utility function) defined in the configurations, and \(x\) is a binary design variable following a multivariate Bernoulli distribution with parameter (success probability) \(\theta\).
If exact is False, this function returns a Monte Carlo (MC) approximation of the stochastic objective (stochastic average approximation). If exact is True, this function evaluates the expectation by enumeration (over all possible designs) and returns the expected value. This option is provided for testing purposes.
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).
sample (numpy.ndarray) – a sample generated from the passed policy (provided in case samples are generated by other routines). This is discarded if exact is True. Each row is one sample point.
sample_size (int) – number of sample points to generate (if sample is None and exact is False). This is ignored if sample is not None (sample_size is updated to the length of the provided sample in this case) or exact is True.
exact (bool) – whether to evaluate the exact (expected value over all possible binary designs) objective value or a Monte Carlo approximation (stochastic average approximation) of the objective/expectation.
- Returns:
value of the stochastic objective (exact or approximate)
- Return type:
float
- Raises:
TypeError – if a sample is provided but has the wrong number of columns (must be equal to the dimension of the underlying Bernoulli distribution).
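A rough, self-contained sketch of the Monte Carlo estimate described above (independent of the class internals; the sampling and objective handling are assumptions):

    import numpy as np

    def mc_stochastic_objective(theta, fun, sample_size=32, rng=None):
        """Sample-average approximation of E[J(x)] with x ~ Bernoulli(theta)."""
        rng = np.random.default_rng() if rng is None else rng
        theta = np.asarray(theta, dtype=float)
        sample = rng.random((sample_size, theta.size)) < theta  # Bernoulli draws
        return float(np.mean([fun(x) for x in sample]))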
- stochastic_policy_objective_gradient(theta, fun=None, sample=None, sample_objvals=None, sample_grad_log_pmf=None, sample_size=32, optimal_baseline=False, apply_projection=False, exact=False)[source]#
Evaluate the gradient (exact expectation gradient or approximate using MC sampling) of the stochastic objective function reformulation of the binary optimization problem. This evaluates the gradient of stochastic_policy_objective_value() with respect to the policy parameter theta, that is, the success/activation probability.
If exact is False, this function returns a Monte Carlo (MC) approximation of the gradient of the stochastic objective (stochastic average approximation). If exact is True, this function evaluates the expectation gradient by enumeration (over all possible designs) and returns the gradient of the expected value. This option is provided for testing purposes.
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).
sample (numpy.ndarray) – a sample generated from the passed policy (provided in case samples are generated by other routines). This is discarded if exact is True. Each row is one sample point.
sample_objvals – The objective value (fun(x) where fun is the registered function to be optimized) for each x in the passed sample.
sample_grad_log_pmf – The gradient of the logarithm of the probability mass function (grad-log-PMF) of the underlying Bernoulli (conditional or not) model, evaluated for each x in the passed sample.
sample_size (int) – number of sample points to generate (if sample is None and exact is False). This is ignored if sample is not None (sample_size is updated to the length of the provided sample in this case) or exact is True.
obj_val_tracker (dict) – optional dictionary holding values of the objective function (will be updated in-place if passed). Default is None.
apply_projection (bool) – if True, the gradient is projected (by truncation) onto the box [-1,1]. Default is False.
exact (bool) – whether to evaluate the exact (expected value over all possible binary designs) objective value or a Monte Carlo approximation (stochastic average approximation) of the objective/expectation.
Note
If a valid sample is passed, then sample_size is discarded. In this case, sample_size is set to the number of sample points in sample. Both sample and sample_size are, of course, discarded if exact is True.
- Returns:
gradient of the stochastic objective value (exact or approximate)
- Return type:
numpy.ndarray
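Schematically, the MC gradient estimator uses the standard REINFORCE identity \(\nabla_\theta E[J(x)] = E[J(x) \nabla_\theta \log p(x;\theta)]\); a sketch (not the class implementation) follows:

    import numpy as np

    def mc_policy_gradient(theta, fun, sample_size=32, baseline=0.0, rng=None):
        """MC estimate of the gradient of E[J(x)] w.r.t. Bernoulli parameters theta."""
        rng = np.random.default_rng() if rng is None else rng
        theta = np.asarray(theta, dtype=float)
        sample = rng.random((sample_size, theta.size)) < theta
        grad = np.zeros_like(theta)
        for x in sample:
            x = x.astype(float)
            # Gradient of the multivariate Bernoulli log-PMF.
            grad_log_pmf = x / theta - (1.0 - x) / (1.0 - theta)
            grad += (fun(x) - baseline) * grad_log_pmf
        return grad / sample_size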
- evaluate_pmf(design, theta)[source]#
Evaluate the value of the probability mass function (PMF) for a given sampled design.
- Parameters:
design – binary state/design variable at which to evaluate the PMF
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
- evaluate_grad_log_pmf(design, theta)[source]#
Evaluate the gradient of the log of PMF value for a given sampled design.
- Parameters:
design – binary state/design variable at which to evaluate the grad-log-PMF
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
- Returns:
the gradient of the log probability associated with the passed design
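For an unconditional multivariate Bernoulli model this gradient has a closed form (a standard identity, stated here for reference):
\[\log p(d; \theta) = \sum_{i} \left[ d_i \log \theta_i + (1 - d_i) \log (1 - \theta_i) \right], \qquad \frac{\partial \log p(d; \theta)}{\partial \theta_i} = \frac{d_i}{\theta_i} - \frac{1 - d_i}{1 - \theta_i}.\]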
- evaluate_grad_pmf(design, theta)[source]#
Evaluate the gradient of the PMF value for a given sampled design.
- Parameters:
design – binary state/design variable at which to evaluate the gradient of the PMF
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
- Returns:
the gradient of the PMF associated with the passed design
- generate_sample(theta, sample_size=1, dtype=<class 'bool'>)[source]#
Generate a sample of size sample_size for a given policy parameter (theta).
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
sample_size (int) – number of designs/states to sample from the Bernoulli distribution with the passed parameter theta
dtype (type) – data type to set for entries of the returned sample array
- Returns:
sample: 2d array of shape (sample_size x dim) where dim is the size of theta
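Conceptually, the sampling is equivalent to the following short NumPy sketch (not the internal implementation):

    import numpy as np

    def bernoulli_sample(theta, sample_size=1, dtype=bool, rng=None):
        """Draw sample_size independent designs from a multivariate Bernoulli(theta)."""
        rng = np.random.default_rng() if rng is None else rng
        theta = np.asarray(theta, dtype=float)
        return (rng.random((sample_size, theta.size)) < theta).astype(dtype)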
- optimal_baseline_estimate(theta, fun=None, sample=None, sample_objvals=None, sample_grad_log_pmf=None, batch_size=32)[source]#
Use the passed sample to calculate a sample-based estimate of the optimal baseline value.
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).
sample – Given sample
sample_objvals – The objective value (fun(x) where fun is the registered function to be optimized) for each x in the passed sample.
sample_grad_log_pmf – The gradient of the logarithm of the probability mass function (grad-log-PMF) of the underlying Bernoulli (conditional or not) model, evaluated for each x in the passed sample.
batch_size (int) – number of samples to use for the MC approximation. Default is 32.
- Returns:
optimal baseline estimate
- Return type:
float
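For background, a commonly used sample-based form of the variance-minimizing constant baseline is the following (the estimator used internally follows [1] and may differ in detail):
\[b^{*} \approx \frac{\sum_{k} J(x_k)\, \lVert \nabla_\theta \log p(x_k; \theta) \rVert^2}{\sum_{k} \lVert \nabla_\theta \log p(x_k; \theta) \rVert^2}.\]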
- bruteforce_objective_values(fun=None, num_active=None, objective_value_tracker=None)[source]#
Evaluate the value of the objective (fun) at all possible binary combinations.
- Returns:
a dictionary of objective values evaluated for each possible binary instance. The dictionary is indexed by unique integer indexes.
- Return type:
dict[int,float]
Note
This function should be used ONLY for testing in small-dimensional problems.
- full_Hessian_estimate(theta, fun=None, alpha=1.0, sample=None, sample_size=32)[source]#
Estimate the Hessian matrix of the stochastic objective function \(E[J(d)]\) for a given probability of success theta.
\[\alpha I + \mathcal{H}\]
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
alpha (float) – As defined in the equation above. Default is 1.0.
fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).
sample – samples to use in estimation. If not provided, they’re sampled. Default is None.
sample_size (int) – sample size to use in estimation. Default is 32.
- Returns:
estimate of the Hessian matrix
- Hessian_estimate_vector_prod(theta, x, fun=None, alpha=1.0, sample=None, sample_size=32)[source]#
Compute the product of the Hessian matrix (see full_Hessian_estimate()) of the objective function by a given vector x:
\[(\alpha I + \mathcal{H}) x\]
- Parameters:
theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)
x (np.ndarray) – vector to multiply the Hessian with.
alpha (float) – As defined in the equation above. Default is 1.0.
fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).
sample – samples to use in estimation. If not provided, they’re sampled. Default is None.
sample_size (int) – sample size to use in estimation if sample is None. Default is 32.
- Returns:
the product of the Hessian matrix of the objective function J at theta with x.
- Return type:
np.ndarray
- generate_uniform_random_sample(sample_size)[source]#
Sample the probability space uniformly (assuming 0.5 probabilities).
- Parameters:
sample_size (int) – sample size
- Returns:
sampled_design_indexes: a list of indexes of binary variables sampled
sampled_design_objval: a list containing objective values corresponding to sampled_design_indexes
- solve(initial_policy=None)[source]#
Start solving the binary optimization problem using the stochastic formulation.
- Parameters:
initial_policy – None or an iterable to be used as the initial policy parameter of the underlying Bernoulli distribution
- Returns:
an instance of BinaryReinforceOptimizerResults.
- Raises:
TypeError – if the initial policy has the wrong size or any of its values fall outside the interval [0, 1]
Note
Steps of the algorithm are as follows:
Start with the initial state
Calculate the stochastic gradient with/without baseline
Optionally, calculate Hessian (approximation)
Take a step in the direction of:
-stepsize * gradient for minimization (maximize is False)
+stepsize * gradient for maximization (maximize is True)
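Schematically, one iteration of the update outlined above might look like the following sketch (the projection to [0, 1] is an illustrative choice, not necessarily the scheme used internally; see scale_gradient() below):

    import numpy as np

    def reinforce_step(theta, gradient, stepsize, maximize=True):
        """One policy-parameter update: ascend for maximization, descend otherwise."""
        direction = gradient if maximize else -gradient
        theta_new = theta + stepsize * direction
        # Keep the Bernoulli parameters valid probabilities.
        return np.clip(theta_new, 0.0, 1.0)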
- multi_solve(initial_policy=None, num_solves=10)[source]#
A wrapper around solve() which re-solves the optimization problem multiple times, as dictated by num_solves, and returns a list of dictionaries holding results for each time the problem is solved.
Note
This function only iterates over solve() and resets the objective value tracker in the underlying configurations dictionary so that the collected results reflect the analysis for each run properly. This enables empirical verification of the robustness of the algorithm.
- Parameters:
initial_policy – the initial policy parameter of the underlying Bernoulli distribution. This is the start solution of each of the num_solves. If None, the one in the configurations is used.
num_solves (int) – number of times the optimization problem is solved
- Returns:
A list with each entry being a dictionary holding results returned by solve().
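For example, robustness across repeated runs might be assessed roughly as follows (a usage sketch, assuming each entry exposes the result fields listed earlier):

    all_results = optimizer.multi_solve(initial_policy=None, num_solves=5)
    for i, res in enumerate(all_results):
        # Each entry is a dictionary of results returned by solve().
        print(f"run {i}: design {res['optimal_design']}, "
              f"objective {res['optimal_design_objval']}")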
- plot_results(results, overwrite=False, bruteforce=False, fun_symbol='U', num_active=None, output_dir=None, keep_plots=False, fontsize=22, line_width=3, usetex=True, show_axis_grids=True, axis_grids_alpha=(0.25, 0.4), plots_format='pdf', **kwargs)[source]#
Given the results returned by solve(), visualize the optimization results. This function creates the following plots:
Values of the optimal policy
The sampled designs from the optimal policy
Estimate of the stochastic objective value \(E[J(d)]\) over optimization iterations, where \(J\) is the deterministic objective to be optimized, \(d\) is the binary design variable, and the expectation is taken over a multivariate Bernoulli distribution whose parameters (success probabilities) are optimized by solve(). If relevant entries are found in the results object, a random sample with uniformly sampled realizations of \(d\) is generated, and the value of the objective for this sample is displayed alongside the optimization trajectory to show the effectiveness of the optimization.
The number of new function evaluations carried out at each iteration of the optimization procedure. This is created only if monitor_iterations is set to True in the configurations dictionary.
Bruteforce results plot if bruteforce is set to True. Note that this is exponentially expensive in the problem dimension; use it only for small problems.
Note
Additional kwargs are accepted to provide flexibility in calling this method.
- Parameters:
results (dict | BinaryReinforceOptimizerResults) – results returned by solve() (or a dictionary created from it, e.g., by calling BinaryReinforceOptimizerResults.asdict())
bruteforce (bool) – create bruteforce plot(s)
fun_symbol (str) – symbol to use for labeling the objective function.
output_dir – location to save plots to; if None the current working directory is used
keep_plots (bool) – if True, the created figures will not be closed. This can be useful, e.g., for displaying results in notebooks
fontsize (int) – font size for plots
usetex (bool) – use LaTeX for displaying equations on plots (requires LaTeX support)
plots_format (str) – plots format/extension
- Returns:
a dictionary with two entries/keys:
figures: a dictionary indexed by plot identifier with the value set to the corresponding figure handle (or None if keep_plots is set to False)
stats: a dictionary holding statistics including number of function evaluations, and percentage of space explored.
- Raises:
TypeError – if the results dictionary is invalid, e.g., missing any of the entries 'optimization_trajectory_best_sample_objval', 'best_along_the_route', 'uniform_random_sample_objval'
- classmethod scale_gradient(x, grad, lb=0, ub=1)[source]#
Scale the gradient so that, when added to the current point x, the result stays within the bounds lb=0, ub=1.
Note
This is a projector applied to scale down a gradient grad if (when added to x) it exceeds the given bounds (lb, ub). Unlike pyoed.utility.truncate(), which trims the gradient vector at the bounds in all directions, this function scales the gradient by multiplying it with a scaling factor evaluated based on the surpluses in all directions. If nothing falls outside the domain, the scaling factor is 1. Otherwise, it is set to the number that brings the largest excess (beyond lb or ub) back to the corresponding extremum (i.e., lb or ub, respectively). This makes sure the gradient update (x+grad) lies within the bounds (lb, ub) and makes it much easier to tune the learning rate (step size) of the optimizer.
- Parameters:
x – the current state vector (e.g., target of optimization problem)
grad – the gradient of the optimization objective (same size as x)
lb – lower bound of the domain. This could be a scalar or 1d array of the same size as x (and grad)
ub – upper bound of the domain. This could be a scalar or 1d array of the same size as x (and grad)
- Returns:
a scaled version of the gradient (a copy is created if the scaling factor is not 1.)
- Raises:
ValueError – if the lower bound(s) are not strictly less than the upper bound(s).
TypeError – if x and grad (and lb, ub if arrays) are not of the same size.
TypeError – if lb and ub are not similar in type/shape. That is, an exception is raised if, for example, lb is a number and ub is an array. Mixing could be handled, but for cleanliness we do not allow it.
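A rough re-implementation of the scaling rule described in the note (an illustrative sketch with scalar bounds, not the class method):

    import numpy as np

    def scale_gradient_sketch(x, grad, lb=0.0, ub=1.0):
        """Scale grad so that x + grad stays within [lb, ub]."""
        x = np.asarray(x, dtype=float)
        grad = np.asarray(grad, dtype=float)
        proposed = x + grad
        factor = 1.0
        over = proposed > ub
        if np.any(over):
            # Shrink so the largest excess above ub lands exactly on ub.
            factor = min(factor, float(np.min((ub - x[over]) / grad[over])))
        under = proposed < lb
        if np.any(under):
            # Shrink so the largest excess below lb lands exactly on lb.
            factor = min(factor, float(np.min((lb - x[under]) / grad[under])))
        return factor * grad if factor < 1.0 else grad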
- property size#
Dimension/size of the optimization variable.
- property objective_value_tracker#
Reference to the objective value tracker dictionary for easy access
- property Bernoulli_random_sampler#
The underlying Bernoulli random sampler.
- class BinarySAAOptimizer(*args, **kwargs)[source]#
Bases: Optimizer, RandomNumberGenerationMixin
Stochastic Average Approximation (SAA) algorithm for binary optimization