Constrained Probabilistic Binary Optimization#

ConstrainedBinaryReinforceOptimizer([configs])

ConstrainedBinaryReinforceOptimizerConfigs(*)

Configuration dataclass for the ConstrainedBinaryReinforceOptimizer.

ConstrainedBinaryReinforceOptimizerResults(*)

Container for results/data generated by the ConstrainedBinaryReinforceOptimizer.

A module providing access to multiple Binary optimization routines with Hard Budget Constraints.

class ConstrainedBinaryReinforceOptimizerConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, name='ConstrainedBinaryReinforceOptimizer: Binary Optimization via Stochastic Learning with Budget Constraints', screen_output_iter=1, file_output_iter=100, fun=None, initial_policy=None, maximize=True, def_stepsize=0.1, decay_step=False, Newton_step=False, stochastic_gradient_sample_size=32, baseline='optimal', optimal_baseline_batch_size=32, optimal_sample_size=5, objective_value_tracker=None, maxiter=10000, pgtol=1e-12, monitor_iterations=True, random_seed=None, antithetic=True, uniform_random_sample_size=0, num_resets=0, budgets=None, budget_enforcement_approach='sum-as-variable', bad_objective_value=0.0, max_num_of_rejections=100000)[source]#

Bases: BinaryReinforceOptimizerConfigs

Configuration dataclass for the ConstrainedBinaryReinforceOptimizer.

Note

This inherits all configurations provided by BinaryReinforceOptimizerConfigs, and adds the parameters/configurations provided below.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionlity in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • name (str) – name of the optimizer. Default is None.

  • screen_output_iter (int) – iteration interval for screen output. Default is 1. Note that this should be a positive integer to enforce proper effect.

  • file_output_iter (int) – iteration interval for file output. Default is 1. Note that this should be a positive integer to enforce proper effect.

  • size (int | None) – dimension of the target/solution variable. Must be a positive integer

  • fun (Callable | None) – the objective function to be optimization. For a given binary state/design of size size it returns a scalar.

  • initial_policy (ndarray | Iterable | None) – the initial probabilistic policy parameter (activation/success probability of the Bernoulli distribution)

  • maximize (bool) – if True the objective is to be maximized, otherwise minimization is assumed

  • def_stepsize (float) – default step size (learning rate) in the direction (or the opposite) of the gradient. Default is 1.

  • decay_step (bool) – if True reduce the learning rate (step size) as the algorithm proceeds

  • Newton_step (bool) – if True use second-order derivative to define the update direction

  • stochastic_gradient_sample_size (int) – size of the sample to generate (from current policy at each iteration) to estimate (using MC) the policy gradient

  • baseline (str | None) –

    name/type of the baseline used to reduce variability of the stochastic estimate of the gradient to use (if any); possible values are None, ‘none’, ‘optimal’, ‘heuristic’. These are explained as follows:

    • ’none’ | None: no baseline is used.

    • ’heuristic’: evaluate an empirical baseline \((J(0)+J(1))/2\) with \(J\) being the passed fun.

    • ’optimal’: a constant baseline is found by approximating the optimal baseline where optimality is defined as minimum variability of the estimator.

  • optimal_baseline_batch_size (int) – size of the batch/sample to use for optimal baseline estimation; used if baseline==True.

  • optimal_sample_size (int) – size of the sample to return from the final/optimal policy upon termination of the optimizer

  • objective_value_tracker (dict | None) –

    a dictionary that holds values of the objective function for sampled binary vectors. This can be helpful if the optimizer was run previously and objective vlaues have been saved. This is used as the initializer of the tracker used while solving the optimization problem.

    Note

    The objective_value_tracker in the configurations object is not updated as the solver proceeds. It is only used as the initializer of the internal tracker. The results returned from the optimizer contain another copy of the objective_value_tracker with updated sampled binary vectors and corresponding objectives.

  • uniform_random_sample_size (int) – size of a random sample generated uniformly (0.5 probability) for performance evaluation.

  • num_resets (None | int) – number of resets during optimization

  • budgets (None | Iterable[int]) – None or an iterable (of ints) with allowed/feasible budgets. Any budget must be between 0, and the size of the binary variable (inclusive). If None, no budget-constraint is asserted; this is equivalent to setting budget to include all budgets between 0, and the size of the binary variable (inclusive).

  • budget_enforcement_approach (str) –

    name of the approach to be followed for asserting budget constraint(s); the following are allowed:

    1. ’sum-as-variable’: Insert the sum of Bernoulli variables (sum of entries) of binary states/designs into the formulation by multiplying each term in the expectation by the probability associated with the sum. The sum in this case is a Binomial-Poisson random variable. This is the DEFAULT and is the trusted approach.

    2. ’sampling-by-rejection’: sample the whole space but reject any infeasible point (similar to rejection sampling).

    3. ’sampling-by-random-selection’: randomly select entries in the vector to probabilistically activate based on current policy. The rest are zeroed out. This is experimental, and will be updated

    4. ’modify-objective’: set objective value of sampled designs to zero if they violate constraints.

    5. ’zero-probability’: set probability of sampled designs to zero if they violate constraints; this means they are discarded/rejected if they are captured and are not used for stochastic gradient evaluation.

  • bad_objective_value (float) – a scalar to be used as a bad objective value, e.g., 0 for maximization for non-negative functions. This value is used for infeasible points if budget_enforcement_approach is set to ‘modify-objective’ otherwise it is discarded.

  • max_num_of_rejections (int) – the maximum number of rejected samples (when rejection sampling is employed, i.e., when budget_enforcement_approach is set to ‘sample-by-rejection’) before the sampler (consequently the optimizer) throws an error in the solution process (at any point where sampling is required, e.g., for estimating the gradient following a stochastic approach).

budgets: None | Iterable[int]#
budget_enforcement_approach: str#
bad_objective_value: float#
max_num_of_rejections: int#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', size=None, name='ConstrainedBinaryReinforceOptimizer: Binary Optimization via Stochastic Learning with Budget Constraints', screen_output_iter=1, file_output_iter=100, fun=None, initial_policy=None, maximize=True, def_stepsize=0.1, decay_step=False, Newton_step=False, stochastic_gradient_sample_size=32, baseline='optimal', optimal_baseline_batch_size=32, optimal_sample_size=5, objective_value_tracker=None, maxiter=10000, pgtol=1e-12, monitor_iterations=True, random_seed=None, antithetic=True, uniform_random_sample_size=0, num_resets=0, budgets=None, budget_enforcement_approach='sum-as-variable', bad_objective_value=0.0, max_num_of_rejections=100000)#
class ConstrainedBinaryReinforceOptimizerResults(*, optimal_policy=<factory>, optimal_policy_sample=<factory>, optimal_policy_sample_objval=<factory>, optimal_design=<factory>, optimal_design_objval=None, optimization_trajectory=<factory>, optimization_trajectory_stoch_objval=<factory>, optimization_trajectory_best_sample=<factory>, optimization_trajectory_best_sample_objval=<factory>, best_along_the_route=<factory>, converged=False, maxiter_reached=False, message=<property object>, objective_value_tracker=<factory>, sampled_design_indexes=<factory>, num_resets=None, reset_at_iterations=<factory>, uniform_random_sample_indexes=None, uniform_random_sample_objval=None, configurations=None, budgets=None)[source]#

Bases: BinaryReinforceOptimizerResults

Container for results/data generated by the ConstrainedBinaryReinforceOptimizer. This inherits all keys/attributes in BinaryReinforceOptimizer in addition to the following parameters.

Parameters:
  • optimal_policy (ndarray) – the parameters of the optimal policy (probability of activation of each entry of the design)

  • optimal_policy_sample (ndarray) – design sample generated by sampling the optimal policy (Bernoulli samples with probababilities set to the optimal policy parameters optimal_policy_parameters)

  • optimal_policy_sample_objval (ndarray) – value of the objective function \(J\) evaluated at each design in optimal_sample

  • optimal_design (ndarray) – the optimal design (binary vector) sampled from the optimal_policy.

  • optimal_design_objval (float | None) – the value of the objective function evaluated at the above optimal design.

  • optimization_trajectory (ndarray) – trajectory (consecutive parameters) of the policy parameters generated along the optimization iterations; populated only if monitor_iterations configuration is set to True.

  • optimization_trajectory_stoch_objval (ndarray) – value (MC estimate) of the stochastic objective function, that is \(E[J]\) along the optimization trajectory; populated only if monitor_iterations configuration is set to True.

  • optimization_trajectory_best_sample (ndarray) – the best sample point at each iteration. This is the best sampled design at each iteration. For maximization, it is the sampled design with maximum objective value (among the sample)

  • optimization_trajectory_best_sample_objval (ndarray) – the objective value corresponding to each design in optimization_trajectory_best_sample.

  • converged (bool) – True if the algorithm converged, False otherwise.

  • msg – a string message describing the final status of the algorithm

  • objective_value_tracker (dict | None) – a dictionary containing the value of the objective function evaluated through the course of the algorithm. The dictionary is indexed by the integer indexes of the generated binary states.

  • sampled_design_indexes (list[list[int]]) – a list (of lists) containing all sampled design indexes for each iteration of the algorithm. You can generate the binary designs themselves by calling pyoed.utility.math.index_to_binary_state() on the indices.

  • best_along_the_route (dict) –

    a dictionary with two entries/keys (str):

    • ’objval’: this key/value entry holds the best objective value seen during the exploration (before collecting the sample from the optimal policy).

    • ’designs’: a list of binary designs corresponding to the objective value ‘objval’.

  • num_resets (int | None) – number of resets during optimization (if set in the configurations). if given it must be be a positive integer indicating the number of times the optimizer is reset to initial_policy after convergence. If either 0 or None is given, num_resets is set to ‘0’ thus not resets are carried out

  • reset_at_iterations (list) – a list holding the iterations after which solver was reset (if valid value of num_resets is set in the configurations).

  • configurations (None | dict) – a dictionary holding problem configurations. This can be obtained by calling asdict() of the configurations object used by the optimizer, e.g., BinaryReinforceOptimizerConfigs.

  • budgets (None | Iterable[int]) – Iterable (of ints) with allowed/feasible budgets. Any budget must be between 0, and the size of the binary variable (inclusive). If None, no budget-constraint is asserted; this is equivalent to setting budget to include all budgets between 0, and the size of the binary variable (inclusive).

budgets: None | Iterable[int]#
__init__(*, optimal_policy=<factory>, optimal_policy_sample=<factory>, optimal_policy_sample_objval=<factory>, optimal_design=<factory>, optimal_design_objval=None, optimization_trajectory=<factory>, optimization_trajectory_stoch_objval=<factory>, optimization_trajectory_best_sample=<factory>, optimization_trajectory_best_sample_objval=<factory>, best_along_the_route=<factory>, converged=False, maxiter_reached=False, message=<property object>, objective_value_tracker=<factory>, sampled_design_indexes=<factory>, num_resets=None, reset_at_iterations=<factory>, uniform_random_sample_indexes=None, uniform_random_sample_objval=None, configurations=None, budgets=None)#
class ConstrainedBinaryReinforceOptimizer(configs=None)[source]#

Bases: BinaryReinforceOptimizer

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Check the passed configuratios and make sure they are conformable with each other, and with current configurations once combined. This guarantees that any key-value pair passed in configs can be properly used

Note

Here only the locally-defined configurations in ConstrainedBinaryReinforceOptimizerConfigs are validated. Finally, super classes validators are called.

Parameters:
  • configs (dict | ConstrainedBinaryReinforceOptimizerConfigs) – full or partial (subset) configurations to be validated

  • raise_for_invalid (bool) – if True raise TypeError for invalid configrations type/key. Default True

Returns:

flag indicating whether passed configurations dictionary is valid or not

Raises:
Return type:

bool

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, and lookup each in the configurations, and update as nessesary/possible/valid

Raises:

PyOEDConfigsValidationError – if invalid configurations passed

update_random_number_generators(random_seed)[source]#

Update the underlying random number generator(s) given the passed random seed This updates two random number generators; the one here and the generator associated with the underlying Bernoulli distribution

register_budgets(budgets)[source]#

Set the budget (the sum of the Bernoulli random variable to condition on) The budget could be a number (integer) or set of numbers. The probability of each budget/size is recalculated. In the former case, the distribution is identical to the parent class. In the latter, the probability is calculated by conditioning on the union of all budgets.

Parameters:

budgets (int|iterable(int)) – either an integer or an iterable e.g., list of integers, defining acceptable budgets (sum of the Bernoulli random variable).

Raises:

TypeError – if the type of budgets is not acceptable.

check_registered_budgets()[source]#

Check/validate registerd budgets and their probabilities.

Returns:

the registerd budgets/sizes and the corresponding probabilities.

Raises:

TypeError if no valid budget is registered

reset_budget_constraints()[source]#

Remove any budget constraints. To add/register constraints call register_budget_constraint().

get_feasible_budgets()[source]#

Retrieve a list of all feasible budgets based on the registered constraints.

is_feasible(design)[source]#

Test if the passed design satisfies feasibility (initially budget) constraints

Parameters:

design

Returns:

bool; True if the design satisfies all constraints

generate_sample(theta, sample_size=1, enforce_constraint=True, dtype=<class 'bool'>)[source]#

Generate sample of size sample_size for a given policy parameter (theta) while accounting for budget constraints based on the approach specified in the configurations in budget_enforcement_approach

Parameters:
  • theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)

  • sample_size (int) – number of desings/states to sample from the Bernoulli distribution with with the passed parameter theta

  • enforce_constraint (bool) – if True, check the budget_enforcement_approach in the configurations, and modify the PMF value if the selected approach implies modifying the PMF`.

  • dtype (type) – data type to set for entries of the returned sample array

Returns:

  • sample: 2d array of shape (sample_size x dim) where dim is the size of theta

Note:

the probabilities or the objective values are modified based on the constraints enforcement method if specified in the configurations

Raises:

ValueError if the configured budget enforcement method is not recognized/acceptable

objective_function_value(design, enforce_constraint=True)[source]#

A function to prevent redundant computations of the objective (utility function) by looking up tracker before evaluation The PMF is potentially modified if enforce_constraint is True, however, the tracker keeps the unmodified objective value.

Parameters:

design – binary state/design variable to evaluate objective value at

Returns:

value of the objective/utility function at the passed design

Raises:

TypeError if the passed design is not of the right size

evaluate_pmf(design, theta, enforce_constraint=True)[source]#

Evaluate the value of the probability mass function (PMF) for a given sampled design. The PMF is potentially modified if enforce_constraint is True.

Parameters:
  • design – binary state/design variable to evaluate objective value at

  • theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)

  • enforce_constraint (bool) – if True, check the budget_enforcement_approach in the configurations, and modify the PMF value if the selected approach implies modifying the PMF.

evaluate_batch_grad_log_pmf(design, theta, batch_as_column=False, zero_bounds=True)[source]#
evaluate_grad_log_pmf(design, theta, enforce_constraint=True)[source]#

Evaluate the gradient of the log of PMF value for a given sampled design. The PMF is potentially modified if enforce_constraint is True, and the gradient is calculated accordingly. Specifically, if the PMF value for a given entry is set to 0, the gradient of the log-pmf is set to 0.

Parameters:
  • design – binary state/design variable to evaluate objective value at

  • theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)

  • enforce_constraint (bool) – if True, check the budget_enforcement_approach in the configurations, and modify the PMF value if the selected approach implies modifying the PMF.

Returns:

the gradient of the log probability associated with the passed design

evaluate_grad_pmf(design, theta, enforce_constraint=True)[source]#

Evaluate the gradient of the PMF value for a given sampled design. The PMF is potentially modified if enforce_constraint is True, and the gradient is calculated accordingly. Specifically, if the PMF value for a given entry is set to 0, the gradient of the log-pmf is set to 0.

Parameters:
  • design – binary state/design variable to evaluate objective value at

  • theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)

  • enforce_constraint (bool) – if True, check the budget_enforcement_approach in the configurations, and modify the PMF value if the selected approach implies modifying the PMF.

Returns:

the gradient of the log probability associated with the passed design

optimal_baseline_estimate(theta, fun=None, sample=None, sample_objvals=None, sample_grad_log_pmf=None, batch_size=32, fail_safe_to_zero=True)[source]#

Use the passed sample to calculate a sample based estimate of the optimal baseline value.

Parameters:
  • theta (numpy.ndarray|iterable) – the policy parameters (success probability of the underlying Bernoulli distribution)

  • fun (callable) – the utility/objective function to calculate the stochastic objective. If None, the fun from the configurations dictionary is used. This argument is provided to enable altering the objective while calculating the derivatives (e.g., by adding a baseline) to reduce variance of the estimator(s).

  • sample – Given sample

  • sample_objvals – The objective value (fun(x) where fun is the registered function to be optimized) for each x in the passed sample.

  • sample_grad_log_pmf – The gradient of the logarithm of the probability mass function (grad-log-PMF) of the underlying bernoulli (conditional or not) model. function to be optimized) for each x in the passed sample.

  • batch_size (int) – number of samples to use for the MC approximation. Default is 32.

  • fail_safe_to_zero (bool) – if the baseline calculator failed (found nan/inf vlue) reset the baseline to zero and raise a warning, otherwise raise a ValueError.

Returns:

optimal baseline estimate

Return type:

float

solve(initial_policy=None)[source]#

Start solving the constrained binary optimization problem using the probabilistic/ stochastic formulatioon. Based on the budget enforcement approach in the configurations, the binary random variable is modeled either by a Bernoulli distribution (similar to :py:class`BinaryReinforceOptimizerResults`) or by a conditional Bernoulli model which restricts the probability support (region of the space with non-zero probability) to the feasible reguion (values of the variable satisfying the budget constraints).

Parameters:

initial_policyNone or ``iterable` to be used as the initial policy parameter of the underlying Bernoulli distribution

Returns:

an instance of ConstrainedBinaryReinforceOptimizerResults.

Raises:

TypeError if the initial policy has wrong size or any of the values fall outside the interval [0, 1]

Note

The conditional bernoulli model is only used if the budget enforcement approach in the configurations self.configurations is set to ‘sum-as-random-variable’.

Note

Steps of the algorithm are as follows:

  1. Start with the initial state

  2. Calculate the stochastic gradient with/without baseline

  3. Optionally, calculate Hessian (approximation)

  4. Take a step in the direction of:

  • -stepsize gradient ; for minimization (maximize is False)

  • +stepsize` gradient ; for maximization (maximize is True)

plot_results(results, overwrite=False, bruteforce=False, fun_symbol='U', output_dir=None, keep_plots=False, fontsize=20, line_width=2, usetex=True, show_axis_grids=True, axis_grids_alpha=(0.25, 0.4), plots_format='pdf', **kwargs)[source]#

Given the results dictionary returned by solve(), visualize optimization results.

Note

This is a wrapper around BinaryReinforceOptimizer.plot_results() which only updates the returned statistics based on the size of the probability space which depends on the budget enforcement approach employed.

Returns:

a dictionary with two entries/keys:

  • figures a dictionary indexed by plot identifier with value set to the corresponding figure handle (or None if keep_plots is set to False)

  • stats: a dictionary holding statistics including number of function evaluations, and percentage of space explored.

Raises:

TypeError if the results dictionary is invalid

bruteforce_objective_values(fun=None, num_active=None, objective_value_tracker=None)[source]#

Evaluate the value of the objective (obj) at all possible binary combinations;

Note

This is a wrapper around BinaryReinforceOptimizer.bruteforce_objective_values() which only specifies the number of active entries (budgets) argument num_active to the registered budgets in self.configurations.budgets.

Note

The argument num_active here is added for compatibility with super’s method bruteforce_objective_values. Thus, num_active should not be passed as it is set to the registered budgets. It will be used, however, if explicitly passed.

Returns:

a dictionary of objective values evaluated for each possible binary instance. The dictionary indexed by unique integer indexes.

Return type:

dict[int,float]

..note::

This function should be used ONLY for testing in small-dimensional problems.

property budgets#

Return the registerd budtet sizes iterable (range/list/etc.)

property Bernoulli_random_sampler#

The underlying Bernoulli Random Sampler Since we sampler can be standard Bernoulli, or a Generalized Conditional Bernoulli, we return teh proper sampler based on the budget enforcement approach.

property budget_enforcement_approach#

The approach used to enforce budget constraints