Probability Distributions/Models#


Note

All distribution classes inherit from the distribution base class (Distribution), and each distribution class/object is associated with a configurations class derived from the distribution configurations base class (DistributionConfigs).

Multivariate Bernoulli Model#

class BernoulliConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Mulitvariate Bernoulli Distribution (iid)', random_seed=None, parameter=0.5)[source]#

Bases: DistributionConfigs

Configurations class for the Bernoulli abstract base class. This class inherits functionality from DistributionConfigs and only adds new class-level variables which can be updated as needed.

See DistributionConfigs for more details on the functionality of this class along with a few additional fields. Otherwise Bernoulli provides the following fields:

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • name (str) – name of the distribution

  • parameter (float | Iterable[float]) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution.

parameter: float | Iterable[float]#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Mulitvariate Bernoulli Distribution (iid)', random_seed=None, parameter=0.5)#
class Bernoulli(configs=None)[source]#

Bases: Distribution

An implementation of the multivariate Bernoulli Distribution with independent components (no covariances).

Parameters:

configs (dict | BernoulliConfigs | None) – (optional) configurations for the model

__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | BernoulliConfigs) – configurations to validate. If a BernoulliConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

update_parameter(p)[source]#

Update the success probability and any dependent values/configurations

sample(sample_size=1, antithetic=False, dtype=<class 'bool'>)[source]#

Sample a Bernoulli random variable (1d or multivariate) according to the probability of success p, that is, \(p := P(x=1)\). If antithetic is True, sample_size must be even.

Parameters:
  • sample_size (int) – size of the sample to generate

  • antithetic (bool)

  • dtype (type) – data type of the returned array

Returns:

bernoulli_sample: array of shape sample_size x n where n is the size of p

Raises:

TypeError if the sample_size is invalid

Note

If p is scalar or iterable of length 1, this will be 1d array of size=sample_size. Otherwise, if p is multivariate, this will be 2d array with each row representing one sample.
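As a rough, standalone illustration of the sampling semantics described above (not PyOED's implementation), the sketch below draws independent Bernoulli entries using only the standard library. The helper name sample_bernoulli and the antithetic rule (pairing each uniform draw u with 1 - u) are assumptions made for illustration.

```python
import random


def sample_bernoulli(p, sample_size=1, antithetic=False, seed=None):
    """Draw samples with independent Bernoulli entries (illustrative only)."""
    if antithetic and sample_size % 2 != 0:
        raise ValueError("sample_size must be even when antithetic is True")
    rng = random.Random(seed)
    n_draws = sample_size // 2 if antithetic else sample_size
    samples = []
    for _ in range(n_draws):
        u = [rng.random() for _ in p]
        # Entry i is a success when its uniform draw falls below p[i].
        samples.append([ui < pi for ui, pi in zip(u, p)])
        if antithetic:
            # Assumed antithetic rule: reuse the mirrored uniforms 1 - u.
            samples.append([(1.0 - ui) < pi for ui, pi in zip(u, p)])
    return samples


# Four samples of a 3-dimensional variable; each row is one sample.
draws = sample_bernoulli([0.2, 0.5, 0.9], sample_size=4, antithetic=True, seed=0)
```

Each row of draws is one realization, matching the row-per-sample layout described in the note above.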

uniform_random_sample(sample_size=1, antithetic=False, dtype=<class 'bool'>)[source]#

Generate sample(s) from the created distribution with the parameter set up to yield equal probabilities for all points in the distribution support. This is useful for generating uninformative (completely blind) random samples from a given distribution. For example, a Bernoulli distribution can provide samples where the probability of success is set to 0.5.

expect(func, objective_value_tracker=None)[source]#

Calculate the expected value of a function (func) which accepts w as parameter

pmf(x, joint=True, batch_as_column=False)[source]#

Calculate the value of the probability mass function (PMF) of an uncorrelated multivariate Bernoulli distribution, evaluated at a given binary state or a batch of states (random variable realizations) x, and with parameters defined by the underlying probability of success.

Parameters:
  • x – scalar or 1D numpy array of binary values (0/1) or bytes

  • joint (bool) – if True, the joint PMF value is returned; otherwise, the marginal PMF for all entries is returned.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

PMF value (or batch of PMF values) of the Bernoulli model

Raises:

TypeError if the passed x has wrong shape/size
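The difference between the joint and marginal PMF can be sketched with plain Python. This is a minimal illustration of the formula for independent entries, not the library's code; the function name bernoulli_pmf is hypothetical.

```python
import math


def bernoulli_pmf(x, p, joint=True):
    """PMF of independent Bernoulli entries evaluated at the binary state x."""
    # Marginal probability of each entry: p_i if x_i == 1, else 1 - p_i.
    marginals = [pi if xi else 1.0 - pi for xi, pi in zip(x, p)]
    # Independence means the joint PMF is the product of the marginals.
    return math.prod(marginals) if joint else marginals


p = [0.2, 0.5, 0.9]
x = [1, 0, 1]
joint_value = bernoulli_pmf(x, p)                   # 0.2 * 0.5 * 0.9
marginal_values = bernoulli_pmf(x, p, joint=False)  # one value per entry
```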

log_pmf(x, joint=True, batch_as_column=False)[source]#

Calculate the value of the log-PMF (logarithm of the probability mass function) of an uncorrelated multivariate Bernoulli distribution.

Parameters:
  • x – scalar or 1D numpy array of binary values (0/1) or bytes

  • joint (bool) – if True, the joint log-PMF value is returned; otherwise, the marginal log-PMF for all entries is returned.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

log of pmf (or batch of log-pmf values) of the Bernoulli model

Raises:

TypeError if the passed x has wrong shape/size

grad_pmf(x, joint=True, batch_as_column=False)[source]#

Calculate the gradient of the probability mass function (PMF) of an uncorrelated multivariate Bernoulli distribution, evaluated at a given binary state or a batch of states (random variable realizations) x, and with parameters defined by the underlying probability of success.

Parameters:
  • x – scalar or 1D numpy array of binary values (0/1) or bytes

  • joint (bool) – if True, the gradient of the joint PMF value is returned; otherwise, the gradient of the marginal PMFs for all entries is returned.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

gradient of the PMF (or batch of gradients) of the Bernoulli model

Raises:

TypeError if the passed x has wrong shape/size

grad_log_pmf(x, joint=True, zero_bounds=True, batch_as_column=False)[source]#

Calculate the gradient of the log-PMF (logarithm of the probability mass function) with respect to the distribution parameters (success probability).

Parameters:
  • x – scalar or 1D numpy array of binary values (0/1) or bytes

  • joint (bool) – if True, the joint PMF is considered; otherwise, the marginal PMF is used.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

gradient of the log-PMF (or batch of gradients) of the Bernoulli model

Raises:

TypeError if the passed x has wrong shape/size

Note

Because the modeled Bernoulli random variables are assumed uncorrelated, the gradient of the joint log-probability is made up of the partial derivatives of the log-probability of each entry; thus, the result is the same whether joint is True or False.
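The entrywise structure of this gradient can be illustrated with a short sketch. For independent entries, the partial derivative of the log-PMF with respect to p_i is 1/p_i when x_i = 1 and -1/(1 - p_i) when x_i = 0. The function name is hypothetical, and the handling of zero_bounds (zeroing entries whose probability is exactly 0 or 1) is an assumption about its meaning rather than documented behavior.

```python
def bernoulli_grad_log_pmf(x, p, zero_bounds=True):
    """Gradient of log P(x) with respect to the success probabilities p."""
    grad = []
    for xi, pi in zip(x, p):
        if pi in (0.0, 1.0):
            # The derivative is undefined at the bounds of [0, 1].
            if not zero_bounds:
                raise ValueError("gradient is undefined for p equal to 0 or 1")
            grad.append(0.0)  # assumed meaning of zero_bounds=True
        elif xi:
            grad.append(1.0 / pi)           # d/dp log(p)
        else:
            grad.append(-1.0 / (1.0 - pi))  # d/dp log(1 - p)
    return grad


grad = bernoulli_grad_log_pmf([1, 0, 1], [0.2, 0.5, 1.0])
```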

index_to_binary_state(k, dtype=<class 'bool'>)[source]#

Return the binary state \((v_1, v_2, \dots)\) with index k, where the dimension of the state equals the size of this distribution.

Note

This is actually a wrapper around the utility function pyoed.utility.math.index_to_binary_state which is added here only for convenience.

index_from_binary_state(state)[source]#

Inverse of index_to_binary_state(). Return the index k corresponding to the passed state (of dimension=size).

Note

This is actually a wrapper around the utility function pyoed.utility.math.index_from_binary_state which is added here only for convenience.
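The mapping between an index and a binary state can be sketched as follows. This is only an illustration of the idea: the bit ordering (least significant bit first) and the zero-based index are assumptions and may differ from the convention used by pyoed.utility.math. The function names are hypothetical.

```python
def index_to_state(k, size):
    """Binary state of length `size` whose (assumed) zero-based index is k."""
    if not 0 <= k < 2 ** size:
        raise ValueError("index out of range for the given size")
    # Assumed convention: entry i of the state is bit i of k.
    return [bool((k >> i) & 1) for i in range(size)]


def index_from_state(state):
    """Inverse mapping: recover the index from a binary state."""
    return sum(1 << i for i, bit in enumerate(state) if bit)


state = index_to_state(5, size=3)   # 5 is 101 in binary
index = index_from_state(state)
```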

property parameter#

Return the underlying probability of success

property success_probability#

Return the underlying probability of success

property size#

Return the dimension (size) of the underlying probability space

property parameter_size#

Return the dimension (size) of the underlying probability space

Poisson Binomial Distribution#

class PoissonBinomialConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Poisson Binomial Distribution', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation')[source]#

Bases: BernoulliConfigs

Configurations class for the PoissonBinomial abstract base class. This class inherits functionality from PyOEDConfigs and only adds new class-level variables which can be updated as needed.

See PyOEDConfigs for more details on the functionality of this class along with a few additional fields. Otherwise PoissonBinomial provides the following fields:

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • parameter (float | Iterable[float]) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution.

  • name (str) – name of the distribution

  • R_function_evaluation_method (str) – the name of the evaluation method of the R-function. See RFunctionConfigs for supported evaluation methods.

R_function_evaluation_method: str#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Poisson Binomial Distribution', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation')#
class PoissonBinomial(configs=None)[source]#

Bases: Bernoulli

An implementation of the Poisson-binomial distribution which models the sum of independent (non-identical) Bernoulli trials. This version uses Discrete Fourier Transform to calculate probabilities and derivatives following Method 1 in [1] which evaluates \(R(n, S)\) as a series. For details see [2].

Parameters:

configs (dict | PoissonBinomialConfigs | None) – (optional) configurations for the model

References:

  1. Sean X. Chen, and Jun S. Liu. “Statistical applications of the Poisson-binomial and conditional Bernoulli distributions.” Statistica Sinica (1997): 875-892.

  2. Ahmed Attia. “Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement.” arXiv preprint arXiv:2406.05830 (2024).

__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | PoissonBinomialConfigs) – configurations to validate. If a PoissonBinomialConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

calculate_w(p, dtype=<class 'decimal.Decimal'>, log=False, undefined_as_nan=False)[source]#

Calculate Bernoulli weights w from success probabilities p. The weights are defined as:

\[w_i = \frac{p_i}{1-p_i}\]

Note

The weights cannot be evaluated for any value of p equal to 1.

Note

This is a wrapper around RFunction.calculate_w()

Parameters:
  • p (Iterable[float]) – a sequence of success probabilities.

  • dtype (type) – Data type (must be a callable to transform into the input to the desired data type)

  • log (bool) – return the logarithm of w if True

  • undefined_as_nan (bool) – if True, set the value of w to nan for any value of the probability outside the domain [0, 1)

Returns:

a sequence (of the same length as p) with weights of type decimal.Decimal

Raises:

ValueError – if any of the probabilities are not in the interval [0, 1) and undefined_as_nan is False

Return type:

Iterable[Decimal | float]
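The weight formula above can be sketched in a few lines. This is a simplified illustration, not RFunction.calculate_w() itself: the function name is hypothetical, the log option is omitted, and converting through str() before building a Decimal is a choice made here to avoid binary floating-point noise.

```python
from decimal import Decimal


def bernoulli_weights(p, dtype=Decimal, undefined_as_nan=False):
    """Weights w_i = p_i / (1 - p_i) for probabilities in [0, 1)."""
    weights = []
    for pi in p:
        if 0 <= pi < 1:
            pi = dtype(str(pi))  # e.g. Decimal("0.2") rather than Decimal(0.2)
            weights.append(pi / (1 - pi))
        elif undefined_as_nan:
            weights.append(dtype("nan"))
        else:
            raise ValueError(f"probability {pi} is outside [0, 1)")
    return weights


w = bernoulli_weights([0.2, 0.5, 0.0])
```

Passing dtype=float in this sketch returns ordinary floats instead of Decimal values.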

pmf(n)[source]#

Calculate the probability (probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

value of the PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer
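For intuition, the PMF of the sum can be tabulated with a simple dynamic-programming recursion that adds one trial at a time. Note that this is a different, more elementary technique than the DFT/R-function approach the class description refers to; it is shown only as a reference computation, and the function name is hypothetical.

```python
def tabulate_sum_pmf(p):
    """Return [P(sum = 0), ..., P(sum = len(p))] for independent Bernoulli trials."""
    pmf = [1.0]  # with no trials, the sum is 0 with probability 1
    for pi in p:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - pi)  # the new trial fails: sum stays at k
            nxt[k + 1] += mass * pi      # the new trial succeeds: sum becomes k + 1
        pmf = nxt
    return pmf


pmf = tabulate_sum_pmf([0.2, 0.5, 0.9])
```

For these three probabilities the table is approximately [0.04, 0.41, 0.46, 0.09], which sums to 1.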

log_pmf(n)[source]#

Calculate the log probability (log of the probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the logarithm of the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

logarithm of the value of the PMF of the Poisson-Binomial model/distribution.

Raises:

  • TypeError if n is not a non-negative integer

  • ValueError if the probability mass function value is zero at n.

grad_pmf(n)[source]#

Calculate the derivative/gradient of the probability (probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the gradient of the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Note

This function calculates the gradient of pmf() with respect to the distribution parameter, i.e., the success probabilities.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

gradient of the PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer
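The gradient of the sum PMF has a compact closed form: since P(sum = n) = p_i P(rest = n - 1) + (1 - p_i) P(rest = n), where "rest" is the sum with trial i left out, the partial derivative with respect to p_i is P(rest = n - 1) - P(rest = n). The sketch below evaluates this with a plain tabulation of the PMF; it is an illustration under that identity, not the library's DFT-based method, and the function names are hypothetical.

```python
def tabulate_sum_pmf(p):
    """PMF of the sum of independent Bernoulli trials, built one trial at a time."""
    pmf = [1.0]
    for pi in p:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - pi)
            nxt[k + 1] += mass * pi
        pmf = nxt
    return pmf


def grad_sum_pmf(p, n):
    """Gradient of P(sum = n) with respect to each success probability."""
    grad = []
    for i in range(len(p)):
        rest = tabulate_sum_pmf(p[:i] + p[i + 1:])  # PMF with trial i left out
        below = rest[n - 1] if n >= 1 else 0.0
        at = rest[n] if n < len(rest) else 0.0
        grad.append(below - at)
    return grad


p = [0.2, 0.5, 0.9]
grad = grad_sum_pmf(p, n=1)
```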

grad_log_pmf(n, zero_bounds=True)[source]#

Calculate the derivative/gradient of the log-probability (logarithm of the probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the gradient of the log-probability mass function (log-PMF) of a Poisson-Binomial distribution/model.

Note

This function calculates the gradient of log_pmf() with respect to the distribution parameter, i.e., the success probabilities.

Parameters:
  • n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

  • zero_bounds (bool) – if True, single out (set to zero) any entries with probability 0 or 1

Returns:

gradient of the log-PMF of the Poisson-Binomial model/distribution.

Raises:

  • TypeError if n is not a non-negative integer

  • ValueError if the probability mass function value is zero at n and zero_bounds is False.

sample(sample_size=1)[source]#

Sample a Poisson binomial random variable according to the PMF calculated from all possible values. This requires calculating PMF for all values of the sum (n).

Parameters:

sample_size (int) – size of the sample to generate

Returns:

a sample of n values (the sum of Bernoulli trials) calculated based on the success probabilities of the trials.

Raises:

ValueError if the sample size is not a positive integer.

uniform_random_sample(sample_size=1)[source]#

Generate sample(s) from the created distribution with the parameter set up to yield equal probabilities for all points in the distribution support. This is useful for generating uninformative (completely blind) random samples from a given distribution. For example, a Bernoulli distribution can provide samples where the probability of success is set to 0.5.

expect(func)[source]#

Calculate the expected value of a function (func) which accepts a scalar n (the Bernoulli sum) as its argument.

property R_function#

A handler to the underlying R-Function instance

Conditional Bernoulli Model#

class ConditionalBernoulliConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='ConditionalBernoulli: Conditional Bernoulli probability model/distribution', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation')[source]#

Bases: BernoulliConfigs

Configurations class for the ConditionalBernoulliConfigs abstract base class. This class inherits functionality from PyOEDConfigs and only adds new class-level variables which can be updated as needed.

See PyOEDConfigs for more details on the functionality of this class along with a few additional fields. Otherwise ConditionalBernoulliConfigs provides the following fields:

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • parameter (float | Iterable[float]) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution.

  • name (str) – name of the distribution

  • R_function_evaluation_method (str) – the name of the evaluation method of the R-function. See RFunctionConfigs for supported evaluation methods.

R_function_evaluation_method: str#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='ConditionalBernoulli: Conditional Bernoulli probability model/distribution', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation')#
class ConditionalBernoulli(configs=None)[source]#

Bases: Bernoulli

An implementation of the conditional Bernoulli model. This models a multivariate Bernoulli model conditioned on the sum (the number of active entries).

For details see [1] and [2].

Parameters:

configs (dict | ConditionalBernoulliConfigs | None) – (optional) configurations for the model

References:

  1. Sean X. Chen, and Jun S. Liu. “Statistical applications of the Poisson-binomial and conditional Bernoulli distributions.” Statistica Sinica (1997): 875-892.

  2. Ahmed Attia. “Probabilistic Approach to Black-Box Binary Optimization with Budget Constraints: Application to Sensor Placement.” arXiv preprint arXiv:2406.05830 (2024).

__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | ConditionalBernoulliConfigs) – configurations to validate. If a ConditionalBernoulliConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

coverage_probability(i, n)[source]#

Given success probability, calculate the inclusion probability (coverage probability) for index i, where the index starts at 0 and ranges to size-1 where size is the dimension of the probability space, that is the size of p.

inclusion_probability(i, n)#

Given success probability, calculate the inclusion probability (coverage probability) for index i, where the index starts at 0 and ranges to size-1 where size is the dimension of the probability space, that is the size of p.
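One way to compute an inclusion probability, shown here as a standalone sketch rather than the library's implementation, uses the identity P(x_i = 1 | sum = n) = p_i P(rest = n - 1) / P(sum = n), where "rest" is the sum with trial i left out. The helper names are hypothetical, and the PMF of the sum is tabulated by a simple recursion instead of the R-function.

```python
def tabulate_sum_pmf(p):
    """PMF of the sum of independent Bernoulli trials, built one trial at a time."""
    pmf = [1.0]
    for pi in p:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - pi)
            nxt[k + 1] += mass * pi
        pmf = nxt
    return pmf


def inclusion_probability(p, i, n):
    """P(x_i = 1 | sum of entries = n) for zero-based index i."""
    if n == 0:
        return 0.0  # no entry can be active when the sum is zero
    total = tabulate_sum_pmf(p)[n]
    if total == 0.0:
        raise ValueError("the conditioning event has zero probability")
    rest = tabulate_sum_pmf(p[:i] + p[i + 1:])  # sum PMF without trial i
    return p[i] * rest[n - 1] / total


p = [0.2, 0.5, 0.9]
probs = [inclusion_probability(p, i, n=2) for i in range(len(p))]
```

A quick sanity check on such a computation is that the inclusion probabilities add up to n, because exactly n entries are active in every feasible state.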

calculate_w(p, dtype=<class 'float'>, undefined_as_nan=False)[source]#

Calculate weights from success probabilities p.

sum_pmf(n)[source]#

Calculate the probability (probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

value of the PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer

sum_log_pmf(n)[source]#

Calculate the log probability (log of the probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the logarithm of the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

logarithm of the value of the PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer

grad_sum_pmf(n)[source]#

Calculate the derivative/gradient of the probability (probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the gradient of the probability mass function (PMF) of a Poisson-Binomial distribution/model.

Note

This function calculates the gradient of sum_pmf() with respect to the distribution parameter, i.e., the success probabilities.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

gradient of the PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer

grad_sum_log_pmf(n)[source]#

Calculate the derivative/gradient of the log-probability (logarithm of the probability mass function) of the sum of the multivariate Bernoulli distribution. This function models the gradient of the log-probability mass function (log-PMF) of a Poisson-Binomial distribution/model.

Note

This function calculates the gradient of sum_log_pmf() with respect to the distribution parameter, i.e., the success probabilities.

Parameters:

n – non-negative integer defining the sum (number of nonzero entries) of a multivariate Bernoulli random variable.

Returns:

gradient of the log-PMF of the Poisson-Binomial model/distribution.

Raises:

TypeError if n is not a non-negative integer

pmf(x, n, batch_as_column=False)[source]#

Calculate the value of the probability mass function (PMF) of a Conditional Bernoulli distribution, evaluated at given binary state/realization x, and with parameters defined by the underlying probability of success.

Parameters:
  • x – scalar or 1D numpy array of binary values (0/1) or bytes

  • n – non-negative integer defining the sum to condition on.

Returns:

value of the PMF of the CB model (probability of x conditioned on the sum)

Raises:

TypeError if the passed x has the wrong shape/size and/or n is not a non-negative integer
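Following the formulation in Chen and Liu, the conditional Bernoulli PMF can be written with the weights w_i = p_i / (1 - p_i) as the product of the weights of the active entries divided by R(n, S), where R(n, S) sums that product over all subsets of size n. The sketch below evaluates R with a simple elementary-symmetric-polynomial recursion; it is a minimal illustration with hypothetical function names, and it does not reproduce the evaluation methods offered by the library's R-function class.

```python
import math


def r_function(w, n):
    """R(n, S): sum over all size-n subsets of the product of their weights."""
    # e[k] is the elementary symmetric polynomial of degree k of the weights seen so far.
    e = [1.0] + [0.0] * n
    for wi in w:
        for k in range(n, 0, -1):  # descending, so each weight is used at most once
            e[k] += wi * e[k - 1]
    return e[n]


def conditional_bernoulli_pmf(x, p, n):
    """P(x | sum(x) = n) for success probabilities strictly below 1."""
    if sum(x) != n:
        return 0.0  # states with the wrong sum are infeasible
    w = [pi / (1.0 - pi) for pi in p]
    numerator = math.prod(wi for wi, xi in zip(w, x) if xi)
    return numerator / r_function(w, n)


p = [0.2, 0.5, 0.9]
value = conditional_bernoulli_pmf([1, 1, 0], p, n=2)
```

Summing this PMF over the three states with exactly two active entries gives 1, as expected for a conditional distribution.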

log_pmf(x, n, batch_as_column=False)[source]#

Calculate the log of the probability mass function (PMF) of a conditional Bernoulli distribution, evaluated at given binary state or a batch of states (random variable realization) x, and with registered parameters theta.

Note

This method is just a wrapper that chooses either _log_pmf() or _batch_log_pmf() based on whether x is 1d or 2d numpy array, respectively.

Parameters:
  • x – scalar, or 1D or 2D numpy array of binary values (0/1) or bytes. If x is a 2D array, the log-PMF is evaluated for each instance of the random variable; batch_as_column selects whether instances are stored as columns or as rows.

  • n – non-negative integer defining the sum to condition on.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

log-PMF (or batch of log-PMF values) of the CB model

Raises:

TypeError if the passed x has the wrong shape/size and/or n is not a non-negative integer

grad_pmf(x, n, batch_as_column=False)[source]#

Calculate the gradient of the probability mass function (PMF) of a conditional Bernoulli distribution, evaluated at given binary state or a batch of states (random variable realization) x, and with parameters theta.

Note

This method is just a wrapper that chooses either _grad_pmf() or _batch_grad_pmf() based on whether x is 1d or 2d numpy array, respectively.

Parameters:
  • x – scalar, or 1D or 2D numpy array of binary values (0/1) or bytes. If x is a 2D array, the gradient is evaluated for each instance of the random variable; batch_as_column selects whether instances are stored as columns or as rows.

  • n – non-negative integer defining the sum to condition on.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

gradient (or batch of gradients) of the PMF of the CB model

Raises:

TypeError if the passed x has the wrong shape/size and/or n is not a non-negative integer

grad_log_pmf(x, n, batch_as_column=False)[source]#

Calculate the gradient of the log-probability mass function (PMF) of a conditional Bernoulli distribution, evaluated at given binary state or a batch of states (random variable realization) x, and with parameters theta.

Note

Because the modeled Bernoulli random variables are assumed uncorrelated, the gradient of the joint log-probability is made up of the partial derivatives of the log-probability of each entry; thus, the result is the same whether joint is True or False.

Parameters:
  • x – scalar, or 1D or 2D numpy array of binary values (0/1) or bytes. If x is a 2D array, the gradient is evaluated for each instance of the random variable; batch_as_column selects whether instances are stored as columns or as rows.

  • n – non-negative integer defining the sum to condition on.

  • batch_as_column – only used if x is a 2D array. If True, each column is regarded as one instance of the random variable; if False (the default), each row is taken as one instance.

Returns:

gradient (or batch of gradients) of the log-PMF of the CB model

Raises:

TypeError if the passed x has the wrong shape/size and/or n is not a non-negative integer

grad_log_pmf_variance(n, total=True)[source]#

Variance (total or elementwise) of the gradient of the logarithm of the PMF. This is useful for statistical analysis and probabilistic optimization approaches.

sample(n, sample_size=1, antithetic=False, dtype=<class 'bool'>)[source]#

Sample a Conditional Bernoulli random variable (1d or multivariate) according to the probability of success p, that is, \(p := P(x=1)\), of the underlying Bernoulli random variable. If antithetic is True, sample_size must be even.

Parameters:
  • n – non-negative integer defining the sum to condition on.

  • sample_size (int) – size of the sample to generate

  • antithetic (bool)

  • dtype (type) – data type of the returned array

  • random_seed (None | int) – the random seed used to initialize the underlying random number generator

Returns:

bernoulli_sample: array of shape sample_size x n where n is the size of p

Raises:

ValueError if n is out of range of possible values or invalid type or the sample size is not a positive integer.

Note

If p is scalar or iterable of length 1, this will be 1d array of size=sample_size. Otherwise, if p is multivariate, this will be 2d array with each row representing one sample.
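A very simple way to draw from a conditional Bernoulli distribution is rejection sampling: draw unconditioned Bernoulli states and keep only those whose sum equals n. This is named here as a swapped-in technique for illustration; nothing on this page says it is the sampler used by the library. The function name and the max_tries guard are hypothetical.

```python
import random


def sample_conditional_bernoulli(p, n, sample_size=1, seed=None, max_tries=100_000):
    """Rejection sampler: keep independent Bernoulli draws whose sum equals n."""
    if not isinstance(n, int) or not 0 <= n <= len(p):
        raise ValueError("n must be an integer between 0 and len(p)")
    if not isinstance(sample_size, int) or sample_size < 1:
        raise ValueError("sample_size must be a positive integer")
    rng = random.Random(seed)
    samples, tries = [], 0
    while len(samples) < sample_size:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("too many rejections; the requested sum may be very unlikely")
        x = [rng.random() < pi for pi in p]
        if sum(x) == n:  # accept only states with the required number of active entries
            samples.append(x)
    return samples


samples = sample_conditional_bernoulli([0.2, 0.5, 0.9], n=2, sample_size=5, seed=1)
```

Rejection sampling becomes slow when P(sum = n) is small, which is one reason dedicated conditional Bernoulli samplers exist.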

expect(func, n, objective_value_tracker=None)[source]#

Calculate the expected value of a function (func) which accepts w as parameter

property poisson_binomial_model#

A handler to the underlying Poisson Binomial model instance

property R_function#

A handler to the underlying R-function instance

class GeneralizedConditionalBernoulliConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='GeneralizedConditionalBernoulli: Conditional Bernoulli model with multiple budgets', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation', budgets=None)[source]#

Bases: ConditionalBernoulliConfigs

Configurations class for the GeneralizedConditionalBernoulliConfigs abstract base class. This class inherits functionality from ConditionalBernoulliConfigs in addition to the following attributes/keys.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • parameter (float | Iterable[float]) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution.

  • name (str) – name of the distribution

  • R_function_evaluation_method (str) – the name of the evaluation method of the R-function. See RFunctionConfigs for supported evaluation methods.

  • budgets (None | Iterable[int]) – None or an iterable of ints with the allowed/feasible budgets. Each budget must be between 0 and the size of the binary variable (inclusive). If None, no budget constraint is asserted; this is equivalent to setting budgets to include all values between 0 and the size of the binary variable (inclusive).

budgets: None | Iterable[int]#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='GeneralizedConditionalBernoulli: Conditional Bernoulli model with multiple budgets', random_seed=None, parameter=0.5, R_function_evaluation_method='tabulation', budgets=None)#
class GeneralizedConditionalBernoulli(configs=None)[source]#

Bases: ConditionalBernoulli

A Generalization of the ConditionalBernoulli model where the sum is allowed to be a set of values rather than just one value.

__init__(configs=None)[source]#

Initialize the random number generator

update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

register_budgets(budgets)[source]#

Set the budget (the sum of the Bernoulli random variable to condition on). The budget can be a single integer or a set of integers. The probability of each budget/size is recalculated. In the former case, the distribution is identical to the parent class. In the latter, the probability is calculated by conditioning on the union of all budgets.

Parameters:

budgets (int|iterable(int)) – either an integer or an iterable e.g., list of integers, defining acceptable budgets (sum of the Bernoulli random variable).

Raises:

TypeError if the type of budgets is not acceptable.

check_registered_budgets()[source]#

Check/validate the registered budgets and their probabilities.

Returns:

the registered budgets/sizes and the corresponding probabilities.

Raises:

TypeError if no valid budget is registered

coverage_probability(i)[source]#

Calculate the inclusion probability (coverage probability) for index i, where the index starts at 0 and ranges to size-1, and size is the dimension of the probability space, that is, the size of p. This probability is conditioned on the registered budgets.

Note

Inclusion probability is the probability that 1 appears in a selected sample in the index i

pmf(x, batch_as_column=False)[source]#

Calculate the value of the probability mass function (PMF) of a Conditional Bernoulli distribution, evaluated at given binary state/realization x, and with parameters defined by the underlying probability of success. The variable x is conditioned by the registered budget.

Parameters:
  • x – 1D or 2D numpy array of binary values (0/10) or bytes. If x is 2D array, each COLUMN is regarded as one instance of the random variable, and the gradient is evaluated for each column. If you want rows to be regarded as random variable, switch batch_as_column to False

  • n – non-negative integer defining the sum to condition on.

  • batch_as_column – Only used if x is a 2D array. If True, and x is two dimensional, each column is regarded as an instance of the random variable (default); otherwise, each row is taken as an instance of the random variable.

Returns:

value of the PMF of the CB model (probability of x conditioned by the sum)

Raises:

TypeError if the passed x has wrong shape/size and/or n is not non-negative integer
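
For intuition, the PMF of a conditional Bernoulli model with a single budget n can be written with the weights \(w_i = p_i/(1-p_i)\) as \(\prod_i w_i^{x_i} / R(n, S)\). The sketch below evaluates this directly from the definition of the R-function; it assumes a single budget and \(p_i < 1\), and both function names are hypothetical.

```python
from itertools import combinations, product
from math import prod


def r_function(n, w):
    """R(n, S) by definition: sum over size-n subsets B of prod_{i in B} w_i."""
    return sum(prod(w[i] for i in B) for B in combinations(range(len(w)), n))


def conditional_bernoulli_pmf(x, p, n):
    """P(x | sum(x) = n) for independent Bernoulli(p_i) components with p_i < 1.

    Hypothetical helper for illustration; a single budget n is assumed.
    """
    if sum(x) != n:
        return 0.0
    w = [p_i / (1.0 - p_i) for p_i in p]
    numerator = prod(w_i for w_i, x_i in zip(w, x) if x_i)
    return numerator / r_function(n, w)


p = [0.2, 0.5, 0.9]
total = sum(conditional_bernoulli_pmf(x, p, 2) for x in product((0, 1), repeat=3))
print(total)  # sums to 1 over the support, up to rounding
```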

log_pmf(x, batch_as_column=False)[source]#

log-PMF conditioned by the registered budgets. This returns the logarithm of pmf().

grad_log_pmf(x, batch_as_column=False, zero_bounds=True)[source]#

Calculate the gradient of the log-probability mass function (PMF) of a generalized conditional Bernoulli distribution, evaluated at given binary state or a batch of states (random variable realization) x, and with parameters theta.

Note

This method is just a wrapper that chooses either _grad_log_pmf() or _batch_grad_log_pmf() based on whether x is 1d or 2d numpy array, respectively.

Parameters:
  • x – scalar, or 1D or 2D numpy array of binary values (0/1) or bytes. If x is a 2D array, each COLUMN is regarded as one instance of the random variable, and the gradient is evaluated for each column. If you want rows to be regarded as instances of the random variable, switch batch_as_column to ``

  • n – non-negative integer defining the sum to condition on.

  • batch_as_column – Only used if x is a 2D array. If True, and x is two dimensional, each column is regarded as an instance of the random variable (default); otherwise, each row is taken as an instance of the random variable.

Returns:

gradient (or batch of gradients) of the log-probability of the GCB model

Raises:

TypeError if the passed x has wrong shape/size and/or n is not non-negative integer

grad_pmf(x, batch_as_column=False)[source]#

Calculate the gradient of the probability mass function (PMF) of a conditional Bernoulli distribution, evaluated at given binary state x, and with parameters theta. The variable x is conditioned by the registered budget.

Parameters:
  • x – scalar, or 1D numpy array of binary values (0/1) or bytes

  • n – non-negative integer defining the sum to condition on.

Returns:

gradient of the probability of the CB model

Raises:

TypeError if the passed x has wrong shape/size and/or n is not non-negative integer

sample(sample_size=1, antithetic=False, dtype=<class 'bool'>)[source]#

Sample a Conditional Bernoulli random variable (1d or multivariate) according to probability of success p, that is \(p := P(x=1)\), of the underlying Bernoulli random variable. If antithetic is True, sample_size must be even. The random variable is conditioned by the registered budgets.

Note

This is similar to ConditionalBernoulli.sample() except that we replace n with the registered budgets. To sample, we first sample sizes based on the probabilities of each budget, and then sample the CB model conditioned by each sample size.

Parameters:
  • n – non-negative integer defining the sum to condition on.

  • sample_size (int) – size of the sample to generate

  • antithetic (bool)

  • dtype (type) – data type of the returned array

  • random_seed (None | int) – dictates the random seed to be used to initialize the underlying random number generator

Returns:

bernoulli_sample: array of shape sample_size x n where n is the size of p

Raises:

ValueError if the sample_size is not a positive integer or if no proper budget registered with nonzero probabilities.

Note

If p is scalar or iterable of length 1, this will be 1d array of size=sample_size. Otherwise, if p is multivariate, this will be 2d array with each row representing one sample.
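
The two-stage procedure described in the note above (draw a size, then draw a state of that size) can be sketched as follows. This is an illustration only: the second stage uses simple rejection sampling, which is an assumption made for brevity and is not claimed to be the library's sampler, and all function names are hypothetical.

```python
import random


def sample_conditional_bernoulli(p, n, rng):
    """Draw one state with sum n by rejection from independent Bernoulli draws."""
    while True:
        x = [rng.random() < p_i for p_i in p]
        if sum(x) == n:
            return x


def sample_with_budgets(p, budgets, budget_probs, sample_size, rng):
    """Stage 1: draw a size per sample. Stage 2: draw a state of that size."""
    sizes = rng.choices(budgets, weights=budget_probs, k=sample_size)
    return [sample_conditional_bernoulli(p, n, rng) for n in sizes]


rng = random.Random(0)
samples = sample_with_budgets([0.3, 0.5, 0.7, 0.4], [1, 2], [0.4, 0.6], 5, rng)
for row in samples:
    print(row)
```

Each row is one sample, matching the documented layout in which a multivariate sample is returned with one sample per row.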

uniform_random_sample(sample_size=1, antithetic=False, dtype=<class 'bool'>)[source]#

Generate sample(s) from the created distribution with the parameter set up to yield equal probabilities of all points in the distribution support. This is useful for generating uninformative (completely blind) random samples from a given distribution. For example, a Bernoulli distribution can provide samples where the probability of success is set to 0.5.

expect(func, objective_value_tracker=None)[source]#

Calculate the expected value of a function (func) which accepts w as parameter

property conditional_bernoulli_model#

Return a reference to the underlying Conditional Bernoulli Model

property budgets#

Copy of the budget sizes list

property budgets_probabilities#

Copy of the budget sizes probabilities

Trajectory Distribution (Markovian)#

This module provides implementation of multiple flavors of the probabilistic policy model implementing distribution of a path on a navigation mesh for OED applications.

class TrajectoryDistributionConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Markovian Trajectory Distribution', random_seed=None, connectivity_matrix=None, grid=None, parameter=None, trajectory_length=1)[source]#

Bases: DistributionConfigs

Configurations class for the TrajectoryDistribution abstract base class. This class inherits functionality from DistributionConfigs and only adds new class-level variables which can be updated as needed.

See DistributionConfigs for more details on the functionality of this class along with a few additional fields. Otherwise TrajectoryDistribution provides the following fields:

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • connectivity_matrix (ndarray | spmatrix | None) – 2D array defining the connectivity structure (connected/not) of possible indexes. The shape of this array/matrix is \(n \times n\) where n is the number of candidate locations (e.g., network vertices). This is very much like the connectivity structure in a network. If the [i, j] entry of this array is True, then the ith candidate location (e.g., vertex/location) is connected (direct path) with the jth candidate location.

  • grid (ndarray | None) – an optional 2d array containing the unique coordinates of the gridpoints corresponding to the rows in connectivity_matrix. This is added only so that the user can visually inspect the grid and plot trajectories if needed. If this is passed it must be a two dimensional array with each row representing a grid point. The gridpoints themselves live in spatial coordinates of dimension equal to the number of columns in the grid. So, if the grid is n times m, then the dimension of connectivity_matrix must be \(n \times n\) and the gridpoints (network vertices) live in a m dimensional space.

  • parameter (ndarray | Iterable[float] | None) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution

  • trajectory_length (int) – the intended trajectory length. This can be easily modified, but it is crucial to define the dimension of the underlying distribution/space. THE TRAJECTORY LENGTH HERE IS DEFINED AS THE NUMBER OF EDGES ON A PATH/TRAJECTORY

  • name (str) – name of the distribution. Note: the trajectory length here is defined as the number of edges on a path/trajectory.
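
As a concrete picture of the `grid` and `connectivity_matrix` shapes described above, the sketch below builds both for a small rectangular mesh. It uses plain Python lists and a 4-neighbor connectivity rule, both of which are assumptions made for illustration; `grid_connectivity` is a hypothetical helper.

```python
def grid_connectivity(n_rows, n_cols):
    """Return (grid, connectivity) for a rectangular mesh with 4-neighbor moves."""
    # Each grid row is one vertex; the columns are its spatial coordinates (m = 2)
    grid = [(float(r), float(c)) for r in range(n_rows) for c in range(n_cols)]
    n = len(grid)
    connectivity = [[False] * n for _ in range(n)]
    for i, (ri, ci) in enumerate(grid):
        for j, (rj, cj) in enumerate(grid):
            # Connect vertices exactly one step apart (no self-loops assumed)
            if abs(ri - rj) + abs(ci - cj) == 1:
                connectivity[i][j] = True
    return grid, connectivity


grid, connectivity = grid_connectivity(2, 3)
print(len(grid), "vertices;", sum(map(sum, connectivity)), "directed connections")
```

Here the grid is 6 by 2 and the connectivity structure is 6 by 6, consistent with the \(n \times m\) and \(n \times n\) shapes in the parameter descriptions. Since the documented types are ndarray or spmatrix, the lists would presumably need converting (for example with `numpy.asarray`) before being passed to the configurations.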

connectivity_matrix: ndarray | spmatrix | None#
grid: ndarray | None#
parameter: ndarray | Iterable[float] | None#
trajectory_length: int#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Markovian Trajectory Distribution', random_seed=None, connectivity_matrix=None, grid=None, parameter=None, trajectory_length=1)#
class TrajectoryDistribution(configs=None)[source]#

Bases: Distribution

An implementation of the Distribution of a Trajectory over a network where the trajectory is described by an initial distribution and a Markovian transition matrix.

Parameters:

configs (dict | TrajectoryDistributionConfigs | None) – (optional) configurations for the model

__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | TrajectoryDistributionConfigs) – configurations to validate. If a TrajectoryDistributionConfigs instance is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

update_connectivity_matrix(connectivity_matrix, parameter=None)[source]#

Update the underlying connectivity matrix.

Parameters:
  • connectivity_matrix (ndarray | spmatrix) – the connectivity matrix

  • parameter (Iterable[float] | None) – the new distribution parameter (initial distribution & transition parameters).

Raises:

PyOEDConfigsValidationError – if either of the arguments is invalid

update_parameter(parameter)[source]#

Update the distribution parameters (initial and transition parameters)

validate_parameter(parameter)[source]#

Check validity of the parameter and return a validated version.

Raises:

TypeError – if the parameter is invalid

split_parameter(parameter)[source]#

Convert a parameter (1d array) to initial parameters and list of transition parameters based on the underlying connectivity matrix

validate_trajectory(trajectory)[source]#

Check validity of the trajectory and return a validated version.

Raises:

TypeError – if the trajectory is invalid

pmf(trajectory, parameter=None)[source]#

Trajectory is 1d array of indexes of the sensor to activate at each time.
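
For a first-order Markovian trajectory, the PMF factorizes into the initial probability of the first index times the transition probabilities along the path. The sketch below shows that product for explicit probability tables; it bypasses the library's parameterization entirely, and `trajectory_pmf` is a hypothetical helper.

```python
def trajectory_pmf(trajectory, initial_probabilities, transition_matrix):
    """P(trajectory) = P(x_0) * prod_t P(x_{t+1} | x_t) for a first-order chain."""
    prob = initial_probabilities[trajectory[0]]
    for i, j in zip(trajectory[:-1], trajectory[1:]):
        prob *= transition_matrix[i][j]
    return prob


# Line graph 0 - 1 - 2; each row of P sums to 1
initial = [0.5, 0.5, 0.0]
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
value = trajectory_pmf([0, 1, 2], initial, P)
print(value)  # 0.5 * 1.0 * 0.5
```

The trajectory `[0, 1, 2]` has two edges, so under the documented convention its trajectory length is 2 even though it visits three vertices.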

log_pmf(trajectory, parameter=None, batch_as_column=False)[source]#

Logarithm of the PMF of a trajectory

grad_log_pmf(trajectory, parameter=None, batch_as_column=False)[source]#

Gradient of the logarithm of the PMF of a trajectory

grad_log_pmf_variance(parameter=None, trajectory_length=None)[source]#

Variance (total) of the gradient of the logarithm of the PMF of a trajectory. This is useful for statistical analysis and probabilistic optimization approaches.

grad_pmf(trajectory, parameter=None, batch_as_column=False)[source]#
initial_probability(i=None, parameter=None)[source]#

Calculate the probability of starting at a specific candidate index i

transition_probability(row, column, parameter=None)[source]#

Calculate the probability of transitioning from candidate i (row) to candidate j (column)

sample(trajectory_length=None, parameter=None, sample_size=1)[source]#

Sample one or more trajectories given the transition parameter

uniform_random_sample(trajectory_length=None, sample_size=1, parameter=None)[source]#

Generate sample(s) from the created distribution with the parameter set up to yield equal probabilities of all points in the distribution support. This is useful for generating uninformative (completely blind) random samples from a given distribution. For example, a Bernoulli distribution can provide samples where the probability of success is set to 0.5.

onestep_bruteforce(bruteforce, parameter=None)[source]#

Generate bruteforce results from partial bruteforce results by appending bruteforce based on the last step

bruteforce(trajectory_length=None, parameter=None)[source]#

Generate all possible trajectories of a given length.
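
The relationship between the one-step extension and the full enumeration can be sketched as below: start from every vertex and extend each partial path by one connected vertex per step. This is a minimal illustration that returns only the paths; whether the library also returns probabilities alongside them is not stated here, and the function names are hypothetical.

```python
def onestep(paths, connectivity):
    """Extend every partial path by each vertex connected to its last vertex."""
    return [
        path + [j]
        for path in paths
        for j, connected in enumerate(connectivity[path[-1]])
        if connected
    ]


def all_trajectories(connectivity, trajectory_length):
    """All paths with `trajectory_length` edges (trajectory_length + 1 vertices)."""
    paths = [[i] for i in range(len(connectivity))]
    for _ in range(trajectory_length):
        paths = onestep(paths, connectivity)
    return paths


# Line graph 0 - 1 - 2
connectivity = [
    [False, True, False],
    [True, False, True],
    [False, True, False],
]
paths = all_trajectories(connectivity, 2)
print(paths)
```

The number of paths grows exponentially with the trajectory length, so this kind of enumeration is only practical for small networks.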

property parameter_initial_indexes#

A lazy property that generates a 1d array of integer indexes associated with each row in the connectivity matrix defining initial index in the parameter vector for each row

property conditional_Bernoulli_model#

The conditional Bernoulli model used for computations of pmf and grad pmf with budget 1

property trajectory_length#

Trajectory length (the number of edges on a path/trajectory)

property parameter_size#

Parameter size

property connectivity_matrix#
property parameter#
property size#

Return the dimension/size of the underlying probability space.

Return type:

int

property num_candidates#
property initial_parameters#
property initial_probabilities#

List of initial probabilities corresponding to initial distribution

property transition_parameters#
property transition_parameters_matrix#

Full transition parameters matrix filled up with transition parameters.

property transition_probabilities#

List of transition probabilities corresponding to connectivity indexes

property transition_probability_matrix#

Full transition probability matrix filled up with transition probabilities.

property connectivity_indexes#
property default_parameter#

Parameter for first-order model (ignoring any higher order)

class HigherOrderTrajectoryDistributionConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Higher Order Markovian Trajectory Distribution', random_seed=None, connectivity_matrix=None, grid=None, parameter=None, trajectory_length=1, order=1, lag_dependent_transitions=False, freeze_lag_weights=False)[source]#

Bases: TrajectoryDistributionConfigs

Configurations class for the HigherOrderTrajectoryDistribution class. This class inherits functionality from TrajectoryDistributionConfigs and only adds new class-level variables which can be updated as needed.

Note

This model relies on Raftery’s model and its extension.

Note

The trajectory length here is defined as the number of edges on a path/trajectory, and it must be greater than or equal to the order.

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • random_seed (int | None) – random seed used for pseudo random number generation

  • connectivity_matrix (ndarray | spmatrix | None) – 2D array defining the connectivity structure (connected/not) of possible indexes. The shape of this array/matrix is \(n \times n\) where n is the number of candidate locations (e.g., network vertices). This is very much like the connectivity structure in a network. If the [i, j] entry of this array is True, then the ith candidate location (e.g., vertex/location) is connected (direct path) with the jth candidate location.

  • grid (ndarray | None) – an optional 2d array containing the unique coordinates of the gridpoints corresponding to the rows in connectivity_matrix. This is added only so that the user can visually inspect the grid and plot trajectories if needed. If this is passed it must be a two dimensional array with each row representing a grid point. The gridpoints themselves live in spatial coordinates of dimension equal to the number of columns in the grid. So, if the grid is n times m, then the dimension of connectivity_matrix must be \(n \times n\) and the gridpoints (network vertices) live in a m dimensional space.

  • parameter (ndarray | Iterable[float] | None) – probability of success of the Bernoulli trials. This determines the dimension of the probability distribution

  • trajectory_length (int) – the intended trajectory length. This can be easily modified, but it is crucial to define the dimension of the underlying distribution/space. THE TRAJECTORY LENGTH HERE IS DEFINED AS THE NUMBER OF EDGES ON A PATH/TRAJECTORY

  • name (str) – name of the distribution. Note: the trajectory length here is defined as the number of edges on a path/trajectory.

  • order (int) – the order of the Markov Chain model used to model the chain with memory.

  • lag_dependent_transitions (bool) – if True, the generalized Raftery model (transition matrices vary between lags) is used. If False, the same transition matrix is used for all lags.

  • freeze_lag_weights (bool) – If True, the lag weights are regarded as constants. This means the derivative of the distribution with respect to those is set to zero, and they are thus not updated by any optimization procedure.
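
The roles of `order`, `lag_dependent_transitions`, and the lag weights can be pictured with Raftery's mixture transition distribution, in which the next-state probability is a weighted mixture over lags: \(P(x_t = j \mid x_{t-1}, \ldots, x_{t-l}) = \sum_{g=1}^{l} \lambda_g\, Q_g[x_{t-g}, j]\). The sketch below evaluates that formula for explicit tables. How PyOED maps its parameter vector onto \(\lambda\) and \(Q\) is not described here, so the function and its arguments are hypothetical.

```python
def mtd_conditional_probability(history, j, lag_weights, transition_matrices):
    """P(next = j | history) = sum_g lambda_g * Q_g[history[-g]][j].

    history[-1] is the most recent state. One matrix per lag gives the
    lag-dependent (generalized) variant; repeating a single matrix gives
    the original model with a shared transition matrix.
    """
    return sum(
        weight * Q[history[-lag]][j]
        for lag, (weight, Q) in enumerate(zip(lag_weights, transition_matrices), start=1)
    )


Q = [[0.1, 0.9], [0.6, 0.4]]
lag_weights = [0.7, 0.3]   # non-negative and summing to 1; order = 2
row = [mtd_conditional_probability([0, 1], j, lag_weights, [Q, Q]) for j in range(2)]
print(row)
```

Because each row of `Q` sums to 1 and the lag weights sum to 1, the mixture is itself a valid probability distribution over the next state.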

order: int#
lag_dependent_transitions: bool#
freeze_lag_weights: bool#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='Higher Order Markovian Trajectory Distribution', random_seed=None, connectivity_matrix=None, grid=None, parameter=None, trajectory_length=1, order=1, lag_dependent_transitions=False, freeze_lag_weights=False)#
class HigherOrderTrajectoryDistribution(configs=None)[source]#

Bases: TrajectoryDistribution

An implementation of the Distribution of a Trajectory over a network where the trajectory is described by an initial distribution and a Markovian transition matrix.

Parameters:

configs (dict | HigherOrderTrajectoryDistributionConfigs | None) – (optional) configurations for the model

__init__(configs=None)[source]#

Initialize the random number generator

validate_configurations(configs, raise_for_invalid=True)[source]#

Validation stage for the passed configs.

Parameters:

configs (dict | HigherOrderTrajectoryDistributionConfigs) – configurations to validate. If a HigherOrderTrajectoryDistributionConfigs instance is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
update_configurations(**kwargs)[source]#

Take any set of keyword arguments, look up each in the configurations, and update as necessary/possible/valid.

update_parameter(parameter)[source]#

Update the distribution parameters (initial and transition parameters)

validate_parameter(parameter)[source]#

Check validity of the parameter and return a validated version.

Raises:

TypeError – if the parameter is invalid

split_parameter(parameter)[source]#

Convert a parameter (1d array) to initial parameters and list of transition parameters based on the underlying connectivity matrix

validate_trajectory(trajectory)[source]#

Check validity of the trajectory and return a validated version.

Raises:

TypeError – if the trajectory is invalid

generate_all_paths(i, j, length, parameter=None)[source]#

Generate all candidate paths between two points i and j

nstep_transition_probability(i, j, n, parameter=None)[source]#

The probability of going from point i to point j in n steps. This is the total probability of all candidate paths of length (number of nodes) n+1 from i to j.

Warning

This is far too slow when n is large. Instead, one should obtain the transition probability matrix by calling self.transition_probability_matrix(), recursively multiply the transposed transition matrix n times with a cardinality vector e_i, and then extract the jth component
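
The faster alternative suggested in the warning can be sketched for a first-order transition matrix as follows: start from the unit vector \(e_i\), apply the transposed matrix n times, and read off component j. This is a plain-Python illustration with an explicit matrix, and `nstep_probability` is a hypothetical name.

```python
def nstep_probability(P, i, j, n):
    """P(X_n = j | X_0 = i) via n products of P^T with the unit vector e_i."""
    size = len(P)
    v = [1.0 if k == i else 0.0 for k in range(size)]   # e_i
    for _ in range(n):
        # v <- P^T v, i.e. v_new[c] = sum_r P[r][c] * v[r]
        v = [sum(P[r][c] * v[r] for r in range(size)) for c in range(size)]
    return v[j]


P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
two_step = nstep_probability(P, 0, 2, 2)
print(two_step)
```

The cost is n matrix-vector products rather than an enumeration of all paths, which grows exponentially with n.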

transition_probability(row=None, column=None, parameter=None, order=1, P=None)[source]#

Construct and return the transition probability matrix from the parameter (passed or default). If a specific row is passed, only values held in that row are returned. If a specific column is passed (row must be given), the value of the transition probability from row to column indexes is returned.

Note

order refers to n in the n-step transition probability. Thus, it has to be positive integer.

Warning

Any internal method that repeatedly calls transition_probability will be slow, since the full first-order transition matrix may be reconstructed on every call. Thus, if any method calls this internally, the implementation should be copied and/or the first-order probability matrix should be cached

transition_probability_gradient(row, column, parameter=None)[source]#

Evaluate the gradient of the transition probability from (row) to (column) with respect to model parameter.

conditional_transition_probability(trajectory, j, parameter=None, P=None)[source]#

Calculate the conditional probability of transitioning from candidate trajectory to candidate j

If P is passed, it MUST be 2d Array-like that contains first-order transition probabilities

conditional_transition_probability_gradient(trajectory, j, parameter=None, P=None, P_grad=None, return_P_grad=False)[source]#

Calculate the gradient of the conditional probability of transitioning from candidate trajectory to candidate j. The derivative/gradient is calculated with respect to the components of the parameter.

If P is passed, it MUST be 2d Array-like that contains first-order transition probabilities

sample(trajectory_length=None, parameter=None, sample_size=1)[source]#

Sample one or more trajectories given the transition parameter

uniform_random_sample(trajectory_length=None, sample_size=1, parameter=None)[source]#

Generate sample(s) from the created distribution with the parameter set up to yield equal probabilities of all points in the distribution support. This is useful for generating uninformative (completely blind) random samples from a given distribution. For example, a Bernoulli distribution can provide samples where the probability of success is set to 0.5.

higher_order_onestep_bruteforce(bruteforce, parameter=None)[source]#

Generate bruteforce results from partial bruteforce results by appending bruteforce based on the last step

onestep_bruteforce(bruteforce, parameter=None)[source]#

Generate bruteforce results from partial bruteforce results by appending bruteforce based on the last step

bruteforce(trajectory_length=None, parameter=None)[source]#

Generate all possible trajectories of a given length.

property first_order_parameter_size#
property parameter_size#

Parameter size

property default_first_order_parameter#
property default_parameter#

Parameter for first-order model (ignoring any higher order)

property order#

Lag/order of the higher-order Markov Model

property initial_policy#

A lazy object to evaluate the initial policy values

property lag_parameters#
property lag_dependent_transitions#
property freeze_lag_weights#
create_higher_order_model(trajectory_length=3, order=1, fully_connected=False)[source]#

Combinatorial Functions#

This module provides access to useful combinatorial functions and tools.

class RFunctionConfigs(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='R-Function', method='tabulation')[source]#

Bases: PyOEDConfigs

Configurations class for the RFunction abstract base class. This class inherits functionality from PyOEDConfigs and only adds new class-level variables which can be updated as needed.

See PyOEDConfigs for more details on the functionality of this class along with a few additional fields. Otherwise RFunction provides the following fields:

Parameters:
  • verbose (bool) – a boolean flag to control verbosity of the object.

  • debug (bool) – a boolean flag that enables adding extra functionality in a debug mode

  • output_dir (str | Path) – the base directory where the output files will be saved.

  • name (str) – name of the class. Default is ‘R-Function’.

  • method (str) –

    the method to use for calculating the R-function and its derivative. Only two values are accepted:

    • ’recursion’: The first method with closed form-recurrence relation is used.

    • ’tabulation’: (default) The second method where values of the R-function and derivatives are tabulated row-by-row.

name: str#
method: str#
__init__(*, debug=False, verbose=False, output_dir='./_PYOED_RESULTS_', name='R-Function', method='tabulation')#
class RFunction(configs=None)[source]#

Bases: PyOEDObject

Implementations of the R-function along with its derivatives. The code here provides two methods for calculating the value of the R-function \(R(n, S)\) for a given set of weights \((w_1, w_2, \ldots, w_{N})\), where \(S:=\{1, 2, \ldots, N\}\).

Parameters:

configs (dict | RFunctionConfigs | None) – (optional) configurations for the R-function. Configurations are ported from RFunctionConfigs.

__init__(configs=None)[source]#
validate_configurations(configs, raise_for_invalid=True)[source]#

Each simulation model SHOULD implement its own function that validates its own configurations. If the validation is self-contained (validates all configurations), then that’s it. However, one can just validate the configurations of the immediate class and call super to validate configurations associated with the parent class.

If one does not wish to do any validation (we strongly advise against that), simply add the signature of this function to the model class.

Note

The purpose of this method is to make sure that the settings in the configurations object self._CONFIGURATIONS are of the right type/values and are conformable with each other. This function is called upon instantiation of the object, and each time a configuration value is updated. Thus, this function needs to be inexpensive and should not do heavy computations.

Parameters:

configs (dict | RFunctionConfigs) – configurations to validate. If a RFunctionConfigs object is passed, validation is performed on the entire set of configurations. However, if a dictionary is passed, validation is performed only on the configurations corresponding to the keys in the dictionary.

Raises:
  • PyOEDConfigsValidationError – if the configurations are invalid and raise_for_invalid is set to True.

  • AttributeError – if any (or a group) of the configurations does not exist in the model configurations RFunctionConfigs.

calculate_w(p, dtype=<class 'decimal.Decimal'>, log=False, undefined_as_nan=False)[source]#

Calculate Bernoulli weights w from success probabilities p. The weights are defined as:

\[w_i = \frac{p_i}{1-p_i}\]

Note

The weights cannot be evaluated for any value of p equal to 1.

Parameters:
  • p (Iterable[float]) – a sequence of success probabilities.

  • dtype (type) – Data type (must be a callable to transform into the input to the desired data type)

  • log (bool) – return the logarithm of w if True

  • undefined_as_nan (bool) – if True, set the value of w to nan for any value of the probability outside the domain [0, 1)

Returns:

a sequence (of the same length as p) with weights of type decimal.Decimal

Raises:

ValueError – if any of the probabilities are not in the interval [0, 1) and undefined_as_nan is False

Return type:

Iterable[Decimal | float]
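
The weight formula and the documented handling of out-of-domain probabilities can be sketched as follows. This is a standalone illustration using `decimal.Decimal`, mirroring the documented `log` and `undefined_as_nan` flags but not reproducing the library's code; `bernoulli_weights` is a hypothetical name.

```python
from decimal import Decimal


def bernoulli_weights(p, log=False, undefined_as_nan=False):
    """Return w_i = p_i / (1 - p_i), or log(w_i), as Decimal values."""
    weights = []
    for p_i in p:
        if not 0 <= p_i < 1:
            if undefined_as_nan:
                weights.append(Decimal("NaN"))   # outside the domain [0, 1)
                continue
            raise ValueError(f"probability {p_i} is outside [0, 1)")
        p_i = Decimal(str(p_i))
        w_i = p_i / (1 - p_i)
        weights.append(w_i.ln() if log else w_i)
    return weights


weights = bernoulli_weights([0.5, 0.2])
print(weights)
```

Converting through `str` keeps the decimal literal the user typed (for example 0.2) instead of the binary floating-point approximation.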

evaluate(n, w, log=False, dtype=<class 'decimal.Decimal'>)[source]#

Evaluate the value of the R-function \(R(n, S)\) where \(S=\{1, 2, \ldots, N\}\), with \(N\) being the length/size of the weights vector \(w\).

Note

This method is a wrapper that calls either evaluate_by_recursion() or evaluate_by_tabulation() based on the registered evaluation method.

Parameters:
  • n (int) – an integer which defines the first argument of the R-function.

  • w (iterable) – the vector of weights derived from Bernoulli trials parameters.

  • log (bool) – if True return the logarithm (natural logarithm) log(R(n, S)), otherwise return the value of R(n, S)

Returns:

the value(s) of R(n, S) either as a scalar (if n is not None) or a sequence if n is None.

Return type:

decimal.Decimal or a list of decimal.Decimal values

evaluate_by_recursion(n, w, log=False, dtype=<class 'decimal.Decimal'>, enforce_non_negative_R=False)[source]#

Evaluate the value of the R-function \(R(n, S)\) where \(S=\{1, 2, \ldots, N\}\), with \(N\) being the length/size of the weights vector \(w\). Here, the recursion method is used.

Calculate the R(n, S) function value, where \(S:=\{1, 2, \ldots, N\}\), \(N\) is the length/size of w, and w is the weights vector calculated from the probability of success \(\theta\) of a multivariate Bernoulli distribution as \(w=\frac{\theta}{1-\theta}\). The R-function is given by:

\[R(z, S) := \sum_{B\subseteq S\,; |B|=z} \prod_{i\in B} w_i \,;\, w_i := \frac{\theta_i }{1-\theta_i}\]

This is calculated using the recurrence relation

\[R(z, S) = \frac{1}{z} \sum_{i=1}^{z} (-1)^{i+1} T(i, S) R(z-i, S)\,;\, T(i, S) := \sum_{j\in S} w_j^{i}\]

Warning

This method is numerically unstable, especially for large values of N!

Note

The R function returns very high numbers for large dimensions (nature of combinatorics), and thus one shouldn’t use numpy arrays to store such values. We have to use native Python numbers (and store things in lists).

Parameters:
  • n (int) – an integer which defines the first argument of the R-function.

  • w (iterable) – the vector of weights derived from Bernoulli trials parameters.

  • log (bool) – if True return the logarithm (natural logarithm) log(R(n, S)), otherwise return the value of R(n, S)

Returns:

the value(s) of R(n, S) either as a scalar (if n is not None) or a sequence if n is None.

Return type:

decimal.Decimal or a list of decimal.Decimal values

Raises:

TypeError – if n is not an integer or w is of an unrecognized type
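
The recurrence above can be sketched in a few lines. To sidestep the numerical instability mentioned in the warning, this illustration uses exact rational arithmetic via `fractions.Fraction`, which is a substitution made here for clarity (the documented methods use `decimal.Decimal`); `r_by_recursion` is a hypothetical name.

```python
from fractions import Fraction


def r_by_recursion(n, w):
    """Return R(n, S) using the power-sum recurrence, in exact arithmetic."""
    w = [Fraction(w_i) for w_i in w]
    R = [Fraction(1)]                                   # R(0, S) = 1
    for k in range(1, n + 1):
        total = Fraction(0)
        for i in range(1, k + 1):
            T_i = sum(w_j ** i for w_j in w)            # T(i, S)
            total += (-1) ** (i + 1) * T_i * R[k - i]
        R.append(total / k)
    return R[n]


value = r_by_recursion(2, [1, 2, 3])
print(value)  # 1*2 + 1*3 + 2*3
```

The alternating signs in the sum are the source of cancellation errors in floating point, which is why the exact type matters in this sketch.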

evaluate_by_tabulation(n, w, log=False, dtype=<class 'decimal.Decimal'>)[source]#

Evaluate the value of the R-function \(R(n, S)\) (or its logarithm) where \(S=\{1, 2, \ldots, N\}\), with \(N\) being the length/size of the weights vector \(w\). Here, the tabulation method is used.

Calculate the R(n, S) function value, where \(S:=\{1, 2, \ldots, N\}\), \(N\) is the length/size of w, and w is the weights vector calculated from the probability of success \(\theta\) of a multivariate Bernoulli distribution as \(w=\frac{\theta}{1-\theta}\). The R-function is given by:

\[R(z, S) := \sum_{B\subseteq S\,; |B|=z} \prod_{i\in B} w_i \,;\, w_i := \frac{\theta_i }{1-\theta_i}\]

This is calculated using a tabulated relationship with c(i, j) entry of the table calculated by the following recurrence relation

\[c(i, j) = \frac{1}{z} \sum_{i=1}^{z} (-1)^{i+1} T(i, S) R(z-i, S)\,;\, T(i, S) := \sum_{j\in S} w_j^{i} \, i<=j \,, i=0, 1, \ldots, N\]

and \(R(z, S)\) is the value in the cell c(z, ), and N is the cardinality of \(S:=\{1, 2, \ldots, N\}\).

Note

The R function returns very high numbers for large dimensions (nature of combinatorics), and thus one can’t use numpy arrays to store such values. We have to use native Python numbers (and store things in lists).

Warning

This method is numerically unstable, especially for large values of N!

Parameters:
  • n (int|None) – if an integer is passed, it must be in the interval [0, N] where N is the size/dimension of the probability distribution. If None, the values of R(n, S) for all possible values of n are returned.

  • w (iterable) – the vector of weights derived from Bernoulli trials parameters.

  • log (bool) – if True return the logarithm (natural logarithm) log(R(n, S)), otherwise return the value of R(n, S)

Returns:

the value(s) of R(n, S) either as a scalar (if n is not None) or a sequence if n is None.

Return type:

decimal.Decimal or a list of decimal.Decimal values

Raises:
  • TypeError – if n is not an integer or w is of an unrecognized type

  • ValueError – if any of the weights in w fall outside the interval [0, 1].
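
A common way to tabulate the R-function is to add one weight at a time, using \(R(k, \{1..j\}) = R(k, \{1..j-1\}) + w_j\, R(k-1, \{1..j-1\})\). The sketch below uses that scheme with a single rolling row; it is an assumption about the general technique, and the library's exact table layout may differ. `r_by_tabulation` is a hypothetical name.

```python
from decimal import Decimal


def r_by_tabulation(w, n=None):
    """Return R(n, S), or [R(0, S), ..., R(N, S)] when n is None."""
    w = [Decimal(str(w_j)) for w_j in w]
    table = [Decimal(1)] + [Decimal(0)] * len(w)     # R(k, {}) for k = 0..N
    for w_j in w:
        # Update in decreasing k so table[k - 1] still excludes item j
        for k in range(len(w), 0, -1):
            table[k] += w_j * table[k - 1]
    return table if n is None else table[n]


values = r_by_tabulation([1, 2, 3])
print(values)
```

The returned values are the coefficients of \(\prod_j (1 + w_j x)\), and every term added is non-negative for non-negative weights, so this scheme avoids the alternating-sign cancellation of the power-sum recurrence.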

gradient(n, w, dtype=<class 'decimal.Decimal'>, log=False)[source]#

Evaluate the gradient of the R-function \(R(n, S)\), or its logarithm, with respect to the weights w.

Note

This method is a wrapper that calls either gradient_by_recursion() or gradient_by_tabulation() based on the registered evaluation method.

gradient_by_recursion(n, w, dtype=<class 'decimal.Decimal'>, log=False, enforce_non_negative_R=False)[source]#

Evaluate the gradient of the R-function \(R(n, S)\) with respect to the weights w. This method returns the derivative of the result generated by evaluate_by_recursion(), and accepts the same arguments.

Note

If log is True this function returns the gradient of the logarithm of the R-function. This is simply evaluated by applying the rule of derivative of the logarithm.

gradient_by_tabulation(n, w, dtype=<class 'decimal.Decimal'>, log=False)[source]#

Evaluate the gradient of the R-function \(R(n, S)\) (or its logarithm) with respect to the weights w. This method returns the derivative of the result generated by evaluate_by_tabulation(), and accepts the same arguments.
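
Both gradient methods rest on two facts: \(\partial R(n, S)/\partial w_i = R(n-1, S\setminus\{i\})\), because \(R\) is linear in each weight, and \(\nabla \log R = \nabla R / R\). The sketch below combines them using exact rational arithmetic; it is an illustration under those identities rather than the library's implementation, and the function names are hypothetical.

```python
from fractions import Fraction


def r_values(w):
    """Return [R(0, S), ..., R(N, S)] by adding one weight at a time."""
    table = [Fraction(1)] + [Fraction(0)] * len(w)
    for w_j in w:
        for k in range(len(w), 0, -1):
            table[k] += Fraction(w_j) * table[k - 1]
    return table


def r_gradient(n, w, log=False):
    """Gradient of R(n, S), or of log R(n, S), with respect to the weights."""
    w = list(w)
    if n == 0:
        return [Fraction(0)] * len(w)      # R(0, S) = 1 is constant
    # dR(n, S)/dw_i = R(n - 1, S without i)
    grad = [r_values(w[:i] + w[i + 1:])[n - 1] for i in range(len(w))]
    if log:
        R = r_values(w)[n]
        grad = [g / R for g in grad]       # d(log R) = dR / R
    return grad


grad = r_gradient(2, [1, 2, 3])
print(grad)
```

For weights (1, 2, 3) and n = 2, each entry is the sum of the other two weights, which is easy to verify by hand.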

property verbose#

Screen verbosity of the model

property method#

Return the name of the evaluation method used