mici.adapters module

Methods for adaptively setting algorithmic parameters of transitions.

class mici.adapters.Adapter

Bases: ABC

Abstract adapter for implementing schemes to adapt transition parameters.

Adaptation schemes are assumed to be based on updating a collection of adaptation variables (collectively termed the adapter state here) after each chain transition based on the sampled chain state and/or statistics of the transition such as an acceptance probability statistic. After completing a chain of one or more adaptive transitions, the final adapter state may be used to perform a final update to the transition parameters.

abstract finalize(adapt_states, chain_states, transition, rngs)

Update transition parameters based on final adapter state or states.

Optionally, if multiple adapter states are available, e.g. from a set of independent adaptive chains, then the adaptation information from all the chains may be combined to set the transition parameter(s).

Parameters:
  • adapt_states (Union[AdapterState, Iterable[AdapterState]]) – Final adapter state or a list of per chain adapter states. Arrays / buffers associated with the adapter state entries may be recycled to reduce memory usage - if so the corresponding entries will be removed from the adapter state dictionary / dictionaries.

  • chain_states (Union[ChainState, Iterable[ChainState]]) – Final state of chain (or states of chains) in current sampling stage. May be updated in-place if transition parameters altered by adapter require updating any state components.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects will be updated in-place by the method.

  • rngs (Union[Generator, Iterable[Generator]]) – Random number generator for the chain or a list of per-chain random number generators. Used to resample any components of states needing to be updated due to adaptation if required.

abstract initialize(chain_state, transition)

Initialize adapter state prior to starting adaptive transitions.

Parameters:
  • chain_state (ChainState) – Initial chain state adaptive transition will be started from. May be used to calculate initial adapter state but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

Returns:

Initial adapter state.

Return type:

AdapterState

abstract property is_fast: bool

Whether the adapter is ‘fast’ or ‘slow’.

An adapter which requires only local information to adapt the transition parameters should be classified as fast, while one which requires more global information, and so more chain iterations, should be classified as slow, i.e. is_fast == False.

abstract update(adapt_state, chain_state, trans_stats, transition)

Update adapter state after sampling from transition being adapted.

Parameters:
  • adapt_state (AdapterState) – Current adapter state. Entries will be updated in-place by the method.

  • chain_state (ChainState) – Current chain state following sampling from transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • trans_stats (TransitionStatistics) – Dictionary of statistics associated with transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.
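The initialize / update / finalize life cycle described above can be illustrated with a toy adapter. This is only a sketch under stated assumptions: the class MeanAcceptStatAdapter, the pooled_accept_stat attribute and the SimpleNamespace stand-in for a transition are all hypothetical, and the class deliberately does not subclass mici.adapters.Adapter so that the snippet runs without the library installed. Only the method signatures and the ‘accept_stat’ key are taken from the documentation.

```python
from types import SimpleNamespace


class MeanAcceptStatAdapter:
    """Toy adapter (hypothetical) tracking the mean acceptance statistic."""

    # Only per-transition (local) information is used, so classify as fast.
    is_fast = True

    def initialize(self, chain_state, transition):
        # The adapter state is a dictionary of adaptation variables.
        return {"iter": 0, "mean_accept_stat": 0.0}

    def update(self, adapt_state, chain_state, trans_stats, transition):
        # Entries of adapt_state are updated in place; chain_state and
        # trans_stats are only read, never mutated.
        adapt_state["iter"] += 1
        error = trans_stats["accept_stat"] - adapt_state["mean_accept_stat"]
        adapt_state["mean_accept_stat"] += error / adapt_state["iter"]

    def finalize(self, adapt_states, chain_states, transition, rngs):
        # Accept either a single adapter state or a list of per-chain states.
        if isinstance(adapt_states, dict):
            adapt_states = [adapt_states]
        n_total = sum(s["iter"] for s in adapt_states)
        weighted_sum = sum(s["iter"] * s["mean_accept_stat"] for s in adapt_states)
        # Hypothetical attribute, set in place on the transition object.
        transition.pooled_accept_stat = weighted_sum / n_total


# Stand-in for a transition object; a real one would come from the library.
transition = SimpleNamespace()
adapter = MeanAcceptStatAdapter()
per_chain_stats = [[1.0, 0.5], [0.0]]  # made-up accept_stat values, two chains
adapt_states = []
for chain_stats in per_chain_stats:
    adapt_state = adapter.initialize(chain_state=None, transition=transition)
    for accept_stat in chain_stats:
        adapter.update(adapt_state, None, {"accept_stat": accept_stat}, transition)
    adapt_states.append(adapt_state)
adapter.finalize(adapt_states, chain_states=None, transition=transition, rngs=None)
```

Pooling in finalize weights each chain by its number of adaptive iterations, which is one simple way of combining per-chain adapter states; other combination rules are equally compatible with the interface.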

class mici.adapters.DualAveragingStepSizeAdapter(adapt_stat_target=0.8, adapt_stat_func=None, log_step_size_reg_target=None, log_step_size_reg_coefficient=0.05, iter_decay_coeff=0.75, iter_offset=10, max_init_step_size_iters=100, log_step_size_reducer=None)

Bases: Adapter

Dual averaging integrator step size adapter.

Implementation of the dual averaging step size adaptation algorithm described in Hoffman and Gelman (2014), a modified version of the stochastic optimisation scheme of Nesterov (2009). By default the adaptation is performed to control the accept_stat statistic of an integration transition to be close to a target value, but the statistic adapted on can be altered by changing the adapt_stat_func argument.

References

  1. Hoffman, M.D. and Gelman, A. (2014). The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15(1), pp.1593-1623.

  2. Nesterov, Y. (2009). Primal-dual subgradient methods for convex problems. Mathematical programming 120(1), pp.221-259.

Parameters:
  • adapt_stat_target (float) – Target value for the transition statistic being controlled during adaptation.

  • adapt_stat_func (Optional[AdaptationStatisticFunction]) – Function which given a dictionary of transition statistics outputs the value of the statistic to control during adaptation. By default this is set to a function which simply selects the ‘accept_stat’ value in the statistics dictionary.

  • log_step_size_reg_target (Optional[float]) – Value to regularize the controlled output (logarithm of the integrator step size) towards. If None, set to log(10 * init_step_size), where init_step_size is the initial ‘reasonable’ step size found by a coarse search as recommended in Hoffman and Gelman (2014). This gives the dual averaging algorithm a tendency towards testing step sizes larger than the initial value, as integrating with a larger step size typically has a lower computational cost.

  • log_step_size_reg_coefficient (float) – Coefficient controlling amount of regularisation of controlled output (logarithm of the integrator step size) towards log_step_size_reg_target. Defaults to 0.05 as recommended in Hoffman and Gelman (2014).

  • iter_decay_coeff (float) – Coefficient controlling exponent of decay in schedule weighting stochastic updates to smoothed log step size estimate. Should be in the interval (0.5, 1] to ensure asymptotic convergence of adaptation. A value of 1 gives equal weight to the whole history of updates while setting to a smaller value increasingly highly weights recent updates, giving a tendency to ‘forget’ early updates. Defaults to 0.75 as recommended in Hoffman and Gelman (2014).

  • iter_offset (int) – Offset used for the iteration based weighting of the adaptation statistic error estimate. Should be set to a non-negative value. A value > 0 has the effect of stabilising early iterations. Defaults to the value of 10 as recommended in Hoffman and Gelman (2014).

  • max_init_step_size_iters (int) – Maximum number of iterations to use in initial search for a reasonable step size with an mici.errors.AdaptationError exception raised if a suitable step size is not found within this many iterations.

  • log_step_size_reducer (Optional[ReducerFunction]) – Reduction to apply to final per-chain step size estimates to produce the overall integrator step size for the main chain stages. The specified function should accept a sequence of logarithms of estimated step sizes and output a non-negative step size to use. If None (the default), a function which computes the arithmetic mean of the per-chain step sizes is used.
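To make the roles of the parameters above concrete, the following is a minimal sketch of a single dual averaging update as given in Hoffman and Gelman (2014). It is not the library's implementation: the function names and state keys are hypothetical, and the mapping of the documented parameters onto the paper's symbols (log_step_size_reg_target as μ, log_step_size_reg_coefficient as γ, iter_decay_coeff as κ, iter_offset as t₀) is an assumption based on the descriptions above.

```python
import math


def init_dual_averaging_state():
    # Smoothed log step size and running error average both start at zero.
    return {"iter": 0, "smoothed_log_step_size": 0.0, "adapt_stat_error": 0.0}


def dual_averaging_update(
    state,
    adapt_stat,
    adapt_stat_target=0.8,
    log_step_size_reg_target=0.0,
    log_step_size_reg_coefficient=0.05,
    iter_decay_coeff=0.75,
    iter_offset=10,
):
    """Apply one update in place and return the log step size to try next."""
    state["iter"] += 1
    m = state["iter"]
    # Running average of (target - statistic); iter_offset damps early updates.
    error_weight = 1 / (m + iter_offset)
    state["adapt_stat_error"] = (
        (1 - error_weight) * state["adapt_stat_error"]
        + error_weight * (adapt_stat_target - adapt_stat)
    )
    # Log step size for the next transition, regularized towards the target.
    log_step_size = (
        log_step_size_reg_target
        - math.sqrt(m) / log_step_size_reg_coefficient * state["adapt_stat_error"]
    )
    # Smoothed estimate; weights decay as m ** -iter_decay_coeff.
    smoothing_weight = m ** -iter_decay_coeff
    state["smoothed_log_step_size"] = (
        smoothing_weight * log_step_size
        + (1 - smoothing_weight) * state["smoothed_log_step_size"]
    )
    return log_step_size


state = init_dual_averaging_state()
# Acceptance statistic above the target, so the step size should increase.
log_step_size = dual_averaging_update(state, adapt_stat=1.0)
```

With the regularization target at zero, a first acceptance statistic of 1.0 against a target of 0.8 moves the log step size above the target, while a statistic below 0.8 would move it below. In this sketch the smoothed value is the one that would be exponentiated as the per-chain step size estimate at the end of adaptation.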

finalize(adapt_states, chain_states, transition, rngs)

Update transition parameters based on final adapter state or states.

Optionally, if multiple adapter states are available, e.g. from a set of independent adaptive chains, then the adaptation information from all the chains may be combined to set the transition parameter(s).

Parameters:
  • adapt_states (Union[AdapterState, Iterable[AdapterState]]) – Final adapter state or a list of per chain adapter states. Arrays / buffers associated with the adapter state entries may be recycled to reduce memory usage - if so the corresponding entries will be removed from the adapter state dictionary / dictionaries.

  • chain_states (Union[ChainState, Iterable[ChainState]]) – Final state of chain (or states of chains) in current sampling stage. May be updated in-place if transition parameters altered by adapter require updating any state components.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects will be updated in-place by the method.

  • rngs (Union[Generator, Iterable[Generator]]) – Random number generator for the chain or a list of per-chain random number generators. Used to resample any components of states needing to be updated due to adaptation if required.

initialize(chain_state, transition)

Initialize adapter state prior to starting adaptive transitions.

Parameters:
  • chain_state (ChainState) – Initial chain state adaptive transition will be started from. May be used to calculate initial adapter state but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

Returns:

Initial adapter state.

Return type:

AdapterState

is_fast = True

update(adapt_state, chain_state, trans_stats, transition)

Update adapter state after sampling from transition being adapted.

Parameters:
  • adapt_state (AdapterState) – Current adapter state. Entries will be updated in-place by the method.

  • chain_state (ChainState) – Current chain state following sampling from transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • trans_stats (TransitionStatistics) – Dictionary of statistics associated with transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

class mici.adapters.OnlineCovarianceMetricAdapter(reg_iter_offset=5, reg_scale=0.001)

Bases: Adapter

Dense metric adapter using online covariance estimates.

Uses Welford’s algorithm (Welford, 1962) to stably compute an online estimate of the sample covariance matrix of the chain state position components during sampling. If online estimates are available from multiple independent chains, the final covariance matrix estimate is calculated from the per-chain statistics using a covariance variant due to Schubert and Gertz (2018) of the parallel / batched incremental variance algorithm described by Chan et al. (1979). The covariance matrix estimates are optionally regularized towards a scaled identity matrix, with increasing weight for small number of samples, to decrease the effect of noisy estimates for small sample sizes, following the approach in Stan (Carpenter et al., 2017). The metric matrix representation is set to a dense positive definite matrix corresponding to the inverse of the (regularized) covariance matrix estimate.

References

  1. Welford, B. P. (1962). Note on a method for calculating corrected sums of squares and products. Technometrics, 4(3), pp. 419-420.

  2. Schubert, E. and Gertz, M. (2018). Numerically stable parallel computation of (co-)variance. ACM. p. 10. doi:10.1145/3221269.3223036.

  3. Chan, T. F., Golub, G. H. and LeVeque, R. J. (1979). Updating formulae and a pairwise algorithm for computing sample variances. Technical Report STAN-CS-79-773, Department of Computer Science, Stanford University.

  4. Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. and Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1).

Parameters:
  • reg_iter_offset (int) – Iteration offset used for calculating iteration dependent weighting between regularisation target and current covariance estimate. Higher values cause stronger regularisation during initial iterations.

  • reg_scale (float) – Positive scalar defining value variance estimates are regularized towards.
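As an illustration of the online covariance estimate and its regularization, here is a small pure-Python sketch. The function names are hypothetical and this is not the library's code; in particular the weighting n / (reg_iter_offset + n) is an assumption modelled on the Stan-style scheme referred to above, with a zero offset treated as no regularization.

```python
def welford_cov_update(n, mean, sum_outer, x):
    """Update running count, mean and sum of outer products with point x."""
    n += 1
    # Deviations from the mean before and after including the new point.
    delta_old = [x_i - m_i for x_i, m_i in zip(x, mean)]
    mean = [m_i + d_i / n for m_i, d_i in zip(mean, delta_old)]
    delta_new = [x_i - m_i for x_i, m_i in zip(x, mean)]
    for i, d_i in enumerate(delta_old):
        for j, d_j in enumerate(delta_new):
            sum_outer[i][j] += d_i * d_j
    return n, mean, sum_outer


def regularized_covariance(n, sum_outer, reg_iter_offset=5, reg_scale=0.001):
    """Sample covariance shrunk towards reg_scale times the identity matrix."""
    cov = [[s_ij / (n - 1) for s_ij in row] for row in sum_outer]
    if reg_iter_offset:
        # Weight on the data estimate grows towards one as n increases.
        weight = n / (reg_iter_offset + n)
        for i, row in enumerate(cov):
            for j in range(len(row)):
                row[j] *= weight
            row[i] += reg_scale * (1 - weight)
    return cov


# Three made-up position samples lying exactly on a line.
points = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
n, mean, sum_outer = 0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
for x in points:
    n, mean, sum_outer = welford_cov_update(n, mean, sum_outer, x)
sample_cov = regularized_covariance(n, sum_outer, reg_iter_offset=0)
reg_cov = regularized_covariance(n, sum_outer)
```

Because the three points are collinear, the unregularized sample covariance is singular, whereas the regularized estimate has a positive determinant and so can be inverted to give a dense metric matrix.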

finalize(adapt_states, chain_states, transition, rngs)

Update transition parameters based on final adapter state or states.

Optionally, if multiple adapter states are available, e.g. from a set of independent adaptive chains, then the adaptation information from all the chains may be combined to set the transition parameter(s).

Parameters:
  • adapt_states (Union[AdapterState, Iterable[AdapterState]]) – Final adapter state or a list of per chain adapter states. Arrays / buffers associated with the adapter state entries may be recycled to reduce memory usage - if so the corresponding entries will be removed from the adapter state dictionary / dictionaries.

  • chain_states (Union[ChainState, Iterable[ChainState]]) – Final state of chain (or states of chains) in current sampling stage. May be updated in-place if transition parameters altered by adapter require updating any state components.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects will be updated in-place by the method.

  • rngs (Union[Generator, Iterable[Generator]]) – Random number generator for the chain or a list of per-chain random number generators. Used to resample any components of states needing to be updated due to adaptation if required.

initialize(chain_state, transition)

Initialize adapter state prior to starting adaptive transitions.

Parameters:
  • chain_state (ChainState) – Initial chain state adaptive transition will be started from. May be used to calculate initial adapter state but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

Returns:

Initial adapter state.

Return type:

AdapterState

is_fast = False

update(adapt_state, chain_state, trans_stats, transition)

Update adapter state after sampling from transition being adapted.

Parameters:
  • adapt_state (AdapterState) – Current adapter state. Entries will be updated in-place by the method.

  • chain_state (ChainState) – Current chain state following sampling from transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • trans_stats (TransitionStatistics) – Dictionary of statistics associated with transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

class mici.adapters.OnlineVarianceMetricAdapter(reg_iter_offset=5, reg_scale=0.001)

Bases: Adapter

Diagonal metric adapter using online variance estimates.

Uses Welford’s algorithm (Welford, 1962) to stably compute an online estimate of the sample variances of the chain state position components during sampling. If online estimates are available from multiple independent chains, the final variance estimate is calculated from the per-chain statistics using the parallel / batched incremental variance algorithm described by Chan et al. (1979). The variance estimates are optionally regularized towards a common scalar value, with increasing weight for small number of samples, to decrease the effect of noisy estimates for small sample sizes, following the approach in Stan (Carpenter et al., 2017). The metric matrix representation is set to a diagonal matrix with diagonal elements corresponding to the reciprocal of the (regularized) variance estimates.

References

  1. Welford, B. P. (1962). Note on a method for calculating corrected sums of squares and products. Technometrics, 4(3), pp. 419-420.

  2. Chan, T. F., Golub, G. H. and LeVeque, R. J. (1979). Updating formulae and a pairwise algorithm for computing sample variances. Technical Report STAN-CS-79-773, Department of Computer Science, Stanford University.

  3. Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. and Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1).

Parameters:
  • reg_iter_offset (int) – Iteration offset used for calculating iteration dependent weighting between regularisation target and current covariance estimate. Higher values cause stronger regularisation during initial iterations. A value of zero corresponds to no regularisation; this should only be used if the sample covariance is guaranteed to be positive definite.

  • reg_scale (float) – Positive scalar defining value variance estimates are regularized towards.
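The per-chain variance estimate, the combination of statistics across chains, and the resulting diagonal metric element can be sketched for a single position component as follows. The function names are hypothetical, the code is not taken from the library, and the regularization weighting n / (reg_iter_offset + n) is an assumption based on the Stan-style approach mentioned above; the same steps would apply independently to each component.

```python
def welford_var_update(n, mean, sum_sq_dev, x):
    """Welford update of count, mean and sum of squared deviations."""
    n += 1
    delta = x - mean
    mean += delta / n
    sum_sq_dev += delta * (x - mean)
    return n, mean, sum_sq_dev


def combine_stats(stats_a, stats_b):
    """Merge two (count, mean, sum of squared deviations) triples (Chan et al.)."""
    n_a, mean_a, m2_a = stats_a
    n_b, mean_b, m2_b = stats_b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta**2 * n_a * n_b / n
    return n, mean, m2


def metric_diagonal_element(n, sum_sq_dev, reg_iter_offset=5, reg_scale=0.001):
    """Reciprocal of the (optionally regularized) variance estimate."""
    var = sum_sq_dev / (n - 1)
    if reg_iter_offset:
        weight = n / (reg_iter_offset + n)
        var = weight * var + (1 - weight) * reg_scale
    return 1 / var


def chain_stats(samples):
    # Accumulate statistics for one chain, one sample at a time.
    stats = (0, 0.0, 0.0)
    for x in samples:
        stats = welford_var_update(*stats, x)
    return stats


# Made-up samples of one position component from two independent chains.
stats_a = chain_stats([1.0, 2.0, 3.0])
stats_b = chain_stats([4.0, 5.0, 6.0, 7.0])
n, mean, sum_sq_dev = combine_stats(stats_a, stats_b)
metric_element = metric_diagonal_element(n, sum_sq_dev, reg_iter_offset=0)
```

Merging the per-chain triples gives the same count, mean and sum of squared deviations as a single pass over all seven samples, which is what allows each chain to accumulate its statistics independently before the final update.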

finalize(adapt_states, chain_states, transition, rngs)

Update transition parameters based on final adapter state or states.

Optionally, if multiple adapter states are available, e.g. from a set of independent adaptive chains, then the adaptation information from all the chains may be combined to set the transition parameter(s).

Parameters:
  • adapt_states (Union[AdapterState, Iterable[AdapterState]]) – Final adapter state or a list of per chain adapter states. Arrays / buffers associated with the adapter state entries may be recycled to reduce memory usage - if so the corresponding entries will be removed from the adapter state dictionary / dictionaries.

  • chain_states (Union[ChainState, Iterable[ChainState]]) – Final state of chain (or states of chains) in current sampling stage. May be updated in-place if transition parameters altered by adapter require updating any state components.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects will be updated in-place by the method.

  • rngs (Union[Generator, Iterable[Generator]]) – Random number generator for the chain or a list of per-chain random number generators. Used to resample any components of states needing to be updated due to adaptation if required.

initialize(chain_state, transition)

Initialize adapter state prior to starting adaptive transitions.

Parameters:
  • chain_state (ChainState) – Initial chain state adaptive transition will be started from. May be used to calculate initial adapter state but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

Returns:

Initial adapter state.

Return type:

AdapterState

is_fast = False

update(adapt_state, chain_state, trans_stats, transition)

Update adapter state after sampling from transition being adapted.

Parameters:
  • adapt_state (AdapterState) – Current adapter state. Entries will be updated in-place by the method.

  • chain_state (ChainState) – Current chain state following sampling from transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • trans_stats (TransitionStatistics) – Dictionary of statistics associated with transition being adapted. May be used to calculate adapter state updates but should not be mutated by method.

  • transition (Transition) – Markov transition being adapted. Attributes of the transition or child objects may be updated in-place by the method.

mici.adapters.arithmetic_mean_log_step_size_reducer(log_step_sizes)

Compute arithmetic mean of step sizes from their logs.

Parameters:

log_step_sizes (Collection[float]) – Logarithms of per-chain estimated step sizes.

Returns:

Arithmetic mean of estimated step sizes.

Return type:

float

mici.adapters.default_adapt_stat_func(stats)

Function to extract default statistic used for step-size adaptation.

Parameters:

stats (TransitionStatistics) – Dictionary of transition statistics.

Returns:

Acceptance statistic.

Return type:

float

mici.adapters.geometric_mean_log_step_size_reducer(log_step_sizes)

Compute geometric mean of step sizes from their logs.

Parameters:

log_step_sizes (Collection[float]) – Logarithms of per-chain estimated step sizes.

Returns:

Geometric mean of estimated step sizes.

Return type:

float

mici.adapters.min_log_step_size_reducer(log_step_sizes)

Compute minimum of step sizes from their logs.

Parameters:

log_step_sizes (Collection[float]) – Logarithms of per-chain estimated step sizes.

Returns:

Minimum of estimated step sizes.

Return type:

float
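The difference between the three step size reducers documented in this module can be seen in a short standalone sketch. The functions below are hypothetical re-implementations written from the descriptions above (input: logarithms of per-chain step sizes, output: a step size), not the library's own code, and the per-chain values are made up.

```python
import math


def arithmetic_mean_reducer(log_step_sizes):
    # Mean of the step sizes themselves.
    return sum(math.exp(log_s) for log_s in log_step_sizes) / len(log_step_sizes)


def geometric_mean_reducer(log_step_sizes):
    # Exponential of the mean of the log step sizes.
    return math.exp(sum(log_step_sizes) / len(log_step_sizes))


def min_reducer(log_step_sizes):
    # Smallest per-chain step size.
    return math.exp(min(log_step_sizes))


# Made-up per-chain step size estimates of 0.1 and 0.4.
log_step_sizes = [math.log(0.1), math.log(0.4)]
arithmetic = arithmetic_mean_reducer(log_step_sizes)
geometric = geometric_mean_reducer(log_step_sizes)
smallest = min_reducer(log_step_sizes)
```

For any inputs the results are ordered minimum ≤ geometric mean ≤ arithmetic mean, so the minimum is the most conservative choice and the arithmetic mean the least.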