mici.interop module

Utilities for interfacing with external probabilistic programming libraries.

mici.interop.construct_pymc_model_functions(model)

Construct functions for sampling from PyMC model using Mici.

Parameters:

model (Model) – PyMC model to construct functions for.

Returns:

Tuple (neg_log_dens, grad_neg_log_dens, trace_func), where neg_log_dens is a function evaluating the negative logarithm of the unnormalized posterior density associated with model, grad_neg_log_dens is a function evaluating the gradient of neg_log_dens with respect to its position array argument, and trace_func is a function which extracts model parameter values from the chain state for tracing during sampling.

Return type:

tuple[ScalarFunction, GradientFunction, TraceFunction]
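
To make the shape of the returned triple concrete, the sketch below hand-writes three stand-in functions for a standard normal target. It does not call Mici or PyMC. The `StandInState` class, its `pos` attribute, the variable name `x`, and the use of plain Python lists in place of NumPy arrays are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StandInState:
    """Hypothetical stand-in for a chain state; only the position is modelled."""

    pos: list


def neg_log_dens(pos):
    """Negative log density of a standard normal, up to an additive constant."""
    return 0.5 * sum(x * x for x in pos)


def grad_neg_log_dens(pos):
    """Gradient of neg_log_dens with respect to the position."""
    return [x for x in pos]


def trace_func(state):
    """Map a chain state to a dictionary of named values to record."""
    return {"x": list(state.pos)}


state = StandInState(pos=[0.5, -1.0])
print(neg_log_dens(state.pos))  # 0.625
print(grad_neg_log_dens(state.pos))
print(trace_func(state))
```

The functions returned for a real PyMC model play the same three roles, but operate on the model's own position array rather than on a list.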

mici.interop.construct_stan_model_functions(model)

Construct functions for sampling from Stan model using Mici.

Parameters:

model (stan.Model) – Stan model to construct functions for.

Returns:

Tuple (neg_log_dens, grad_neg_log_dens, trace_func), where neg_log_dens is a function evaluating the negative logarithm of the unnormalized posterior density associated with model, grad_neg_log_dens is a function evaluating the gradient of neg_log_dens with respect to its position array argument, and trace_func is a function which extracts model parameter values from the chain state for tracing during sampling.

Return type:

tuple[ScalarFunction, GradientFunction, TraceFunction]
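
A common sanity check on a density and gradient pair is to compare the gradient against central finite differences. The sketch below is one way to do that in plain Python. The helper names and the toy Gaussian functions are hypothetical stand-ins, not output of construct_stan_model_functions, and the tuple handling is a precaution in case the gradient function returns a (gradient, value) pair.

```python
def finite_difference_grad(func, pos, eps=1e-6):
    """Central finite-difference approximation to the gradient of func at pos."""
    grad = []
    for i in range(len(pos)):
        forward = list(pos)
        backward = list(pos)
        forward[i] += eps
        backward[i] -= eps
        grad.append((func(forward) - func(backward)) / (2 * eps))
    return grad


def max_abs_grad_error(neg_log_dens, grad_neg_log_dens, pos):
    """Largest absolute gap between the supplied and finite-difference gradients."""
    exact = grad_neg_log_dens(pos)
    if isinstance(exact, tuple):
        # Precaution: keep only the gradient from a (gradient, value) pair.
        exact = exact[0]
    approx = finite_difference_grad(neg_log_dens, pos)
    return max(abs(a - e) for a, e in zip(approx, exact))


# Toy stand-ins: an isotropic Gaussian with standard deviation 2.
def toy_neg_log_dens(pos):
    return sum(x * x for x in pos) / 8.0


def toy_grad_neg_log_dens(pos):
    return [x / 4.0 for x in pos]


error = max_abs_grad_error(toy_neg_log_dens, toy_grad_neg_log_dens, [0.3, -1.2, 2.0])
print(error)
```

A gap far above the finite-difference rounding error usually indicates that the two functions disagree about the target density.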

mici.interop.convert_to_inference_data(traces, stats, energy_key='energy', lp_key='lp')

Convert Mici sample_chains output to arviz.InferenceData.

Parameters:
  • traces (dict[str, list[ArrayLike]]) – Traces output from a Mici mici.samplers.MarkovChainMonteCarloMethod.sample_chains() call. A dictionary of variables traced over sampled chains, with the dictionary keys being the variable names and the values being lists of arrays, one array per sampled chain. The first dimension of each array corresponds to the draw index and any remaining dimensions correspond to the variable dimensions.

  • stats (dict[str, list[ArrayLike]]) – Statistics output from a Mici sample_chains call. A dictionary of chain statistics traced over sampled chains, with the dictionary keys being the statistic names and the values being lists of arrays, one array per sampled chain, with the array dimension corresponding to the draw index.

  • energy_key (Optional[str]) – The key of an entry in the traces dictionary corresponding to the value of the Hamiltonian energy for the accepted proposal (up to an additive constant). If present, the corresponding values will be added to the sample_stats group of the returned InferenceData object.

  • lp_key (Optional[str]) – The key of an entry in the traces dictionary corresponding to the value of the joint log posterior density for the model (up to an additive constant). If present, the corresponding values will be added to the sample_stats group of the returned InferenceData object.

Returns:

ArviZ inference data object with traced chain data stored in the posterior group and additional chain statistics in the sample_stats group.

Return type:

InferenceData
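
The per-chain layout described above (one array per chain, draw index first) can be checked before conversion. The sketch below is a hedged illustration: `check_chain_layout` and `to_inference_data` are hypothetical helpers, the variable name `theta` and statistic name `accept_stat` are made up for the example, and nested lists stand in for the NumPy arrays that a real sample_chains call would produce. The Mici import is deferred so that the check itself runs without Mici or ArviZ installed.

```python
def check_chain_layout(traces, stats):
    """Check that every entry holds one array per chain with matching draw counts.

    Returns a list with the number of draws in each chain.
    """
    draws_per_chain = None
    for group_name, group in (("traces", traces), ("stats", stats)):
        for key, chain_arrays in group.items():
            # len() gives the size of the first (draw) dimension of each array.
            counts = [len(array) for array in chain_arrays]
            if draws_per_chain is None:
                draws_per_chain = counts
            elif counts != draws_per_chain:
                raise ValueError(
                    f"{group_name}[{key!r}] has draws {counts}, "
                    f"expected {draws_per_chain}"
                )
    return draws_per_chain


def to_inference_data(traces, stats):
    """Validate the layout, then convert (requires mici and arviz installed)."""
    import mici.interop  # deferred so the layout check works without Mici

    check_chain_layout(traces, stats)
    return mici.interop.convert_to_inference_data(traces, stats)


# Two chains of three draws each; "theta" is a length-2 vector.
traces = {
    "theta": [
        [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1]],
        [[0.4, 0.1], [0.3, 0.3], [0.5, 0.0]],
    ]
}
stats = {"accept_stat": [[0.9, 0.8, 1.0], [0.7, 0.95, 0.85]]}
print(check_chain_layout(traces, stats))  # [3, 3]
```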

mici.interop.get_stan_model_unconstrained_param_dim(model)

Get total dimension of unconstrained parameters in Stan model.

Parameters:

model (stan.Model) – Stan model to get dimension for.

Returns:

Non-negative integer specifying unconstrained parameter dimension.

Return type:

int
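
The unconstrained dimension can be smaller than the number of declared parameter elements, because Stan maps constrained types onto an unconstrained space with fewer degrees of freedom. The arithmetic below illustrates this for a few Stan types. The `unconstrained_size` helper and its `kind` labels are hypothetical and are not part of Mici or Stan.

```python
def unconstrained_size(kind, k):
    """Number of unconstrained reals behind a size-k Stan parameter of a given kind."""
    if kind == "vector":
        return k  # element-wise transforms (for example bounds) keep the count
    if kind == "simplex":
        return k - 1  # entries sum to one, removing one degree of freedom
    if kind == "cov_matrix":
        return k * (k + 1) // 2  # symmetric positive-definite k x k matrix
    if kind == "corr_matrix":
        return k * (k - 1) // 2  # symmetric with a fixed unit diagonal
    raise ValueError(f"unknown kind: {kind!r}")


declared = [("vector", 3), ("simplex", 4), ("cov_matrix", 3)]
total = sum(unconstrained_size(kind, k) for kind, k in declared)
print(total)  # 3 + 3 + 6 = 12
```

Here the model declares 3 + 4 + 9 = 16 parameter elements, while the sampler works with a position array of length 12.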

mici.interop.sample_pymc_model(draws=1000, *, tune=1000, chains=None, cores=None, random_seed=None, progressbar=True, init='auto', jitter_max_retries=10, return_inferencedata=False, model=None, target_accept=0.8, max_treedepth=10)

Generate approximate samples from posterior defined by a PyMC model.

Uses dynamic multinomial HMC algorithm in Mici with adaptive warm-up phase.

This function replicates the interface of the pymc.sample() function so that it can be used as a (partial) drop-in replacement.

Parameters:
  • draws (int) – The number of samples to draw.

  • tune (int) – Number of iterations to tune, defaults to 1000. Samplers adjust the step sizes, scalings or similar during tuning. Tuning samples will be drawn in addition to the number specified in the draws argument, and will be discarded.

  • chains (Optional[int]) – The number of chains to sample. Running independent chains is important for some convergence statistics and can also reveal multiple modes in the posterior. If None, then set to either cores or 2, whichever is larger.

  • cores (Optional[int]) – The number of chains to run in parallel. If None, set to the number of CPU cores in the system, but at most 4.

  • random_seed (Optional[int]) – Seed for NumPy random number generator used for generating random variables while sampling chains. If None then generator will be seeded with entropy from operating system.

  • progressbar (bool) – Whether or not to display a progress bar.

  • init (Literal['auto', 'adapt_diag', 'jitter+adapt_diag', 'adapt_full']) –

    Initialization method to use. One of:

    • "adapt_diag": Start with an identity mass matrix and then adapt a diagonal based on the variance of the tuning samples. All chains use the test value (usually the prior mean) as starting point.

    • "jitter+adapt_diag": Same as "adapt_diag", but add uniform jitter in [-1, 1] to the starting point in each chain. Also chosen if init="auto".

    • "adapt_full": Adapt a dense mass matrix using the sample covariances.

    • "jitter+adapt_full": Same as "adapt_full", but add uniform jitter in [-1, 1] to the starting point in each chain.

  • jitter_max_retries (int) – Maximum number of repeated attempts (per chain) at creating an initial matrix with uniform jitter that yields a finite probability. This applies to the "jitter+adapt_diag" and "jitter+adapt_full" init methods.

  • return_inferencedata (bool) – Whether to return the traces as an arviz.InferenceData (True) object or a dict (False).

  • model (Optional[Model]) – PyMC model defining posterior distribution to sample from. May be None if function is called from within model context manager.

  • target_accept (float) – Target value for the acceptance statistic being controlled during adaptive warm-up.

  • max_treedepth (int) – Maximum depth to expand trajectory binary tree to in integrator transition. The maximum number of integrator steps corresponds to 2**max_treedepth.
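
The jittered initialization described for the init and jitter_max_retries parameters can be sketched as a retry loop. This is a minimal illustration under stated assumptions, not the code Mici or PyMC runs: `jittered_start`, the toy target, the finiteness test on the log density, and the `RuntimeError` raised on failure are all hypothetical.

```python
import math
import random


def jittered_start(start, log_dens, rng, max_retries=10):
    """Add uniform jitter in [-1, 1] to start, retrying until log_dens is finite."""
    for _ in range(max_retries):
        candidate = [x + rng.uniform(-1.0, 1.0) for x in start]
        if math.isfinite(log_dens(candidate)):
            return candidate
    raise RuntimeError(
        f"no finite log density found after {max_retries} jitter attempts"
    )


def toy_log_dens(pos):
    """Toy target supported on positive values only (independent Exponential(1))."""
    if any(x <= 0 for x in pos):
        return -math.inf
    return -sum(pos)


rng = random.Random(1234)
start = jittered_start([0.9, 0.9], toy_log_dens, rng)
print(start)
```

Each chain would draw its own jittered point, so a separate retry budget applies per chain.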

Returns:

A dictionary or arviz.InferenceData object containing the sampled chain output. Dictionary output (when return_inferencedata=False) has string keys corresponding to the name of each traced variable in the model, with the values being the corresponding values of the variables traced across the chains as NumPy arrays, with the first dimension the chain index (of size equal to chains), the second dimension the draw index (of size equal to draws) and any remaining dimensions corresponding to the dimensions of the traced variable. If return_inferencedata=True an arviz.InferenceData object is instead returned with traced chain data stored in the posterior group and additional chain statistics in the sample_stats group.

Return type:

Union[InferenceData, dict[str, ArrayLike]]
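
A hedged usage sketch follows. The toy model, its variable names, and the helper functions are assumptions made for illustration. The keyword arguments passed to sample_pymc_model are taken from the signature above, and both pymc and mici must be installed for `sample_toy_model` to run, which is why the imports are deferred to the function body.

```python
def sample_toy_model(observations, draws=500, tune=500, chains=2, seed=1234):
    """Sample a normal location-scale model with Mici (needs pymc and mici)."""
    import pymc
    import mici.interop

    with pymc.Model():
        mu = pymc.Normal("mu", mu=0.0, sigma=10.0)
        sigma = pymc.HalfNormal("sigma", sigma=5.0)
        pymc.Normal("y", mu=mu, sigma=sigma, observed=observations)
        # The model argument is left as None because the call is made
        # inside the model context manager.
        return mici.interop.sample_pymc_model(
            draws=draws,
            tune=tune,
            chains=chains,
            random_seed=seed,
            return_inferencedata=False,
        )


def expected_trace_shape(chains, draws, variable_shape=()):
    """Shape of one dictionary entry as documented: (chains, draws, *variable dims)."""
    return (chains, draws, *variable_shape)


print(expected_trace_shape(2, 500))  # (2, 500)
print(expected_trace_shape(4, 1000, (3,)))  # (4, 1000, 3)
```

With return_inferencedata=False, the entry for a scalar variable such as mu would be expected to have the shape given by `expected_trace_shape(chains, draws)`.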

mici.interop.sample_stan_model(model_code, data, *, num_samples=1000, num_warmup=1000, num_chains=4, save_warmup=False, metric='diag_e', stepsize=1.0, adapt_engaged=True, delta=0.8, gamma=0.05, kappa=0.75, t0=10, init_buffer=75, term_buffer=50, window=25, max_depth=10, seed=None, return_inferencedata=False)

Generate approximate samples from posterior defined by a Stan model.

Uses dynamic multinomial HMC algorithm in Mici with adaptive warm-up phase.

This function follows a similar argument naming scheme to the PyStan stan.model.Model.sample() method (which itself follows CmdStan) so that it can be used as a (partial) drop-in replacement.

Parameters:
  • model_code (str) – Stan program code describing a Stan model.

  • data (dict) – A Python dictionary or mapping providing the data for the model. Variable names are the keys and the values are their associated values.

  • num_samples (int) – A non-negative integer specifying the number of non-warm-up iterations per chain.

  • num_warmup (int) – A non-negative integer specifying the number of warm-up iterations per chain.

  • num_chains (int) – A positive integer specifying the number of Markov chains.

  • save_warmup (bool) – Whether to save warm-up chain data (True) or not (False).

  • metric (Literal['unit_e', 'diag_e', 'dense_e']) – String specifying metric type. One of "unit_e", "diag_e" or "dense_e", indicating respectively a fixed identity matrix metric representation, a diagonal metric matrix representation adapted based on estimates of the marginal posterior variances, or a dense metric matrix representation based on estimates of the posterior covariance matrix.

  • stepsize (float) – Initial integrator step size.

  • adapt_engaged (bool) – Whether adaptation is engaged (True) or not (False).

  • delta (float) – Adaptation target acceptance statistic.

  • gamma (float) – Adaptation regularization scale.

  • kappa (float) – Adaptation relaxation exponent.

  • t0 (int) – Adaptation iteration offset.

  • init_buffer (int) – Width of initial fast adaptation interval.

  • term_buffer (int) – Width of final fast adaptation interval.

  • window (int) – Initial width of slow adaptation interval.

  • max_depth (int) – Maximum depth of binary trajectory tree.

  • seed (Optional[int]) – Seed for NumPy random number generator used for generating random variables while sampling chains. If None then generator will be seeded with entropy from operating system.

  • return_inferencedata (bool) – Whether to return the traces as an arviz.InferenceData (True) object or a dict (False).
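
The roles of stepsize, delta, gamma, kappa and t0 can be illustrated with the standard dual-averaging step-size scheme from the NUTS paper, which Stan uses for adaptation. The sketch below implements that textbook recursion in plain Python. Whether Mici's adapter matches it term for term is an assumption, and `dual_averaging_step_size` is a hypothetical helper.

```python
import math


def dual_averaging_step_size(
    accept_stats, stepsize=1.0, delta=0.8, gamma=0.05, kappa=0.75, t0=10
):
    """Run dual averaging over a sequence of acceptance statistics.

    Returns the averaged step size after the final iteration.
    """
    mu = math.log(10 * stepsize)  # shrinkage target, biased towards larger steps
    h_bar = 0.0  # weighted running average of (delta - accept_stat)
    log_step_bar = 0.0  # averaged log step size
    for m, accept_stat in enumerate(accept_stats, start=1):
        weight = 1.0 / (m + t0)  # t0 damps the influence of early iterations
        h_bar = (1 - weight) * h_bar + weight * (delta - accept_stat)
        # gamma scales how strongly the iterate is pulled back towards mu.
        log_step = mu - math.sqrt(m) / gamma * h_bar
        eta = m ** -kappa  # kappa sets how quickly the averaging weights decay
        log_step_bar = eta * log_step + (1 - eta) * log_step_bar
    return math.exp(log_step_bar)


# Acceptance above the target pushes the step size up, below pushes it down.
print(dual_averaging_step_size([0.95] * 50) > 1.0)  # True
print(dual_averaging_step_size([0.30] * 50) < 1.0)  # True
```

In a real warm-up run the acceptance statistics depend on the current step size, so the recursion settles near a step size whose average acceptance statistic matches delta.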

Returns:

A dictionary or ArviZ InferenceData object containing the sampled chain output. Dictionary output (when return_inferencedata=False) has string keys corresponding to the name of each traced variable in the model, with the values being the corresponding values of the variables traced across the chains as NumPy arrays, with the first dimension the flattened draw index across all chains (of size equal to num_chains * num_samples) and any remaining dimensions corresponding to the dimensions of the traced variable. If return_inferencedata=True an ArviZ InferenceData object is instead returned with traced chain data stored in the posterior group and additional chain statistics in the sample_stats group.

Return type:

Union[InferenceData, dict[str, ArrayLike]]
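
A hedged usage sketch follows. The Stan program, the data values, and the helper functions are assumptions made for illustration. The keyword arguments are taken from the signature above, and mici together with a PyStan installation are needed for `sample_toy_stan_model` to run, which is why the import is deferred to the function body.

```python
MODEL_CODE = """
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mu ~ normal(0, 10);
  sigma ~ normal(0, 5);
  y ~ normal(mu, sigma);
}
"""


def sample_toy_stan_model(y, num_samples=500, num_warmup=500, num_chains=2, seed=1234):
    """Sample the toy model with Mici (needs mici and PyStan installed)."""
    import mici.interop

    data = {"N": len(y), "y": list(y)}
    return mici.interop.sample_stan_model(
        MODEL_CODE,
        data,
        num_samples=num_samples,
        num_warmup=num_warmup,
        num_chains=num_chains,
        seed=seed,
        return_inferencedata=False,
    )


def expected_flat_shape(num_chains, num_samples, variable_shape=()):
    """Shape of one dictionary entry as documented: draws flattened across chains."""
    return (num_chains * num_samples, *variable_shape)


print(expected_flat_shape(2, 500))  # (1000,)
```

Unlike the dictionary returned by sample_pymc_model, the first dimension here pools the draws from all chains, so per-chain diagnostics are easier to obtain with return_inferencedata=True.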