API reference#

Note

Similarly to xarray, ArviZ aims for idempotent functions. However, there are two important caveats. First and foremost, several functions require combining data from multiple groups or variables, in which case the output won’t keep the type of the input data.

Moreover, ArviZ works on data following the InferenceData schema. Some functions can accept PPL outputs directly, but in that case the first step is converting those outputs to InferenceData. Consequently, the output won’t be of the same type as the input.

As indicated in Installation, the recommended way to install arviz-stats is with xarray and arviz-base as optional dependencies. This activates most of the features of the library, and the bulk of this reference guide documents them. If you only installed the minimal version of arviz-stats, you should jump to Array facing functions and read from there onwards.

Top level functions#

Diagnostics#

arviz_stats.ess(data[, sample_dims, group, ...])

Estimate the effective sample size (ESS).

arviz_stats.loo_pit(data[, var_names, ...])

Compute leave-one-out (PSIS-LOO) probability integral transform (PIT) values.

arviz_stats.mcse(data[, sample_dims, group, ...])

Calculate the Markov chain standard error (MCSE) statistic.

arviz_stats.psense(data[, var_names, ...])

Compute power-scaling sensitivity values.

arviz_stats.psense_summary(data[, ...])

Compute the prior/likelihood sensitivity based on power-scaling perturbations.

arviz_stats.rhat(data[, sample_dims, group, ...])

Compute estimate of rank-normalized split R-hat for a set of traces.

arviz_stats.rhat_nested(data[, sample_dims, ...])

Compute nested R-hat.
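As an illustration of what these convergence diagnostics measure, here is a simplified split R-hat in plain NumPy. It omits the rank normalization that arviz_stats.rhat applies, so it is a sketch of the idea, not the library implementation:

```python
import numpy as np

def split_rhat(samples):
    """Simplified split R-hat (no rank normalization); for illustration only."""
    chains, draws = samples.shape
    half = draws // 2
    # Split each chain in half so within-chain trends show up as extra chains
    split = samples[:, : 2 * half].reshape(2 * chains, half)
    n = split.shape[1]
    chain_means = split.mean(axis=1)
    between = n * chain_means.var(ddof=1)      # between-chain variance B
    within = split.var(axis=1, ddof=1).mean()  # within-chain variance W
    var_hat = (n - 1) / n * within + between / n
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(0)
good = rng.normal(size=(4, 1000))      # well-mixed chains: R-hat near 1
bad = good + np.arange(4.0)[:, None]   # chains centered at different values: R-hat well above 1
```

Values close to 1 indicate the chains are sampling from the same distribution; the shifted chains in `bad` inflate the between-chain variance and push R-hat up.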

Statistical summaries#

arviz_stats.bayes_factor(data, var_names[, ...])

Compute Bayes factor using Savage–Dickey ratio.

arviz_stats.ci_in_rope(data, rope[, ...])

Compute the percentage of a credible interval that falls within a ROPE.

arviz_stats.ecdf(data[, dim, group, ...])

Compute the marginal empirical cumulative distribution functions (ECDF).

arviz_stats.eti(data[, prob, dim, group, ...])

Compute the equal tail interval (ETI) given a probability.

arviz_stats.hdi(data[, prob, dim, group, ...])

Compute the highest density interval (HDI) given a probability.

arviz_stats.histogram(data[, dim, group, ...])

Compute the batched histogram.

arviz_stats.kde(data[, dim, group, ...])

Compute the marginal kernel density estimates (KDE).

arviz_stats.kl_divergence(data1, data2[, ...])

Compute the Kullback-Leibler (KL) divergence.

arviz_stats.loo_expectations(data[, ...])

Compute weighted expectations using the PSIS-LOO-CV method.

arviz_stats.loo_metrics(data[, kind, ...])

Compute predictive metrics using the PSIS-LOO-CV method.

arviz_stats.metrics(data[, kind, var_name, ...])

Compute performance metrics.

arviz_stats.mode(data[, dim, group, ...])

Compute the mode.

arviz_stats.qds(data[, dim, group, ...])

Compute the marginal quantile dots.

arviz_stats.r2_score(data[, var_name, ...])

R² for Bayesian regression models.

arviz_stats.summary(data[, var_names, ...])

Create a data frame with summary statistics and/or diagnostics.

arviz_stats.wasserstein(data1, data2[, ...])

Compute the Wasserstein-1 distance.
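To make the difference between the two interval summaries above concrete, here is a minimal NumPy sketch of what eti and hdi compute. This is the underlying idea only, not the library implementation (which handles multidimensional inputs and more):

```python
import numpy as np

def eti(samples, prob=0.94):
    """Equal-tail interval: cut (1 - prob)/2 tail mass from each side."""
    tail = (1 - prob) / 2
    return np.quantile(samples, [tail, 1 - tail])

def hdi(samples, prob=0.94):
    """Highest density interval: the narrowest window holding `prob` mass."""
    sorted_ = np.sort(samples)
    n_window = int(np.floor(prob * len(sorted_)))
    # Width of every contiguous window containing n_window samples
    widths = sorted_[n_window:] - sorted_[: len(sorted_) - n_window]
    start = int(np.argmin(widths))
    return np.array([sorted_[start], sorted_[start + n_window]])

rng = np.random.default_rng(1)
skewed = rng.gamma(shape=2.0, scale=1.0, size=10_000)
eti_bounds = eti(skewed)
hdi_bounds = hdi(skewed)
# For skewed samples the HDI shifts toward the mode and is never wider than the ETI
```

For symmetric distributions the two intervals roughly coincide; the skewed gamma samples above show where they diverge.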

Model comparison#

arviz_stats.compare(compare_dict[, method, ...])

Compare models based on their expected log pointwise predictive density (ELPD).

arviz_stats.loo(data[, pointwise, var_name, ...])

Compute Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO-CV).

arviz_stats.loo_i(i, data[, var_name, reff, ...])

Compute PSIS-LOO-CV for a single observation.

arviz_stats.loo_approximate_posterior(data, ...)

Compute PSIS-LOO-CV for approximate posteriors.

arviz_stats.loo_kfold(data, wrapper[, ...])

Perform exact K-fold cross-validation.

arviz_stats.loo_moment_match(data, loo_orig, ...)

Compute moment matching for problematic observations in PSIS-LOO-CV.

arviz_stats.loo_subsample(data, observations)

Compute PSIS-LOO-CV using sub-sampling.

arviz_stats.reloo(wrapper[, loo_orig, ...])

Recalculate exact leave-one-out cross-validation, refitting where the approximation fails.

arviz_stats.update_subsample(loo_orig, data)

Update a sub-sampled PSIS-LOO-CV object with new observations.
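To make the quantity these functions estimate concrete, here is a naive importance-sampling LOO sketch in plain NumPy. It uses raw importance weights with no Pareto smoothing, so it is numerically fragile and for illustration only; use arviz_stats.loo in practice:

```python
import numpy as np

def logmeanexp(x):
    """Numerically stable log of the mean of exp(x)."""
    m = x.max()
    return m + np.log(np.mean(np.exp(x - m)))

def naive_loo_elpd(log_lik):
    """Naive IS-LOO: 1 / p(y_i | y_-i) is approximated by mean_s 1 / p(y_i | theta_s).

    log_lik has shape (posterior draws, observations) and holds the
    pointwise log-likelihood values log p(y_i | theta_s).
    """
    pointwise = np.array(
        [-logmeanexp(-log_lik[:, i]) for i in range(log_lik.shape[1])]
    )
    return float(pointwise.sum())

rng = np.random.default_rng(0)
y = rng.normal(size=20)                        # observed data
theta = rng.normal(y.mean(), 0.25, size=2000)  # stand-in posterior draws for the mean
# Pointwise log-likelihood under a normal(theta, 1) model, shape (draws, obs)
log_lik = -0.5 * (y[None, :] - theta[:, None]) ** 2 - 0.5 * np.log(2 * np.pi)
elpd = naive_loo_elpd(log_lik)
```

PSIS-LOO replaces the raw importance weights with Pareto-smoothed ones, which is what makes the estimate reliable and provides the Pareto-k diagnostic used by loo_moment_match and reloo.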

Other#

arviz_stats.SamplingWrapper(model[, ...])

Class wrapping sampling routines for use via ArviZ.

arviz_stats.thin(data[, sample_dims, group, ...])

Perform thinning.
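As a sketch of the core operation behind thin (leaving out its automatic selection of the thinning factor), thinning an array of draws amounts to strided slicing along the draw dimension:

```python
import numpy as np

draws = np.arange(1000).reshape(4, 250)  # shape: (chain, draw)
thinned = draws[:, ::5]                  # keep every 5th draw per chain
```

This reduces autocorrelation between the retained draws at the cost of discarding samples.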

Accessors#

In addition, many functions are also available via accessors:

Dataset accessors#

xarray.Dataset.azstats.ds

Return the underlying Dataset.

xarray.Dataset.azstats.filter_vars([...])

Filter variables in the dataset.

xarray.Dataset.azstats.eti([prob, dim])

Compute the equal tail interval.

xarray.Dataset.azstats.hdi([prob, dim])

Compute hdi on all variables in the dataset.

xarray.Dataset.azstats.compute_ranks([dim, ...])

Compute ranks for all variables in the dataset.

xarray.Dataset.azstats.ess([sample_dims, ...])

Compute the ess of all the variables in the dataset.

xarray.Dataset.azstats.rhat([sample_dims, ...])

Compute the rhat of all the variables in the dataset.

xarray.Dataset.azstats.rhat_nested([...])

Compute nested rhat of all the variables in the dataset.

xarray.Dataset.azstats.mcse([sample_dims, ...])

Compute the mcse of all the variables in the dataset.

xarray.Dataset.azstats.thin([sample_dims, ...])

Perform thinning for all the variables in the dataset.

xarray.Dataset.azstats.kde([dim])

Compute the KDE for all variables in the dataset.

xarray.Dataset.azstats.histogram([dim])

Compute the histogram for all variables in the dataset.

xarray.Dataset.azstats.ecdf([dim, pit])

Compute the ecdf for all variables in the dataset.

xarray.Dataset.azstats.autocorr([dim])

Compute autocorrelation for all variables in the dataset.

DataArray facing functions#

Base submodule#

arviz_stats.base.dataarray_stats.eti(da[, ...])

Compute eti on DataArray input.

arviz_stats.base.dataarray_stats.hdi(da[, ...])

Compute hdi on DataArray input.

arviz_stats.base.dataarray_stats.ess(da[, ...])

Compute ess on DataArray input.

arviz_stats.base.dataarray_stats.rhat(da[, ...])

Compute rhat on DataArray input.

arviz_stats.base.dataarray_stats.rhat_nested(da, ...)

Compute nested rhat on DataArray input.

arviz_stats.base.dataarray_stats.mcse(da[, ...])

Compute mcse on DataArray input.

arviz_stats.base.dataarray_stats.histogram(da)

Compute histogram on DataArray input.

arviz_stats.base.dataarray_stats.kde(da[, ...])

Compute kde on DataArray input.

arviz_stats.base.dataarray_stats.autocorr(da)

Compute autocorrelation on DataArray input.
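The autocorr method computes the autocorrelation at every lag via the FFT (the Wiener–Khinchin approach). A minimal NumPy sketch of the technique, not the library implementation:

```python
import numpy as np

def autocorr_fft(x):
    """Autocorrelation at every lag for a 1-D array, computed via FFT."""
    n = len(x)
    x = x - x.mean()
    # Zero-pad to 2n so the circular convolution doesn't wrap around
    f = np.fft.rfft(x, n=2 * n)
    acov = np.fft.irfft(f * np.conj(f))[:n] / n  # autocovariance, lags 0..n-1
    return acov / acov[0]                        # normalize by the variance

rng = np.random.default_rng(2)
white = rng.normal(size=4096)
rho = autocorr_fft(white)  # lag 0 is exactly 1; other lags near 0 for white noise
```

The FFT route costs O(n log n) for all lags at once, versus O(n²) for the direct sum.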

Numba submodule#

The numba-accelerated computations are available as the same methods, but on the arviz_stats.numba.dataarray_stats class. Both their API and implementation are the same as for the base module; the only difference is that one calls arviz_stats.base.array_stats for array facing functions whereas the other calls arviz_stats.numba.array_stats.

Implementation differences are thus documented below, at the array facing classes.

Array facing functions#

All functions and methods described after this point work when installing arviz-stats without optional dependencies.

Warning

Keep in mind this is not the recommended install; the main target of these functions is developers of other libraries who want to use ArviZ while keeping their dependency list small.

The documentation here is more bare-bones than for other functions and will often refer you to other pages for full argument or algorithm descriptions.

Base submodule#

Sampling diagnostics#

arviz_stats.base.array_stats.ess(ary[, ...])

Compute ess on array-like inputs.

arviz_stats.base.array_stats.pareto_min_ss(ary)

Compute minimum effective sample size.

arviz_stats.base.array_stats.rhat(ary[, ...])

Compute rhat on array-like inputs.

arviz_stats.base.array_stats.rhat_nested(...)

Compute nested rhat on array-like inputs.

arviz_stats.base.array_stats.mcse(ary[, ...])

Compute mcse on array-like inputs.

Statistical summaries#

arviz_stats.base.array_stats.eti(ary, prob)

Compute the equal tail interval (ETI) of an array of samples.

arviz_stats.base.array_stats.hdi(ary, prob)

Compute highest density interval (HDI) on an array of samples.

arviz_stats.base.array_stats.histogram(ary)

Compute histogram over provided axis.

arviz_stats.base.array_stats.kde(ary[, ...])

Compute kde on array-like inputs.

arviz_stats.base.array_stats.quantile(ary, ...)

Compute the quantile of an array of samples.

Other#

arviz_stats.base.array_stats.autocorr(ary[, ...])

Compute autocorrelation using FFT for every lag for the input array.

arviz_stats.base.array_stats.autocov(ary[, axis])

Compute autocovariance estimates for every lag for the input array.

arviz_stats.base.array_stats.compute_ranks(ary)

Compute ranks of MCMC samples.

arviz_stats.base.array_stats.get_bins(ary[, ...])

Compute default bins.

arviz_stats.base.array_stats.psislw(ary[, ...])

Compute log weights for Pareto-smoothed importance sampling (PSIS) method.
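Several of these helpers are small array transforms. For instance, ranking MCMC samples, as compute_ranks does, can be sketched as follows (ignoring the tie handling a full implementation would need):

```python
import numpy as np

def compute_ranks(ary):
    """Rank samples 1..N over all dimensions jointly, keeping the input shape."""
    ranks = np.empty(ary.size, dtype=float)
    # Positions in the flat sorted order become 1-based ranks
    ranks[np.argsort(ary, axis=None)] = np.arange(1, ary.size + 1)
    return ranks.reshape(ary.shape)

samples = np.array([[0.3, 0.1], [0.2, 0.4]])  # e.g. (chain, draw)
ranked = compute_ranks(samples)
```

Rank transforms like this underpin the rank-normalized R-hat and ESS variants, which apply the diagnostics to the ranks instead of the raw values.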

Numba submodule#

Some functions are accelerated internally without changes to the public API, others are inherited unchanged from the base backend, and a final group is partially or completely reimplemented. This last group is documented here:

arviz_stats.numba.array_stats.quantile(ary, ...)

Compute the quantile.

arviz_stats.numba.array_stats.histogram(ary)

Compute histogram over provided axis.

arviz_stats.numba.array_stats.kde(ary[, ...])

Compute the guvectorized kde.