arviz_stats.hdi
- arviz_stats.hdi(data, prob=None, dim=None, group='posterior', var_names=None, filter_vars=None, coords=None, method='nearest', circular=False, max_modes=10, skipna=False, **kwargs)
Compute the highest density interval (HDI) given a probability.
The HDI is the shortest interval that contains the specified probability mass.
- Parameters:
- data : array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like
Input data. The pre-processing applied to it depends on its type:
- array-like: the array layer within arviz-stats is called.
- xarray object: a dimension-aware function is applied to all relevant subsets.
- others: passed to arviz_base.convert_to_dataset, then treated as an xarray.Dataset. This option is discouraged because the conversion is fully automated and has to be repeated in every later execution or call to similar functions. It is recommended to perform the conversion manually first and then call arviz_stats.hdi. This allows controlling the conversion step and inspecting its results.
- prob : float, optional
Probability for the credible interval. Defaults to rcParams["stats.ci_prob"].
- dim : sequence of hashable, optional
Dimensions to be reduced when computing the HDI. Defaults to rcParams["data.sample_dims"].
- group : hashable, default “posterior”
Group on which to compute the HDI.
- var_names : str or list of str, optional
Names of the variables for which the HDI should be computed.
- filter_vars : {None, “like”, “regex”}, default None
- coords : dict, optional
Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.
- method : str, default “nearest”
Valid options are “nearest”, “multimodal” or “multimodal_sample”.
- circular : bool, default False
Whether to treat the data as circular variables (in the range [-np.pi, np.pi]) when computing the HDI. Defaults to False (i.e. non-circular variables).
- max_modes : int, default 10
Maximum number of modes to consider when computing the HDI using the multimodal method.
- skipna : bool, default False
If True, ignore NaN values when computing the HDI.
- **kwargs : any, optional
Forwarded to the array or dataarray interface for HDI.
- Returns:
ndarray, xarray.DataArray, xarray.Dataset, xarray.DataTree
Requested HDI of the provided input. It will have a ci_bound dimension with coordinate values “lower” and “upper” indicating the two extremes of the credible interval. In addition, when using a multimodal method, a mode dimension is also added.
See also
arviz_stats.eti
Calculate the equal tail interval (ETI).
arviz_stats.summary
Calculate summary statistics and diagnostics.
Examples
Calculate the HDI of a Normal random variable:
In [1]: import arviz_stats as azs
   ...: import numpy as np
   ...: data = np.random.default_rng().normal(size=2000)
   ...: azs.hdi(data, 0.68)
   ...:
Out[1]: array([-0.93126529, 1.08387296])
Calculate the HDI for specific variables:
In [2]: import arviz_base as azb
   ...: dt = azb.load_arviz_data("centered_eight")
   ...: azs.hdi(dt, var_names=["mu", "theta"])
   ...:
Out[2]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:   (ci_bound: 2, school: 8)
    Coordinates:
      * ci_bound  (ci_bound) <U5 40B 'lower' 'upper'
      * school    (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
    Data variables:
        mu        (ci_bound) float64 16B -1.623 10.69
        theta     (school, ci_bound) float64 128B -4.564 17.13 ... -5.858 16.01
Calculate the HDI also over the school dimension (for variables where present):
In [3]: azs.hdi(dt, dim=["chain", "draw", "school"])
Out[3]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:   (ci_bound: 2)
    Coordinates:
      * ci_bound  (ci_bound) <U5 40B 'lower' 'upper'
    Data variables:
        mu        (ci_bound) float64 16B -1.623 10.69
        theta     (ci_bound) float64 16B -5.719 14.86
        tau       (ci_bound) float64 16B 0.8965 9.668
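Illustrate the definition of the HDI as the shortest interval containing the specified probability mass:

The block below is a minimal NumPy-only sketch of that definition for a one-dimensional, unimodal sample. It is not the arviz-stats implementation, and `shortest_interval` is a hypothetical helper introduced only for illustration. It slides a window over the sorted samples and keeps the narrowest one holding the requested fraction, then compares it with an equal tail interval built from quantiles.

```python
import numpy as np


def shortest_interval(samples, prob):
    """Hypothetical helper (not part of arviz-stats): shortest interval
    containing about `prob` of a 1D, unimodal sample."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # Number of samples every candidate interval must contain.
    n_in = int(np.ceil(prob * n))
    if n_in < 2 or n_in > n:
        raise ValueError("prob is too small or too large for this sample size")
    # Width of every window spanning n_in consecutive sorted samples.
    widths = x[n_in - 1:] - x[: n - n_in + 1]
    start = int(np.argmin(widths))
    return np.array([x[start], x[start + n_in - 1]])


rng = np.random.default_rng(0)
skewed = rng.exponential(size=5000)

# Shortest interval holding 90% of the samples (the HDI idea).
hdi_like = shortest_interval(skewed, 0.9)
# Equal tail interval: cut 5% of the mass from each tail.
eti_like = np.quantile(skewed, [0.05, 0.95])

print("shortest interval:", hdi_like, "width:", hdi_like[1] - hdi_like[0])
print("equal tail interval:", eti_like, "width:", eti_like[1] - eti_like[0])
```

For a skewed sample such as this one, the shortest interval is no wider than the equal tail interval and sits closer to the mode. For a symmetric, unimodal sample the two are expected to nearly coincide.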