arviz_stats.thin
- arviz_stats.thin(data, sample_dims='draw', group='posterior', var_names=None, filter_vars=None, coords=None, factor='auto', chain_axis=0, draw_axis=1)
Perform thinning.
Thinning refers to retaining only every nth sample from a Markov Chain Monte Carlo (MCMC) simulation. This is usually done to reduce autocorrelation in the stored samples or simply to reduce the size of the stored samples.
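As a rough illustration of the autocorrelation claim above, the sketch below simulates a strongly autocorrelated AR(1) chain with plain NumPy and keeps every 10th draw. It does not call arviz-stats; the helper `lag1_autocorr`, the AR(1) coefficient, and the thinning factor are assumptions chosen only for this example.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a 1D series (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


rng = np.random.default_rng(0)
n_draws, rho = 20_000, 0.9  # assumed chain length and AR(1) coefficient

# Simulate a stationary AR(1) process as a stand-in for one MCMC chain.
chain = np.empty(n_draws)
chain[0] = rng.normal()
for t in range(1, n_draws):
    chain[t] = rho * chain[t - 1] + np.sqrt(1 - rho**2) * rng.normal()

factor = 10  # keep every 10th draw
thinned = chain[::factor]

print("lag-1 autocorrelation, full chain:   ", lag1_autocorr(chain))
print("lag-1 autocorrelation, thinned chain:", lag1_autocorr(thinned))
```

The thinned chain has a tenth of the draws and a markedly lower lag-1 autocorrelation. Thinning discards information, though, so it is mainly useful when storage or downstream cost matters.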
- Parameters:
- data : array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like
  Input data. It will have different pre-processing applied to it depending on its type:
  - array-like: call array layer within arviz-stats.
  - xarray object: apply dimension aware function to all relevant subsets
  - others: passed to arviz_base.convert_to_dataset
- sample_dims : iterable of hashable, optional
  Dimensions to be considered sample dimensions and to be reduced. Default rcParams["data.sample_dims"].
- group : hashable, default “posterior”
  Group on which to perform the thinning.
- var_names : str or list of str, optional
  Names of the variables to thin.
- filter_vars : {None, “like”, “regex”}, default None
- coords : dict, optional
  Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.
- factor : str or int, default “auto”
  The thinning factor. If “auto”, the thinning factor is computed based on bulk and tail effective sample size as suggested by Säilynoja et al. (2022). If an integer, the thinning factor is set to that value.
- chain_axis, draw_axis : int, optional
  Integer indicators of the axes that correspond to the chain and draw dimensions. chain_axis can be None.
- Returns:
  ndarray, xarray.DataArray, xarray.Dataset, or xarray.DataTree
  Thinned samples
Examples
Thin the posterior samples using the default arguments:
In [1]: from arviz_base import load_arviz_data
   ...: import arviz_stats as azs
   ...: data = load_arviz_data('non_centered_eight')
   ...: azs.thin(data)
   ...:
Out[1]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 250, school: 8)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 2kB 0 2 4 6 8 10 12 14 ... 486 488 490 492 494 496 498
      * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
    Data variables:
        mu       (chain, draw) float64 8kB 6.339 5.168 2.881 ... 7.11 0.4987 1.325
        theta_t  (chain, draw, school) float64 64kB -0.9553 -1.162 ... -0.4102
        tau      (chain, draw) float64 8kB 2.574 1.832 2.353 ... 0.9896 8.401 2.333
        theta    (chain, draw, school) float64 64kB 3.88 3.348 6.233 ... 0.138 0.368
Thin a subset of the variables with a thinning factor of 10:
In [2]: azs.thin(data, factor=10, var_names=["mu"])
Out[2]:
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:  (chain: 4, draw: 50)
    Coordinates:
      * chain    (chain) int64 32B 0 1 2 3
      * draw     (draw) int64 400B 0 10 20 30 40 50 60 ... 440 450 460 470 480 490
    Data variables:
        mu       (chain, draw) float64 2kB 6.339 4.818 0.002546 ... -2.524 2.197
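With an integer factor, the draw coordinates in the outputs above (0, 10, 20, ..., 490) are consistent with strided slicing along the draw axis. The sketch below reproduces that behaviour on a plain NumPy array laid out as (chain, draw, ...). The function `thin_array` is a hypothetical helper written for this illustration, not part of arviz-stats, and it may not match how the library implements thinning internally.

```python
import numpy as np


def thin_array(samples, factor, draw_axis=1):
    """Keep every `factor`-th draw along `draw_axis` (hypothetical helper)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Build an index that slices only the draw axis and leaves the rest intact.
    index = [slice(None)] * samples.ndim
    index[draw_axis] = slice(None, None, factor)
    return samples[tuple(index)]


rng = np.random.default_rng(0)
# Fake posterior samples: 4 chains, 500 draws, 8 schools (assumed shapes).
samples = rng.normal(size=(4, 500, 8))

thinned = thin_array(samples, factor=10)
kept_draws = np.arange(samples.shape[1])[::10]  # original draw indices kept

print(thinned.shape)
print(kept_draws[:5], kept_draws[-1])
```

Slicing returns a view of the original array rather than a copy, so call `.copy()` on the result if the full array should be freed afterwards.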