arviz_stats.ecdf
- arviz_stats.ecdf(data, dim=None, group='posterior', var_names=None, filter_vars=None, coords=None, pit=False, **kwargs)
Compute the marginal empirical cumulative distribution functions (ECDF).
See the EABM chapter on Visualization of Random Variables with ArviZ for more details.
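As a rough illustration of what an ECDF is, and not of how arviz-stats computes it, the sketch below builds one with plain NumPy: the sorted samples go on the horizontal axis and the cumulative proportions on the vertical axis. The helper name `ecdf_sketch` is hypothetical and is not part of arviz-stats.

```python
import numpy as np


def ecdf_sketch(samples):
    """Hypothetical helper: rank-based ECDF of a 1D sample.

    Returns the sorted values and, for the i-th sorted value (0-based),
    the cumulative proportion (i + 1) / n.
    """
    x = np.sort(np.asarray(samples, dtype=float).ravel())
    y = np.arange(1, x.size + 1) / x.size
    return x, y


x, y = ecdf_sketch([3.0, 1.0, 2.0, 4.0])
print(x)  # sorted sample values
print(y)  # step heights, ending at 1.0
```

Plotting `y` against `x` as a step function gives the familiar staircase that rises from `1/n` to `1`.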
- Parameters:
- data : array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like
  Input data. It will have different pre-processing applied to it depending on its type:
  - array-like: call the array layer within arviz-stats.
  - xarray object: apply the dimension-aware function to all relevant subsets.
  - others: passed to arviz_base.convert_to_dataset, then treated as xarray.Dataset. This option is discouraged because the conversion is fully automated and has to be repeated in future executions or in calls to similar functions. It is recommended to first perform the conversion manually and then call arviz_stats.ecdf. This allows controlling the conversion step and inspecting its results.
- dim : sequence of hashable, optional
  Dimensions to be reduced when computing the ECDF. Default rcParams["data.sample_dims"].
- group : hashable, default "posterior"
  Group on which to compute the ECDF.
- var_names : str or list of str, optional
  Names of the variables for which the ECDF should be computed.
- filter_vars : {None, "like", "regex"}, default None
- coords : dict, optional
  Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.
- pit : bool, default False
- **kwargs : any, optional
  Forwarded to the array or dataarray interface for ECDF.
- Returns:
ndarray, xarray.DataArray, xarray.Dataset, xarray.DataTree
  Requested ECDF of the provided input. It will have a quantile dimension and a plot_axis dimension with coordinate values “x” and “y”.
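To make the returned layout more concrete, the NumPy-only sketch below mimics it under stated assumptions. It evaluates an ECDF on an evenly spaced grid of 200 points (chosen to match the `quantile: 200` size shown in the Examples; the grid itself is an assumption, not the confirmed arviz-stats algorithm) and stacks the result so that row 0 plays the role of `plot_axis="x"` and row 1 that of `plot_axis="y"`. The names `ecdf_on_grid` and `npoints` are hypothetical.

```python
import numpy as np


def ecdf_on_grid(samples, npoints=200):
    """Hypothetical helper: ECDF evaluated on an evenly spaced grid.

    Returns an array of shape (2, npoints). Row 0 holds the grid values
    ("x"); row 1 holds the fraction of samples <= each grid value ("y").
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float).ravel())
    x = np.linspace(sorted_samples[0], sorted_samples[-1], npoints)
    # side="right" counts how many samples are <= each grid value
    counts = np.searchsorted(sorted_samples, x, side="right")
    y = counts / sorted_samples.size
    return np.stack([x, y])


rng = np.random.default_rng(0)
result = ecdf_on_grid(rng.normal(size=2000))
x, y = result  # unpack the two "plot_axis" rows
print(result.shape)
```

With an actual xarray return value, the analogous step would presumably be a label-based selection on the `plot_axis` coordinate, for example `.sel(plot_axis="x")`, rather than positional unpacking.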
See also
arviz_stats.histogram, arviz_stats.kde, arviz_stats.qds
Alternative visual summaries for marginal distributions
arviz_plots.plot_dist
Examples
Calculate the ECDF of a Normal random variable:
In [1]: import arviz_stats as azs
   ...: import numpy as np
   ...: data = np.random.default_rng().normal(size=2000)
   ...: # not available yet in array interface azs.ecdf(data)
   ...:
Calculate the ECDF for specific variables:
In [2]: import arviz_base as azb
   ...: dt = azb.load_arviz_data("centered_eight")
   ...: azs.ecdf(dt.posterior.dataset, var_names=["mu", "theta"])
   ...:
Out[2]:
<xarray.Dataset> Size: 6kB
Dimensions:    (plot_axis: 2, quantile: 200)
Coordinates:
  * plot_axis  (plot_axis) <U1 8B 'x' 'y'
Dimensions without coordinates: quantile
Data variables:
    mu         (plot_axis, quantile) float64 3kB -7.509 -7.382 ... 0.9995 1.0
    theta      (plot_axis, quantile) float64 3kB -29.71 -29.33 ... 0.9999 1.0
Calculate the ECDF also over the school dimension (for variables where present):
In [3]: azs.ecdf(dt.posterior.dataset, dim=["chain", "draw", "school"])
Out[3]:
<xarray.Dataset> Size: 10kB
Dimensions:    (plot_axis: 2, quantile: 200)
Coordinates:
  * plot_axis  (plot_axis) <U1 8B 'x' 'y'
Dimensions without coordinates: quantile
Data variables:
    mu         (plot_axis, quantile) float64 3kB -7.509 -7.382 ... 0.9995 1.0
    theta      (plot_axis, quantile) float64 3kB -29.71 -29.33 ... 0.9999 1.0
    tau        (plot_axis, quantile) float64 3kB 0.8965 0.9949 ... 0.9995 1.0