arviz_stats.histogram

arviz_stats.histogram(data, dim=None, group='posterior', var_names=None, filter_vars=None, coords=None, bins=None, range=None, weights=None, density=True)

Compute the batched histogram.

See the EABM chapter on Visualization of Random Variables with ArviZ for more details.

Parameters:
data : array_like, xarray.DataArray, xarray.Dataset, xarray.DataTree, DataArrayGroupBy, DatasetGroupBy, or idata-like

Input data. It will have different pre-processing applied to it depending on its type:

  • array-like: call the array layer within arviz-stats.

  • xarray object: apply the dimension-aware function to all relevant subsets.

  • others: passed to arviz_base.convert_to_dataset, then treated as an xarray.Dataset. This option is discouraged because the conversion is fully automated and has to be repeated on every later execution or call to a similar function.

    It is recommended to first perform the conversion manually and then call arviz_stats.histogram. This allows controlling the conversion step and inspecting its results.

dim : sequence of hashable, optional

Dimensions to be reduced when computing the histogram. Defaults to rcParams["data.sample_dims"].

group : hashable, default “posterior”

Group on which to compute the histogram.

var_names : str or list of str, optional

Names of the variables for which the histogram should be computed.

filter_vars : {None, “like”, “regex”}, default None
coords : dict, optional

Dictionary of dimension/index names to coordinate values defining a subset of the data for which to perform the computation.

bins : array_like, optional
range : array_like, optional
weights : array_like, optional
density : bool, default True
**kwargs : any, optional

Forwarded to the array or dataarray interface for histogram.
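The bins, range, weights and density parameters are listed above without descriptions. Their names match the keywords of numpy.histogram; the sketch below assumes, without confirmation from this page, that they carry the same meaning. It uses NumPy only and is not a description of the arviz-stats internals.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=2000)
weights = np.ones_like(data)  # uniform weights, chosen only for illustration

# Assumption: same keyword semantics as numpy.histogram
hist, edges = np.histogram(
    data, bins=20, range=(-3.0, 3.0), weights=weights, density=True
)

# With density=True the bar heights integrate to 1 over the binned range
area = float(np.sum(hist * np.diff(edges)))
print(len(hist), edges[0], edges[-1])
print(round(area, 6))
```

Under this reading, range fixes the outermost bin edges and density=True rescales the heights so that the total area is 1, which is consistent with density being the default here.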

Returns:
ndarray, xarray.DataArray, xarray.Dataset, xarray.DataTree

Requested histogram of the provided input. For xarray outputs, each variable has a hist_dim_{var_name} dimension and a plot_axis dimension with coordinates “histogram”, “left_edges” and “right_edges”.
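To make the plot_axis layout concrete, the following NumPy-only sketch builds an array with the same three rows from a plain (heights, edges) pair. The variable names are illustrative, and the stacking order is an assumption taken from the coordinate order listed above.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=2000)

# n bins produce n heights and n + 1 edges
hist, edges = np.histogram(data, bins=12, density=True)

# Split the edges so that every bin has its own left and right edge
left_edges = edges[:-1]
right_edges = edges[1:]

# Rows follow the plot_axis coordinates: histogram, left_edges, right_edges
plot_axis = ["histogram", "left_edges", "right_edges"]
stacked = np.stack([hist, left_edges, right_edges])

# Bin centers and widths follow directly from the two edge rows
centers = (stacked[1] + stacked[2]) / 2
widths = stacked[2] - stacked[1]
print(stacked.shape)
```

Storing left and right edges separately gives all three rows the same length, so they fit along a single hist_dim dimension.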

See also

arviz_stats.ecdf, arviz_stats.kde, arviz_stats.qds

Alternative visual summaries for marginal distributions

arviz_plots.plot_dist

Examples

Calculate the histogram of a Normal random variable:

In [1]: import arviz_stats as azs
   ...: import numpy as np
   ...: data = np.random.default_rng().normal(size=2000)
   ...: azs.histogram(data)
   ...: 
Out[1]: 
(array([0.01210468, 0.03550706, 0.11136304, 0.2671099 , 0.36233336,
        0.3873497 , 0.2558122 , 0.13637937, 0.03308612, 0.0112977 ,
        0.00080698, 0.00080698]),
 array([-3.10717258, -2.48757742, -1.86798225, -1.24838709, -0.62879192,
        -0.00919676,  0.61039841,  1.22999357,  1.84958873,  2.4691839 ,
         3.08877906,  3.70837423,  4.32796939]))

Calculate the histogram for specific variables:

In [2]: import arviz_base as azb
   ...: dt = azb.load_arviz_data("centered_eight")
   ...: azs.histogram(dt, var_names=["mu", "theta"])
   ...: 
Out[2]: 
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:    (plot_axis: 3, hist_dim_mu: 12, school: 8, hist_dim_theta: 12)
    Coordinates:
      * plot_axis  (plot_axis) <U11 132B 'histogram' 'left_edges' 'right_edges'
      * school     (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
    Dimensions without coordinates: hist_dim_mu, hist_dim_theta
    Data variables:
        mu         (plot_axis, hist_dim_mu) float64 288B 0.001653 0.005903 ... 17.9
        theta      (plot_axis, school, hist_dim_theta) float64 2kB 0.0005864 ... ...

Calculate the histogram reducing over the school dimension as well (for variables that have it):

In [3]: azs.histogram(dt, dim=["chain", "draw", "school"])
Out[3]: 
<xarray.DataTree 'posterior'>
Group: /posterior
    Dimensions:    (plot_axis: 3, hist_dim_mu: 12, hist_dim_theta: 15,
                    hist_dim_tau: 12)
    Coordinates:
      * plot_axis  (plot_axis) <U11 132B 'histogram' 'left_edges' 'right_edges'
    Dimensions without coordinates: hist_dim_mu, hist_dim_theta, hist_dim_tau
    Data variables:
        mu         (plot_axis, hist_dim_mu) float64 288B 0.001653 0.005903 ... 17.9
        theta      (plot_axis, hist_dim_theta) float64 360B 7.385e-05 ... 46.46
        tau        (plot_axis, hist_dim_tau) float64 288B 0.2336 0.1504 ... 20.49
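The effect of dim on the output shape can be sketched with NumPy alone. The array below is hypothetical data shaped (chain, draw, school), and the fixed bin count and per-school bin edges are assumptions made for illustration, not a statement about how arviz-stats chooses bins.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 500, 8))  # hypothetical (chain, draw, school)
n_bins = 12  # assumption: fixed bin count for illustration

# Analogue of dim=["chain", "draw"]: one histogram per school
per_school = samples.reshape(-1, samples.shape[-1]).T  # shape (8, 2000)
batched_density = np.stack(
    [np.histogram(row, bins=n_bins, density=True)[0] for row in per_school]
)

# Analogue of dim=["chain", "draw", "school"]: a single pooled histogram
pooled_density, pooled_edges = np.histogram(
    samples.ravel(), bins=n_bins, density=True
)

print(batched_density.shape)  # (8, 12)
print(pooled_density.shape)   # (12,)
```

Reducing over fewer dimensions keeps the remaining ones as batch dimensions, which is why theta keeps its school dimension in the second example and loses it in the third.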