arviz_stats.psense_summary

arviz_stats.psense_summary(data, var_names=None, filter_vars=None, coords=None, sample_dims=None, threshold=0.05, alphas=(0.99, 1.01), prior_var_names=None, likelihood_var_names=None, prior_coords=None, likelihood_coords=None, round_to=3)

Compute the prior/likelihood sensitivity based on power-scaling perturbations.

Parameters:
data : xarray.DataTree or InferenceData

Input data. It should contain the posterior and the log_likelihood and/or log_prior groups.

var_names : list of str, optional

Names of posterior variables to include in the power-scaling sensitivity diagnostic.

filter_vars : {None, “like”, “regex”}, default None

Used for var_names only. If None (default), interpret var_names as the actual variable names. If “like”, interpret var_names as substrings of the actual variable names. If “regex”, interpret var_names as regular expressions matched against the actual variable names.

coords : dict, optional

Coordinates defining a subset over the posterior. Only this subset will be used when computing the prior sensitivity.

sample_dims : str or sequence of hashable, optional

Dimensions to treat as sample dimensions and reduce over. Defaults to rcParams["data.sample_dims"].

threshold : float, optional

Threshold against which the prior and likelihood sensitivity values are compared to produce the diagnosis. Default is 0.05.

alphas : tuple

Lower and upper power-scaling (alpha) values used for the gradient calculation. Defaults to (0.99, 1.01).

prior_var_names : str, optional

Names of the log-prior variables to include in the power-scaling sensitivity diagnostic.

likelihood_var_names : str, optional

Names of the log-likelihood variables to include in the power-scaling sensitivity diagnostic.

prior_coords : dict, optional

Coordinates defining a subset over the group element for which to compute the log-prior sensitivity diagnostic.

likelihood_coords : dict, optional

Coordinates defining a subset over the group element for which to compute the log-likelihood sensitivity diagnostic.

round_to : int, optional

Number of decimal places to which the sensitivity values are rounded. Default is 3.
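The three filter_vars modes described above can be illustrated with a small sketch. The helper `select_names` and the variable names in the list are hypothetical and exist only for this illustration; this is not the library's implementation, and the library may handle unknown names differently (for example by raising an error).

```python
import re


def select_names(all_names, var_names, filter_vars=None):
    """Hypothetical helper mimicking the three filter_vars modes."""
    if var_names is None:
        return list(all_names)
    if isinstance(var_names, str):
        var_names = [var_names]
    if filter_vars is None:
        # Exact match against the actual variable names.
        return [name for name in all_names if name in var_names]
    if filter_vars == "like":
        # Keep names that contain any of the given substrings.
        return [name for name in all_names if any(sub in name for sub in var_names)]
    if filter_vars == "regex":
        # Keep names matched by any of the given regular expressions.
        return [name for name in all_names if any(re.search(pat, name) for pat in var_names)]
    raise ValueError(f"unsupported filter_vars: {filter_vars!r}")


# Illustrative variable names, assumed for this sketch.
names = ["atts", "defs", "home", "sd_att", "sd_def"]

print(select_names(names, ["att"]))             # [] (no variable is named exactly "att")
print(select_names(names, ["att"], "like"))     # ['atts', 'sd_att']
print(select_names(names, ["^sd_"], "regex"))   # ['sd_att', 'sd_def']
```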

Returns:
psense_df : DataFrame

DataFrame containing the prior and likelihood sensitivity values for each variable in the data, plus a diagnosis column with the following values:

- “prior-data conflict” if both the prior and the likelihood sensitivity are above the threshold
- “strong prior / weak likelihood” if the prior sensitivity is above the threshold and the likelihood sensitivity is below the threshold
- “-” otherwise

In the output shown under Examples, the first case is printed as “potential prior-data conflict” and the last as “✓”.
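The diagnosis rule can be written out as a short sketch. The function `diagnose` is hypothetical; the labels are taken from the description above, and the strings printed by the library may differ. Treating "above" as a strict comparison is also an assumption.

```python
def diagnose(prior, likelihood, threshold=0.05):
    """Hypothetical restatement of the diagnosis rule described above."""
    # Assumption: "above threshold" means strictly greater than.
    if prior > threshold and likelihood > threshold:
        return "prior-data conflict"
    if prior > threshold:
        return "strong prior / weak likelihood"
    return "-"


# Sensitivity values taken from the Examples output on this page.
print(diagnose(0.072, 0.112))  # prior-data conflict
print(diagnose(0.041, 0.108))  # -
```

Under this rule a row is flagged only when the prior sensitivity exceeds the threshold; a high likelihood sensitivity on its own is not flagged.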

Notes

The diagnostic is computed by power-scaling either the prior or likelihood and determining the degree to which the posterior changes as described in [1]. It uses Pareto-smoothed importance sampling to avoid refitting the model.
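As a rough illustration of the importance-sampling idea: raising one component (prior or likelihood) to the power alpha corresponds to reweighting each posterior draw by that component's density raised to alpha - 1. The sketch below is a simplification under stated assumptions: it uses plain self-normalized importance weights without the Pareto smoothing step, the helper names are hypothetical, the draws are made-up values, and the library's actual sensitivity measure is not reproduced.

```python
import math


def power_scale_weights(log_component, alpha):
    """Self-normalized weights for scaling one component to the power alpha.

    log_component holds the log-prior (or log-likelihood) evaluated at each
    posterior draw. No Pareto smoothing is applied in this sketch.
    """
    log_w = [(alpha - 1.0) * lc for lc in log_component]
    max_lw = max(log_w)  # subtract the maximum for numerical stability
    unnorm = [math.exp(lw - max_lw) for lw in log_w]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def weighted_mean(draws, weights):
    """Posterior mean under the perturbed (power-scaled) model."""
    return sum(d * w for d, w in zip(draws, weights))


# Made-up posterior draws and a standard normal log-prior (up to a constant).
draws = [-1.0, 0.0, 0.5, 2.0]
log_prior = [-0.5 * d * d for d in draws]

base = weighted_mean(draws, power_scale_weights(log_prior, 1.0))
upper = weighted_mean(draws, power_scale_weights(log_prior, 1.01))
print(base, upper)
```

With alpha = 1 all weights are equal and the unperturbed estimate is recovered; with alpha slightly above 1, draws where the prior density is higher receive slightly more weight.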

References

[1] Kallioinen et al., Detecting and diagnosing prior and likelihood sensitivity with power-scaling, Stat Comput 34, 57 (2024), https://doi.org/10.1007/s11222-023-10366-5

Examples

In [1]: from arviz_base import load_arviz_data
   ...: from arviz_stats import psense_summary
   ...: rugby = load_arviz_data("rugby")
   ...: psense_summary(rugby, var_names="atts")
   ...: 
Out[1]: 
                prior  likelihood                      diagnosis
atts[Wales]     0.041       0.108                              ✓
atts[France]    0.018       0.084                              ✓
atts[Ireland]   0.022       0.077                              ✓
atts[Scotland]  0.016       0.078                              ✓
atts[Italy]     0.072       0.112  potential prior-data conflict
atts[England]   0.059       0.094  potential prior-data conflict