arviz_stats.psense_summary
- arviz_stats.psense_summary(data, var_names=None, filter_vars=None, coords=None, sample_dims=None, threshold=0.05, alphas=(0.99, 1.01), prior_var_names=None, likelihood_var_names=None, prior_coords=None, likelihood_coords=None, round_to=3)
Compute the prior/likelihood sensitivity based on power-scaling perturbations.
- Parameters:
- data: xarray.DataTree or InferenceData
Input data. It should contain the posterior and the log_likelihood and/or log_prior groups.
- var_names: list of str, optional
Names of posterior variables to include in the power-scaling sensitivity diagnostic.
- filter_vars: {None, “like”, “regex”}, default None
Used for var_names only. If None (default), interpret var_names as the real variable names. If “like”, interpret var_names as substrings of the real variable names. If “regex”, interpret var_names as regular expressions on the real variable names.
- coords: dict, optional
Coordinates defining a subset over the posterior. Only these variables will be used when computing the prior sensitivity.
- sample_dims: str or sequence of hashable, optional
Dimensions to reduce unless mapped to an aesthetic. Defaults to rcParams["data.sample_dims"].
- threshold: float, optional
Threshold value to determine the sensitivity diagnosis. Default is 0.05.
- alphas: tuple
Lower and upper alpha values for gradient calculation. Defaults to (0.99, 1.01).
- prior_var_names: str, optional
Name of the log-prior variable(s) to include in the power-scaling sensitivity diagnostic.
- likelihood_var_names: str, optional
Name of the log-likelihood variable(s) to include in the power-scaling sensitivity diagnostic.
- prior_coords: dict, optional
Coordinates defining a subset over the group element for which to compute the log-prior sensitivity diagnostic.
- likelihood_coords: dict, optional
Coordinates defining a subset over the group element for which to compute the log-likelihood sensitivity diagnostic.
- round_to: int, optional
Number of decimal places to round the sensitivity values. Default is 3.
- Returns:
- psense_df: DataFrame
DataFrame containing the prior and likelihood sensitivity values for each variable in the data, plus a diagnosis column with the following values:
  - “prior-data conflict” if both the prior and the likelihood sensitivity are above the threshold
  - “strong prior / weak likelihood” if the prior sensitivity is above the threshold and the likelihood sensitivity is below the threshold
  - “-” otherwise
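The diagnosis rules listed for the return value can be sketched as a small function. This is an illustration only, not the library's code: the helper name `diagnose` is hypothetical, the strict inequalities at exactly the threshold are an assumption, and the label strings follow the list above, so the strings the library actually prints may differ.

```python
def diagnose(prior_sens, likelihood_sens, threshold=0.05):
    """Map a pair of sensitivity values to a diagnosis label.

    Hypothetical helper mirroring the rules in the Returns description.
    Assumption: "above" means strictly greater than the threshold.
    """
    if prior_sens > threshold and likelihood_sens > threshold:
        return "prior-data conflict"
    if prior_sens > threshold:
        # Prior sensitivity is high while likelihood sensitivity is not.
        return "strong prior / weak likelihood"
    return "-"


# Sensitivity pairs checked against the default threshold of 0.05.
print(diagnose(0.072, 0.112))
print(diagnose(0.072, 0.010))
print(diagnose(0.041, 0.108))
```

A high likelihood sensitivity on its own does not trigger a flag under these rules; only the prior sensitivity decides whether a row is flagged.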
Notes
The diagnostic is computed by power-scaling either the prior or likelihood and determining the degree to which the posterior changes as described in [1]. It uses Pareto-smoothed importance sampling to avoid refitting the model.
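The reweighting idea behind power-scaling can be illustrated with a deliberately simplified sketch. Raising the prior to a power alpha multiplies the posterior density by prior**(alpha - 1), so existing draws can be reweighted instead of refitting. The code below is not the arviz_stats implementation: it skips Pareto smoothing and uses plain self-normalized importance weights, it looks at the shift of the posterior mean whereas [1] measures the change with a distance between distributions, and the toy draws, the toy prior, and all function names are assumptions made for illustration. It uses only the standard library so it runs without dependencies.

```python
import math
import random


def power_scale_weights(log_component, alpha):
    """Self-normalized importance weights for scaling a component by alpha.

    The log importance ratio of each draw is (alpha - 1) * log p(draw).
    Simplification: no Pareto smoothing is applied to the weights.
    """
    log_w = [(alpha - 1.0) * lp for lp in log_component]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def weighted_mean(draws, weights):
    """Mean of the draws under normalized importance weights."""
    return sum(d * w for d, w in zip(draws, weights))


# Toy setup (assumption): the "posterior draws" come from N(1, 1) and the
# prior is a standard normal, whose log density is -0.5 * theta**2 + const.
random.seed(0)
draws = [random.gauss(1.0, 1.0) for _ in range(2000)]
log_prior = [-0.5 * d * d for d in draws]

lower_alpha, upper_alpha = 0.99, 1.01  # same defaults as `alphas`
w_lower = power_scale_weights(log_prior, lower_alpha)
w_upper = power_scale_weights(log_prior, upper_alpha)

mean_lower = weighted_mean(draws, w_lower)
mean_upper = weighted_mean(draws, w_upper)

# Strengthening the prior (alpha > 1) pulls the mean toward the prior mean 0.
print(mean_lower, mean_upper)
```

In this toy setup the prior is centred at 0 and the draws at 1, so the reweighted mean for the upper alpha is smaller than for the lower alpha; how strongly a summary moves under such small perturbations is what the sensitivity values quantify.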
References
[1] Kallioinen et al., Detecting and diagnosing prior and likelihood sensitivity with power-scaling, Stat Comput 34, 57 (2024), https://doi.org/10.1007/s11222-023-10366-5
Examples
In [1]: from arviz_base import load_arviz_data
   ...: from arviz_stats import psense_summary
   ...: rugby = load_arviz_data("rugby")
   ...: psense_summary(rugby, var_names="atts")
   ...:
Out[1]:
                prior  likelihood                      diagnosis
atts[Wales]     0.041       0.108                              ✓
atts[France]    0.018       0.084                              ✓
atts[Ireland]   0.022       0.077                              ✓
atts[Scotland]  0.016       0.078                              ✓
atts[Italy]     0.072       0.112  potential prior-data conflict
atts[England]   0.059       0.094  potential prior-data conflict