explainer#

Explainer Class: Surrogate Model Interpretation Methods

Classes

Explainer(optimizer[, X_space])

Base class for surrogate model interpretation and post-hoc analysis.

class obsidian.campaign.explainer.Explainer(optimizer: Optimizer, X_space: ParamSpace | None = None)[source]#

Bases: object

Base class for surrogate model interpretation and post-hoc analysis.

Properties:

optimizer (Optimizer): obsidian Optimizer object with fitted surrogate model(s).

X_space#

obsidian ParamSpace object representing the allowable space for model explanation, which may differ from the optimizer.X_space used for optimization.

Type:

ParamSpace

responseid#

The index of the single outcome whose surrogate model will be explained by SHAP

Type:

int

shap#

A dictionary containing Shapley analysis results

Type:

dict

Raises:

ValueError – If the optimizer is not fit

property optimizer: Optimizer#

Explainer Optimizer object

sensitivity(dx: float = 1e-06, X_ref: DataFrame | Series | None = None) DataFrame[source]#

Calculates the local sensitivity of the surrogate model predictions with respect to each parameter in the X_space.

Parameters:

  • dx (float, optional) – The perturbation size for calculating the sensitivity. Defaults to 1e-6.

  • X_ref (pd.DataFrame | pd.Series | None, optional) – The reference input values for calculating the sensitivity. If None, the mean of X_space will be used as the reference. Defaults to None.

Returns:

A DataFrame containing the sensitivity values for each parameter in X_space.

Return type:

pd.DataFrame

Raises:

ValueError – If X_ref does not contain all parameters in optimizer.X_space or if X_ref is not a single row DataFrame.
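Conceptually, the local sensitivity is a forward finite difference of the surrogate's prediction at the reference point. The sketch below is a hypothetical standalone illustration (the helper `local_sensitivity` and the toy `model` are not part of obsidian), showing what a perturbation of size `dx` around `X_ref` computes:

```python
import pandas as pd

def local_sensitivity(f, X_ref: pd.Series, dx: float = 1e-6) -> pd.DataFrame:
    """Approximate df/dx_i at X_ref by forward finite differences.

    f takes a single-row DataFrame and returns a scalar prediction.
    """
    base = f(X_ref.to_frame().T)
    sens = {}
    for col in X_ref.index:
        X_pert = X_ref.copy()
        X_pert[col] += dx          # perturb one parameter at a time
        sens[col] = (f(X_pert.to_frame().T) - base) / dx
    return pd.DataFrame(sens, index=["sensitivity"])

# Toy surrogate: y = 2*x1 + 3*x2**2
def model(X: pd.DataFrame) -> float:
    return float(2 * X["x1"].iloc[0] + 3 * X["x2"].iloc[0] ** 2)

ref = pd.Series({"x1": 1.0, "x2": 2.0})
print(local_sensitivity(model, ref))  # ~2 for x1, ~12 for x2
```

In obsidian the model `f` is the fitted surrogate's predictor and `X_ref` defaults to the mean of X_space, as documented above.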

set_optimizer(optimizer: Optimizer) None[source]#

Sets the explainer optimizer

shap_explain(responseid: int = 0, n: int = 100, X_ref: DataFrame | None = None, seed: int | None = None) None[source]#

Explain the parameter sensitivities using SHAP values.

Parameters:
  • responseid (int) – Index of the target response variable.

  • n (int) – Number of samples to generate for SHAP values.

  • X_ref (pd.DataFrame | None) – Reference DataFrame for SHAP values. If None, the mean of self.X_space will be used.

  • seed (int | None) – Seed value for random number generation.

Returns:

Fits a Kernel SHAP explainer and saves the results as class attributes.

Return type:

None

Raises:

ValueError – If X_ref does not contain all parameters in self.X_space or if X_ref is not a single row DataFrame.
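For intuition about the role of X_ref: Kernel SHAP attributes the difference between the model's prediction at a point and its prediction at the reference. For a purely linear model the exact Shapley values reduce to w_i * (x_i - x_ref_i), which the hypothetical sketch below computes directly (this is a conceptual illustration, not obsidian's implementation):

```python
import numpy as np

# Toy linear surrogate f(x) = w @ x. For additive models the exact
# Shapley value of feature i relative to a reference point is
# w_i * (x_i - x_ref_i); Kernel SHAP approximates this by sampling.
w = np.array([2.0, -1.0, 0.5])
x_new = np.array([1.0, 3.0, 4.0])
x_ref = np.array([0.0, 1.0, 2.0])

phi = w * (x_new - x_ref)    # per-feature attributions
f = lambda x: float(w @ x)

# Local accuracy: attributions sum to f(x_new) - f(x_ref)
print(phi, f(x_new) - f(x_ref))
```

The `n` parameter above controls how many samples the Kernel SHAP estimator draws when the model is not this trivially additive.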

shap_pdp_ice(ind: int | tuple[int] = 0, ice_color_var: int | None = None, hist: bool = False, ace_opacity: float = 0.5, npoints: int | None = None) Figure[source]#

SHAP Partial Dependence Plot with ICE

Parameters:
  • ind (int | tuple[int]) – Index (or pair of indices) of the parameter(s) to plot

  • ice_color_var (int, optional) – Index of the parameter used to color the ICE lines. By default None

  • hist (bool, optional) – Show histogram of the feature values. By default False

  • ace_opacity (float, optional) – Opacity of the ACE line. By default 0.5

  • npoints (int, optional) – Number of points for PDP x-axis. By default will use 100 for 1D PDP and 20 for 2D PDP.

Returns:

Matplotlib Figure of 1D or 2D PDP with ICE lines
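A 1D PDP with ICE lines is simply the model evaluated on a grid for one feature, holding each background row's other features fixed; the PDP is the average of the ICE curves. A minimal NumPy sketch (hypothetical toy model and data, not the obsidian/shap plotting code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))          # background sample
f = lambda X: X[:, 0] ** 2 + 0.5 * X[:, 1]    # toy surrogate

grid = np.linspace(-1, 1, 100)                # PDP x-axis (npoints=100)
# ICE: one curve per background row, varying only feature 0
ice = np.empty((X.shape[0], grid.size))
for j, g in enumerate(grid):
    Xg = X.copy()
    Xg[:, 0] = g                              # sweep feature 0 across the grid
    ice[:, j] = f(Xg)
pdp = ice.mean(axis=0)                        # PDP = average of ICE curves
```

The `npoints` default of 100 (1D) or 20 per axis (2D) documented above corresponds to the grid resolution here.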

shap_single_point(X_new: DataFrame | Series, X_ref=None) tuple[DataFrame, Figure, Figure][source]#

SHAP Pair-wise Marginal Explanations

Parameters:
  • X_new (pd.DataFrame | pd.Series) – New data point to explain

  • X_ref (pd.DataFrame | pd.Series, optional) – Reference data point for SHAP values. Default uses optimizer.X_best_f

Returns:

A tuple containing:
  • pd.DataFrame – SHAP values for the new data point
  • Figure – Matplotlib Figure for the SHAP values
  • Figure – Matplotlib Figure for the SHAP summary plot

Return type:

tuple[pd.DataFrame, Figure, Figure]

shap_summary() Figure[source]#

SHAP Summary Plot (Beeswarm)

shap_summary_bar() Figure[source]#

SHAP Summary Plot (Bar Plot / Importance)