explainer#

Explainer Class: Surrogate Model Interpretation Methods

Classes

Explainer(optimizer[, X_space])

Base class for surrogate model interpretation and post-hoc analysis.

class obsidian.campaign.explainer.Explainer(optimizer: Optimizer, X_space: ParamSpace | None = None)[source]#

Bases: object

Base class for surrogate model interpretation and post-hoc analysis.

Properties:

optimizer (Optimizer): obsidian Optimizer object with fitted surrogate model(s).

X_space#

obsidian ParamSpace object representing the allowable space for model explanation, which may differ from the optimizer.X_space used for optimization.

Type:

ParamSpace

responseid#

The index of the single outcome whose surrogate model will be explained by SHAP

Type:

int

shap#

A dictionary containing Shapley analysis results

Type:

dict

Raises:

ValueError – If the optimizer is not fit

property optimizer: Optimizer#

Explainer Optimizer object

sensitivity(dx: float = 1e-06, X_ref: DataFrame | Series | None = None) DataFrame[source]#

Calculates the local sensitivity of the surrogate model predictions with respect to each parameter in the X_space.

Parameters:

  • dx (float, optional) – The perturbation size for calculating the sensitivity. Defaults to 1e-6.

  • X_ref (pd.DataFrame | pd.Series | None, optional) – The reference input values for calculating the sensitivity. If None, the mean of X_space will be used as the reference. Defaults to None.

Returns:

A DataFrame containing the sensitivity values for each parameter in X_space.

Return type:

pd.DataFrame

Raises:

ValueError – If X_ref does not contain all parameters in optimizer.X_space or if X_ref is not a single row DataFrame.
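Conceptually, the local sensitivity is a forward finite difference of the surrogate's prediction at the reference point. The sketch below is a hypothetical standalone illustration (the helper `local_sensitivity` and the toy `model` are not part of obsidian), showing what a perturbation of size `dx` around `X_ref` computes:

```python
import pandas as pd

def local_sensitivity(f, X_ref: pd.Series, dx: float = 1e-6) -> pd.DataFrame:
    """Approximate df/dx_i at X_ref by forward finite differences.

    f takes a single-row DataFrame and returns a scalar prediction.
    """
    base = f(X_ref.to_frame().T)
    sens = {}
    for col in X_ref.index:
        X_pert = X_ref.copy()
        X_pert[col] += dx          # perturb one parameter at a time
        sens[col] = (f(X_pert.to_frame().T) - base) / dx
    return pd.DataFrame(sens, index=["sensitivity"])

# Toy surrogate: y = 2*x1 + 3*x2**2
def model(X: pd.DataFrame) -> float:
    return float(2 * X["x1"].iloc[0] + 3 * X["x2"].iloc[0] ** 2)

ref = pd.Series({"x1": 1.0, "x2": 2.0})
print(local_sensitivity(model, ref))  # ~2 for x1, ~12 for x2
```

In obsidian the model `f` is the fitted surrogate's predictor and `X_ref` defaults to the mean of X_space, as documented above.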

set_optimizer(optimizer: Optimizer) None[source]#

Sets the explainer optimizer

shap_explain(responseid: int = 0, n: int = 100, X_ref: DataFrame | None = None, seed: int | None = None) None[source]#

Explain the parameter sensitivities using SHAP values.

Parameters:
  • responseid (int) – Index of the target response variable.

  • n (int) – Number of samples to generate for SHAP values.

  • X_ref (pd.DataFrame | None) – Reference DataFrame for SHAP values. If None, the mean of self.X_space will be used.

  • seed (int | None) – Seed value for random number generation.

Returns:

Fits a Kernel SHAP explainer and saves the results as class attributes.

Return type:

None

Raises:

ValueError – If X_ref does not contain all parameters in self.X_space or if X_ref is not a single row DataFrame.
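For intuition about the role of X_ref: Kernel SHAP attributes the difference between the model's prediction at a point and its prediction at the reference. For a purely linear model the exact Shapley values reduce to w_i * (x_i - x_ref_i), which the hypothetical sketch below computes directly (this is a conceptual illustration, not obsidian's implementation):

```python
import numpy as np

# Toy linear surrogate f(x) = w @ x. For additive models the exact
# Shapley value of feature i relative to a reference point is
# w_i * (x_i - x_ref_i); Kernel SHAP approximates this by sampling.
w = np.array([2.0, -1.0, 0.5])
x_new = np.array([1.0, 3.0, 4.0])
x_ref = np.array([0.0, 1.0, 2.0])

phi = w * (x_new - x_ref)    # per-feature attributions
f = lambda x: float(w @ x)

# Local accuracy: attributions sum to f(x_new) - f(x_ref)
print(phi, f(x_new) - f(x_ref))
```

The `n` parameter above controls how many samples the Kernel SHAP estimator draws when the model is not this trivially additive.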

shap_pdp_ice(ind: int | tuple[int] = 0, ice_color_var: int | None = None, hist: bool = False, ace_opacity: float = 0.5, npoints: int | None = None) Figure[source]#

SHAP Partial Dependence Plot with ICE

Parameters:
  • ind (int | tuple[int]) – Index (or pair of indices) of the parameter(s) to plot

  • ice_color_var (int, optional) – Index of the parameter used to color the ICE lines. By default None

  • hist (bool, optional) – Show histogram of the feature values. By default False

  • ace_opacity (float, optional) – Opacity of the ACE line. By default 0.5

  • npoints (int, optional) – Number of points for PDP x-axis. By default will use 100 for 1D PDP and 20 for 2D PDP.

Returns:

Matplotlib Figure of 1D or 2D PDP with ICE lines
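A 1D PDP with ICE lines is simply the model evaluated on a grid for one feature, holding each background row's other features fixed; the PDP is the average of the ICE curves. A minimal NumPy sketch (hypothetical toy model and data, not the obsidian/shap plotting code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 2))          # background sample
f = lambda X: X[:, 0] ** 2 + 0.5 * X[:, 1]    # toy surrogate

grid = np.linspace(-1, 1, 100)                # PDP x-axis (npoints=100)
# ICE: one curve per background row, varying only feature 0
ice = np.empty((X.shape[0], grid.size))
for j, g in enumerate(grid):
    Xg = X.copy()
    Xg[:, 0] = g                              # sweep feature 0 across the grid
    ice[:, j] = f(Xg)
pdp = ice.mean(axis=0)                        # PDP = average of ICE curves
```

The `npoints` default of 100 (1D) or 20 per axis (2D) documented above corresponds to the grid resolution here.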

shap_single_point(X_new: DataFrame | Series, X_ref=None) tuple[DataFrame, Figure, Figure][source]#

SHAP Pair-wise Marginal Explanations

Parameters:
  • X_new (pd.DataFrame | pd.Series) – New data point to explain

  • X_ref (pd.DataFrame | pd.Series, optional) – Reference data point for SHAP values. Default uses optimizer.X_best_f

Returns:

A tuple containing:
  • pd.DataFrame – SHAP values for the new data point
  • Figure – Matplotlib Figure for the SHAP values
  • Figure – Matplotlib Figure for the SHAP summary plot

Return type:

tuple[pd.DataFrame, Figure, Figure]

shap_summary() Figure[source]#

SHAP Summary Plot (Beeswarm)

shap_summary_bar() Figure[source]#

SHAP Summary Plot (Bar Plot / Importance)