shap

Custom plots for SHAP analysis visualization

Functions

one_shap_value(shap_value_new, ...)

Visualize the SHAP values of one data point.

partial_dependence(ind, model, data[, ...])

Calculates and plots the partial dependence of the model's output on a feature or a pair of features.

obsidian.plotting.shap.one_shap_value(shap_value_new: ndarray, expected_value: float, X_names: list[str]) → tuple[Figure, Figure]

Visualize the SHAP values of one data point.

Parameters:
  • shap_value_new (np.ndarray) – The SHAP values of a single data point to be compared to a reference point.

  • expected_value (float) – The expected value at the reference point.

  • X_names (list[str]) – The names of the features.

Returns:

A tuple of two figures: the bar plot of SHAP values for the single data point, and the line plot of cumulative SHAP values for the data point in comparison to the reference point.

Return type:

tuple[Figure, Figure]
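
As a rough illustration of what "cumulative SHAP values" means here, the sketch below starts at the expected value and adds one feature's contribution at a time. It uses NumPy only and does not call obsidian. The input values and feature names are hypothetical, and the feature ordering used by one_shap_value itself is not stated in this reference, so the order shown is an assumption.

```python
import numpy as np

# Hypothetical inputs, for illustration only
shap_value_new = np.array([0.5, -0.25, 1.0])
expected_value = 2.0
X_names = ["temperature", "pressure", "catalyst"]

# Running total: start at the reference value and add
# one feature's contribution at a time
cumulative = expected_value + np.cumsum(shap_value_new)

for name, contribution, total in zip(X_names, shap_value_new, cumulative):
    print(f"{name:>12}: {contribution:+.2f} -> {total:.2f}")
```

The last cumulative value equals the expected value plus the sum of all SHAP values, which is the quantity a SHAP explanation attributes to the data point.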

obsidian.plotting.shap.partial_dependence(ind: int | tuple[int], model: Callable, data: DataFrame, ice_color_var: int | None = None, xmin: str | tuple[float] | float = 'percentile(0)', xmax: str | tuple[float] | float = 'percentile(100)', npoints: int | None = None, hist: bool = False, ylabel: str | None = None, ice: bool = True, ace_opacity: float = 1, pd_opacity: float = 1, pd_linewidth: float = 2, ace_linewidth: str | float = 'auto', ax: Axes | None = None, show: bool = True) → Figure

Calculates and plots the partial dependence of the model’s output on a feature or a pair of features.

This function is adapted from the partial_dependence_plot function in the shap package so that the ICE curves can be colored by a chosen feature, which helps reveal interactions between features. Ref: shap/shap

Parameters:
  • ind (int | tuple) – The index or indices of the feature(s) to calculate the partial dependence for.

  • model (Callable) – The model used for prediction.

  • data (pd.DataFrame) – The input data used for prediction.

  • ice_color_var (int, optional) – The index of the feature used for coloring the ICE lines (for 1D partial dependence plot). Default is None.

  • xmin (str | tuple | float, optional) – The minimum value(s) for the feature(s) range. Default is "percentile(0)".

  • xmax (str | tuple | float, optional) – The maximum value(s) for the feature(s) range. Default is "percentile(100)".

  • npoints (int, optional) – The number of points to sample within the feature(s) range. By default, 100 points are used for a 1D PDP and 20 points for a 2D PDP.

  • hist (bool, optional) – Whether to plot the histogram of the feature(s). Default is False.

  • ylabel (str, optional) – The label for the y-axis. Default is None.

  • ice (bool, optional) – Whether to plot the Individual Conditional Expectation (ICE) lines. Default is True.

  • ace_opacity (float, optional) – The opacity of the ICE lines. Default is 1.

  • pd_opacity (float, optional) – The opacity of the PDP line. Default is 1.

  • pd_linewidth (float, optional) – The linewidth of the PDP line. Default is 2.

  • ace_linewidth (float | str, optional) – The linewidth of the ICE lines. Default is 'auto' for automatic calculation.

  • ax (Axes, optional) – The matplotlib axis to plot on. By default will attach to Figure.gca().

  • show (bool, optional) – Whether to show the plot. Default is True.

Returns:

A tuple containing the matplotlib figure and axis objects if show is False, otherwise None.

Return type:

tuple
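
To make the quantities being plotted concrete, here is a minimal sketch of how ICE curves and a 1D partial dependence curve can be computed. It is not obsidian's implementation: the helper ice_and_pd and the toy model are hypothetical, and it takes a NumPy array rather than the pd.DataFrame the function above expects. The grid runs from the 0th to the 100th percentile of the feature, mirroring the default xmin and xmax strings.

```python
import numpy as np

def ice_and_pd(model, X, ind, npoints=100):
    """Return (grid, ice, pd_curve) for feature column `ind` of array X."""
    # Grid from the 0th to the 100th percentile of the feature (min to max)
    lo, hi = np.percentile(X[:, ind], [0, 100])
    grid = np.linspace(lo, hi, npoints)

    # ICE: for every sample, sweep the chosen feature over the grid
    # while holding that sample's other features fixed
    ice = np.empty((X.shape[0], npoints))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, ind] = value
        ice[:, j] = model(X_mod)

    # Partial dependence: the mean of the ICE curves at each grid value
    pd_curve = ice.mean(axis=0)
    return grid, ice, pd_curve

# Hypothetical toy model with an interaction between features 0 and 1
def toy_model(X):
    return X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
grid, ice, pd_curve = ice_and_pd(toy_model, X, ind=0, npoints=5)
```

In this toy model the slope of each ICE curve equals that sample's value of feature 1, so the curves fan out around the averaged PDP line. Coloring the ICE curves by feature 1 would make that interaction visible, which is the kind of check ice_color_var is described as supporting.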