shap
Custom plots for SHAP analysis visualization
Functions
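| one_shap_value(shap_value_new, expected_value, X_names) | Visualize the SHAP values of one data point. |
| partial_dependence(ind, model, data[, ...]) | Calculates and plots the partial dependence of a feature or a pair of features on the model's output. |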
- obsidian.plotting.shap.one_shap_value(shap_value_new: ndarray, expected_value: float, X_names: list[str]) → tuple[Figure, Figure]
Visualize the SHAP values of one data point.
- Parameters:
shap_value_new (np.ndarray) – The SHAP values of a single data point to be compared to a reference point.
expected_value (float) – The expected value at the reference point.
X_names (list[str]) – The names of the features.
- Returns:
A tuple of two figures. Figure: The bar plot of SHAP values for the single data point. Figure: The line plot of cumulative SHAP values for the data point in comparison to the reference point.
- Return type:
tuple[Figure, Figure]
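The second figure is described as a line plot of cumulative SHAP values relative to the reference point. A minimal sketch of that quantity, assuming "cumulative" means a running sum that starts at expected_value and adds one feature's SHAP value at a time (the helper cumulative_shap, the feature names, and the numbers are hypothetical, and this is not the library's own implementation):

```python
from itertools import accumulate


def cumulative_shap(shap_values, expected_value):
    """Running total that starts at the reference (expected) value.

    Hypothetical helper for illustration; not part of obsidian.
    """
    return list(accumulate(shap_values, initial=expected_value))


# Hypothetical inputs shaped like the documented parameters.
X_names = ["temperature", "pressure", "catalyst"]
shap_value_new = [0.5, -0.25, 1.0]
expected_value = 2.0

path = cumulative_shap(shap_value_new, expected_value)

# Print each step of the path from the reference value to the final total.
for name, value in zip(["(reference)"] + X_names, path):
    print(f"{name:>12}: {value:.2f}")
```

Because SHAP values are additive, the last entry of the path equals the expected value plus the sum of all SHAP values for the data point.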
- obsidian.plotting.shap.partial_dependence(ind: int | tuple[int], model: Callable, data: DataFrame, ice_color_var: int | None = None, xmin: str | tuple[float] | float = 'percentile(0)', xmax: str | tuple[float] | float = 'percentile(100)', npoints: int | None = None, hist: bool = False, ylabel: str | None = None, ice: bool = True, ace_opacity: float = 1, pd_opacity: float = 1, pd_linewidth: float = 2, ace_linewidth: str | float = 'auto', ax: Axes | None = None, show: bool = True) → Figure
Calculates and plots the partial dependence of a feature or a pair of features on the model’s output.
This function is adapted from the partial_dependence_plot function in the shap package. The change is that the ICE curves can be colored by a chosen feature, which helps when checking for interactions between features. Ref: shap/shap
- Parameters:
ind (int | tuple) – The index or indices of the feature(s) to calculate the partial dependence for.
model (Callable) – The model used for prediction.
data (pd.DataFrame) – The input data used for prediction.
ice_color_var (int, optional) – The index of the feature used for coloring the ICE lines (for 1D partial dependence plot). Default is None.
xmin (str | tuple | float, optional) – The minimum value(s) for the feature(s) range. Default is "percentile(0)".
xmax (str | tuple | float, optional) – The maximum value(s) for the feature(s) range. Default is "percentile(100)".
npoints (int, optional) – The number of points to sample within the feature(s) range. By default, will use 100 points for 1D PDP and 20 points for 2D PDP.
hist (bool, optional) – Whether to plot the histogram of the feature(s). Default is False.
ylabel (str, optional) – The label for the y-axis. Default is None.
ice (bool, optional) – Whether to plot the Individual Conditional Expectation (ICE) lines. Default is True.
ace_opacity (float, optional) – The opacity of the ICE lines. Default is 1.
pd_opacity (float, optional) – The opacity of the PDP line. Default is 1.
pd_linewidth (float, optional) – The linewidth of the PDP line. Default is 2.
ace_linewidth (float | str, optional) – The linewidth of the ICE lines. Default is 'auto' for automatic calculation.
ax (Axes, optional) – The matplotlib axis to plot on. By default, will attach to Figure.gca().
show (bool, optional) – Whether to show the plot. Default is True.
- Returns:
A tuple containing the matplotlib figure and axis objects if show is False, otherwise None.
- Return type:
tuple
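To make the plotted quantities concrete, here is a minimal sketch of how ICE curves and a 1D partial dependence curve can be computed by hand. It assumes the standard definitions (one ICE curve per data row, with the PDP as their pointwise average) and an evenly spaced grid from the feature's minimum to its maximum, mirroring the "percentile(0)" and "percentile(100)" defaults. The helpers feature_grid, ice_curves and average_curve, the toy model, and the data are all hypothetical and are not the library's implementation.

```python
def feature_grid(rows, ind, npoints):
    """Evenly spaced values from the feature's minimum to its maximum."""
    values = [row[ind] for row in rows]
    lo, hi = min(values), max(values)
    step = (hi - lo) / (npoints - 1)
    return [lo + i * step for i in range(npoints)]


def ice_curves(model, rows, ind, grid):
    """One curve per row: vary feature `ind` over the grid, hold the rest fixed."""
    curves = []
    for row in rows:
        curve = []
        for g in grid:
            modified = list(row)  # copy so the original row is untouched
            modified[ind] = g
            curve.append(model(modified))
        curves.append(curve)
    return curves


def average_curve(curves):
    """Partial dependence: pointwise mean of the ICE curves."""
    return [sum(column) / len(curves) for column in zip(*curves)]


def toy_model(x):
    """Hypothetical model with an interaction between feature 0 and feature 1."""
    return 2.0 * x[0] + x[0] * x[1]


rows = [[0.0, 1.0], [1.0, 3.0]]  # hypothetical data, two features
grid = feature_grid(rows, ind=0, npoints=3)
curves = ice_curves(toy_model, rows, ind=0, grid=grid)
pd_curve = average_curve(curves)

print("grid:", grid)
print("ICE curves:", curves)
print("partial dependence:", pd_curve)
```

In this toy case the two ICE curves have different slopes because the effect of feature 0 depends on feature 1. Coloring the ICE lines by another feature, as ice_color_var does, is meant to make that kind of interaction visible.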