Package smile.feature.importance
Interface SHAP<T>
- Type Parameters:
T - the data type of model input objects.
- All Known Subinterfaces:
TreeSHAP
- All Known Implementing Classes:
AdaBoost, CART, DecisionTree, GradientTreeBoost, GradientTreeBoost, RandomForest, RandomForest, RegressionTree
public interface SHAP<T>
SHAP (SHapley Additive exPlanations) is a game-theoretic approach to
explaining the output of any machine learning model. It connects optimal
credit allocation with local explanations using the classic Shapley
values from game theory.
SHAP leverages local methods designed to explain a prediction f(x)
based on a single input x. The local methods are defined as any
interpretable approximation of the original model. In particular,
SHAP employs additive feature attribution methods.
SHAP values attribute to each feature the change in the expected
model prediction when conditioning on that feature. They explain
how to get from the base value E[f(z)], which would be predicted
if we did not know any features, to the current output f(x).
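The additive form and the path from the base value to f(x) can be written out explicitly. The following is a sketch in the notation of Lundberg and Lee (2017); the symbols g, z', M, and phi_i come from that paper and do not appear elsewhere on this page.

```latex
% Additive feature attribution: z' \in \{0,1\}^M marks which of the
% M features are present, and \phi_i is the attribution of feature i.
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i

% Local accuracy: with all features present, the attributions lead
% from the base value to the prediction for x.
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i, \qquad \phi_0 = E[f(z)]
```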
In game theory, the Shapley value of a player is that player's expected marginal contribution, averaged over all possible combinations of the other players.
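To make the averaging concrete, here is a minimal brute-force sketch of exact Shapley values for a tiny cooperative game. It is not how this library computes SHAP values; the class `ShapleyDemo`, the method `shapley`, and the payoff function `toyGame` are hypothetical names chosen for illustration. Each coalition is encoded as a bit mask, and the marginal contribution of player i to a coalition S of size s is weighted by s!(n-s-1)!/n!.

```java
import java.util.function.IntToDoubleFunction;

public class ShapleyDemo {
    /** n! as a double; adequate for the tiny n used in this sketch. */
    public static double factorial(int n) {
        double f = 1.0;
        for (int i = 2; i <= n; i++) f *= i;
        return f;
    }

    /**
     * Exact Shapley values for n players by enumerating all coalitions.
     * A coalition is a bit mask (bit i set means player i is present);
     * value.applyAsDouble(mask) is the payoff of that coalition.
     * The cost grows as n * 2^n, so this is only viable for small n.
     */
    public static double[] shapley(int n, IntToDoubleFunction value) {
        double[] phi = new double[n];
        double nFact = factorial(n);
        for (int i = 0; i < n; i++) {
            int bit = 1 << i;
            for (int s = 0; s < (1 << n); s++) {
                if ((s & bit) != 0) continue; // only coalitions without player i
                int size = Integer.bitCount(s);
                double weight = factorial(size) * factorial(n - size - 1) / nFact;
                double marginal = value.applyAsDouble(s | bit) - value.applyAsDouble(s);
                phi[i] += weight * marginal;
            }
        }
        return phi;
    }

    /**
     * Hypothetical toy game: player 0 is worth 1, player 1 is worth 2,
     * the two together earn a bonus of 4, and player 2 adds nothing.
     */
    public static double toyGame(int mask) {
        boolean p0 = (mask & 1) != 0;
        boolean p1 = (mask & 2) != 0;
        double v = 0.0;
        if (p0) v += 1.0;
        if (p1) v += 2.0;
        if (p0 && p1) v += 4.0;
        return v;
    }

    public static void main(String[] args) {
        double[] phi = shapley(3, ShapleyDemo::toyGame);
        for (int i = 0; i < phi.length; i++) {
            System.out.println("player " + i + ": " + phi[i]);
        }
    }
}
```

In this toy game the bonus of 4 is split evenly between players 0 and 1, so their values come out near 3 and 4, player 2 gets 0, and the values sum to the payoff of the full coalition.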
References
- Lundberg, Scott M., and Su-In Lee. A unified approach to interpreting model predictions. NIPS, 2017.
- Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. Consistent individualized feature attribution for tree ensembles.
Method Details
shap
Returns the SHAP values. For regression, the length of the SHAP values is the same as the number of features. For classification, the SHAP values are of length p x k, where p is the number of features and k is the number of classes. The first k elements are the SHAP values of the first feature over the k classes, respectively. The remaining features follow accordingly.
- Parameters:
x - an instance.
- Returns:
the SHAP values.
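The flat p x k layout described above can be indexed as `feature * k + class`. The sketch below shows that arithmetic; `ShapLayout`, `at`, and `toMatrix` are hypothetical helper names, not part of this interface, and the caller is assumed to know k (the number of classes).

```java
public class ShapLayout {
    /**
     * SHAP value of one feature for one class, read from a flat array
     * of length p * k in which each feature owns k consecutive entries.
     */
    public static double at(double[] shap, int k, int feature, int cls) {
        return shap[feature * k + cls];
    }

    /** Copies the flat array into a p-by-k matrix (rows are features). */
    public static double[][] toMatrix(double[] shap, int k) {
        if (k <= 0 || shap.length % k != 0) {
            throw new IllegalArgumentException("length must be a multiple of k");
        }
        int p = shap.length / k;
        double[][] m = new double[p][k];
        for (int i = 0; i < p; i++) {
            System.arraycopy(shap, i * k, m[i], 0, k);
        }
        return m;
    }
}
```

For a regression model the array already has one entry per feature, which corresponds to k = 1 in this helper.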
shap
Returns the average of absolute SHAP values over a data set.
- Parameters:
data - the data set.
- Returns:
the average of absolute SHAP values over a data set.
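The averaging described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's implementation: `MeanAbsShap` and `meanAbs` are hypothetical names, and the `explainer` function stands in for a per-instance call that returns SHAP values of a fixed length.

```java
import java.util.List;
import java.util.function.Function;

public class MeanAbsShap {
    /**
     * Element-wise mean of |SHAP values| over a data set.
     * The explainer is assumed to return an array of the same length
     * for every instance.
     */
    public static <T> double[] meanAbs(List<T> data, Function<T, double[]> explainer) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("empty data set");
        }
        double[] sum = null;
        for (T x : data) {
            double[] phi = explainer.apply(x);
            if (sum == null) sum = new double[phi.length];
            for (int i = 0; i < phi.length; i++) {
                sum[i] += Math.abs(phi[i]); // sign is discarded before averaging
            }
        }
        for (int i = 0; i < sum.length; i++) {
            sum[i] /= data.size();
        }
        return sum;
    }
}
```

Because the absolute value is taken first, positive and negative contributions do not cancel, which makes the result usable as a global importance score per feature (or per feature and class in the classification layout).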