[XAI 5] - Model Interpretability Using SHAP
SHapley Additive exPlanations (SHAP) is another popular explainability technique in ML and, in certain scenarios, is more effective than LIME.
I. An intuitive understanding of the SHAP and Shapley values
1. Introduction to SHAP and Shapley values
The SHAP framework was introduced by Scott Lundberg and Su-In Lee in their research work, A Unified Approach to Interpreting Model Predictions, published in 2017. SHAP is based on the concept of Shapley values from cooperative game theory, but unlike the LIME framework, it considers additive feature importance. By definition, the Shapley value of a feature is its mean marginal contribution across all possible coalitions of features. The mathematical treatment of Shapley values is involved and might confuse most readers. That said, if you are interested in getting an in-depth mathematical understanding of Shapley values, we recommend that you take a look at the research paper “A Value for n-Person Games,” Contributions to the Theory of Games 2.28 (1953), by Lloyd S. Shapley. In the next section, we will gain an intuitive understanding of Shapley values with a very simple example.
2. What are Shapley values?
In this section, I will explain Shapley values using a very simple and easy-to-understand example. Let’s suppose that Alice, Bob, and Charlie are three friends who are taking part, as a team, in a Kaggle competition to solve a given problem with ML, for a certain cash prize. Their collective goal is to win the competition and get the prize money. The three of them are not equally good in all areas of ML and, therefore, have contributed in different ways. Now, if they win the competition and earn their prize money, how will they ensure a fair distribution of the prize money considering their individual contributions? How will they measure their individual contributions toward the same goal? The answer to these questions can be given by Shapley values, which were introduced in 1951 by Lloyd Shapley.
The following diagram gives us a visual illustration of the scenario:
So, in this scenario, Alice, Bob, and Charlie are part of the same team, playing the same game (which is the Kaggle competition). In game theory, this is referred to as a Coalition Game. The prize money for the competition is their payout. So, Shapley values tell us the average contribution of each player to the payout, ensuring a fair distribution. But why not just distribute the prize money equally between all the players? Well, since the contributions are not equal, it is not fair to distribute the money equally.
a. Deciding the payouts
Now, how do we decide the fairest way to distribute the payout? One way is to assume that Alice, Bob, and Charlie joined the game in a sequence in which Alice started first, followed by Bob, and then by Charlie. Let’s suppose that if Alice, Bob, and Charlie had participated alone, they would have gained 10 points, 20 points, and 25 points, respectively. But if only Alice and Bob teamed up, they might have received 40 points, while Alice and Charlie together could get 30 points, and Bob and Charlie together could get 50 points. Only when all three of them collaborate do they get 90 points, which is sufficient for them to win the competition.
The figure below illustrates the point values for each condition. We will make use of these values to calculate the average marginal contribution of each player:
Mathematically, if we assume that $N$ is the set of all players, $S$ is a coalition (a subset of the players), and $v(S)$ is the total value created by coalition $S$, then by the Shapley value formula, the marginal contribution of player $i$ is given as follows:
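$$\varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,\Bigl(v\bigl(S \cup \{i\}\bigr) - v(S)\Bigr)$$

Here, the sum runs over all coalitions $S$ that do not contain player $i$, and the factorial term weights each coalition by the number of orderings in which it can occur.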
The Shapley value equation might look complicated, but let’s simplify it with our example. Please note that the order in which each player joins the game is important to consider, as Shapley values account for the order of the players when calculating the marginal contributions.
Now, for our example, Alice’s contribution can be found by calculating the difference she makes to the final score, that is, the difference between the points scored when Alice is in the game and when she is not. Also, when Alice is playing, she can either play alone or team up with others. The value that Alice creates on her own can be represented as $v(A)$; likewise, $v(B)$ and $v(C)$ denote the individual values created by Bob and Charlie. Now, when Alice and Bob team up, we can isolate Alice’s contribution by removing Bob’s contribution from the overall contribution, which can be represented as $v(A, B) - v(B)$. And if all three are playing together, Alice’s contribution is given as $v(A, B, C) - v(B, C)$.
Considering all possible permutations of the sequence in which Alice, Bob, and Charlie can join the game, the marginal contribution of Alice is the average of her individual contributions across all possible scenarios. This is illustrated in the figure below:
So, the overall contribution of Alice is her average marginal contribution across all possible scenarios, which is exactly her Shapley value. For Alice, the Shapley value is 20.83. Similarly, we can calculate the marginal contributions of Bob and Charlie, as shown in the figure below:
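To make the arithmetic above concrete, here is a minimal, self-contained sketch that enumerates every order in which the three players could join the game and averages their marginal contributions, using the hypothetical point values from our example:

```python
from itertools import permutations

# Coalition values (points) from the example above
v = {
    frozenset(): 0,
    frozenset("A"): 10, frozenset("B"): 20, frozenset("C"): 25,
    frozenset("AB"): 40, frozenset("AC"): 30, frozenset("BC"): 50,
    frozenset("ABC"): 90,
}

def shapley_values(players, v):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for player in order:
            totals[player] += v[coalition | {player}] - v[coalition]  # marginal contribution
            coalition = coalition | {player}
    return {p: round(total / len(orders), 2) for p, total in totals.items()}

print(shapley_values("ABC", v))
# {'A': 20.83, 'B': 35.83, 'C': 33.33}
```

Note that the three Shapley values sum (up to rounding) to the 90 points scored by the full team; this is the efficiency property that we will discuss shortly.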
I hope this wasn’t too difficult to understand! One thing to note is that the sum of marginal contributions of Alice, Bob, and Charlie should be equal to the total contribution made by all three of them together. Now, let’s try to understand Shapley values in the context of ML.
3. Shapley values in ML
In order to understand the importance of Shapley values in ML for explaining model predictions, let’s modify the example about Alice, Bob, and Charlie that we used for understanding Shapley values. We can consider Alice, Bob, and Charlie to be three different features present in a dataset used for training a model. So, in this case, the player contributions become the contributions of each feature. The game, or the Kaggle competition, becomes the black-box ML model, and the payout becomes the prediction. So, if we want to know the contribution of each feature toward the model prediction, we use Shapley values.
Therefore, Shapley values help us to understand the collective contribution of each feature toward the outcome predicted by black-box ML models. By using Shapley values, we can explain the working of black-box models by estimating the feature contributions.
Properties of Shapley values
Now that we have an intuitive understanding of Shapley values and we have learned how to calculate Shapley values, we should also gain an understanding of the properties of Shapley values:
Efficiency : The sum of the Shapley values, or marginal contributions, of all the features should be equal to the value of the total coalition.
Symmetry : If two players contribute equally to every possible coalition, they should receive the same Shapley value; the payout does not depend on the order in which the players join the game.
Dummy : If a particular feature does not change the predicted value regardless of the coalition group, then the Shapley value for the feature is 0.
Additivity : For any game with a combined payout, the Shapley values are also combined. That is, if $(v + w)(S) = v(S) + w(S)$, then $\varphi(v + w) = \varphi(v) + \varphi(w)$. For example, for the random forest algorithm in ML, the Shapley value of a particular feature can be calculated for each individual tree and then averaged to obtain the additive Shapley value for the entire random forest.
So, these are the important properties of Shapley values. Next, let’s discuss the SHAP framework and understand how it is much more than just the usage of Shapley values.
4. The SHAP framework
Previously, we discussed what Shapley values are and how they are used in ML. Now, let’s cover the SHAP framework. Although SHAP is popularly used as an XAI tool for providing local explainability to individual predictions, SHAP can also provide a global explanation by aggregating the individual predictions. Additionally, SHAP is model-agnostic, which means that it does not make any assumptions about the algorithm used in black-box models. The creators of the framework broadly came up with two model-agnostic approximation methods, which are as follows:
- SHAP Explainer : This is based on Shapley sampling values.
- KernelSHAP Explainer : This is based on the LIME approach.
The framework also includes model-specific explainability methods such as the following:
- Linear SHAP : This is for linear models with independent features.
- Tree SHAP : This is an algorithm that is faster than the model-agnostic explainers for computing SHAP values for tree algorithms and tree-based ensemble learning algorithms.
- Deep SHAP : This is an algorithm that is faster than the model-agnostic explainers for computing SHAP values for deep learning models.
Apart from these approaches, SHAP also uses interesting visualization methods to explain AI models. We will cover these methods in more detail in the next section. One point to note is that the exact calculation of Shapley values is computationally very expensive; the algorithm is of the order of $O(2^n)$, where n is the number of features. So, if the dataset has many features, calculating exact Shapley values might take forever! However, the SHAP framework uses approximation techniques to calculate Shapley values efficiently. The explanations provided by SHAP are also more robust than those of the LIME framework. Let’s proceed to the next section, where we will discuss the various model explainability approaches used by SHAP on various types of data.
II. Model explainability approaches using SHAP
After reading the previous section, you have gained an understanding of SHAP and Shapley values. In this section, we will discuss various model-explainability approaches using SHAP. Data visualization is an important method to explain the working of complex algorithms. SHAP makes use of various interesting data visualization techniques to represent the approximated Shapley values to explain black-box models. So, let’s discuss some of the popular visualization methods used by the SHAP framework.
1. Visualizations in SHAP
As mentioned previously, SHAP can be used for both the global interpretability of the model and the local interpretability of the inference data instance. Now, the values generated by the SHAP algorithm are quite difficult to understand unless we make use of intuitive visualizations. The choice of visualization depends on the choice of global interpretability or local interpretability, which we will cover in this section.
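For the snippets in the remainder of this section, assume a simple, hypothetical setup along the following lines: a tree-based regression model trained on the UCI red wine quality data (the same data behind the pH and alcohol example later in this section), with SHAP values computed on a held-out set. The file name, dataset, and model are illustrative only; the shap API calls are real:

```python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical data: the UCI red wine quality CSV downloaded locally
data = pd.read_csv("winequality-red.csv", sep=";")
X, y = data.drop(columns="quality"), data["quality"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

# The generic Explainer auto-selects a suitable algorithm (Tree SHAP for this model)
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)  # an Explanation object reused by the plots below
```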
a. Global interpretability with feature importance bar plots
Analyzing the most influential features present in a dataset always helps us to understand the functioning of an algorithm with respect to the underlying data. SHAP provides an effective way to find feature importance using Shapley values. The feature importance bar plot displays the important features in descending order of their importance. Additionally, SHAP provides a unique way to show feature interactions using hierarchical clustering. These feature clustering methods help us to visualize groups of features that collectively impact the model’s outcome. This is very interesting, since one of the core benefits of using Shapley values is analyzing the additive influence of multiple features together. However, there is one drawback of the feature importance plot for global interpretability: since this method only considers mean absolute Shapley values to estimate feature importance, it does not show whether a feature’s impact on the model outcome is positive or negative.
The following diagram illustrates the visualizations for a feature importance plot and a feature clustering plot using SHAP:
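With the hypothetical setup shown earlier, a feature importance bar plot and a clustering-aware variant could be sketched as follows (shap.utils.hclust is the library's helper for hierarchical feature clustering):

```python
# Global feature importance (mean absolute SHAP value per feature)
shap.plots.bar(shap_values)

# Bar plot with hierarchical feature clustering overlaid
clustering = shap.utils.hclust(X_train, y_train)
shap.plots.bar(shap_values, clustering=clustering)
```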
Next, let’s explore SHAP Cohort plots.
b. Global interpretability with the Cohort plot
Sometimes, analyzing subgroups of data is an important part of data analysis. SHAP provides a very interesting way of grouping the data into certain defined cohorts to analyze feature importance. I found this to be a unique option in SHAP, which can be really helpful! This is an extension of the existing feature importance visualization, and it highlights feature importance for each of the cohorts for a better comparison.
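As a rough sketch with the same hypothetical Explanation object, cohorts can be created directly from the SHAP values; here the data is automatically split into two cohorts and their mean absolute feature importances are compared:

```python
# Split the explanations into two auto-detected cohorts and compare feature importance
shap.plots.bar(shap_values.cohorts(2).abs.mean(0))
```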
c. Global interpretability with heatmap plots
To understand the overall impact of all features on the model at a more granular level, heatmap visualizations are extremely useful. The SHAP heatmap visualization shows how every feature value can positively or negatively impact the outcome. Additionally, the plot also includes a line plot to show how the model prediction varies with the positive or negative impact of feature values. However, for non-technical users, this visualization can be really challenging to interpret. This is one of the drawbacks of this visualization method.
The figure below illustrates a SHAP heatmap visualization:
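Producing such a heatmap for our hypothetical Explanation object is a one-liner:

```python
shap.plots.heatmap(shap_values)  # instances along the x-axis, features along the y-axis
```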
Another popular choice of visualization for global interpretability using SHAP is summary plots. Let’s discuss summary plots in the next section.
d. Global interpretability with summary plots
A summary plot is another visualization method in SHAP for providing global explainability of black-box models. It is a good replacement for the feature importance plot, which not only includes the important features but also the range of effects of these features present in the dataset. The color bar indicates the impact of the features. The features that influence the model’s outcome in a positive way are highlighted in a particular color, whereas the features that impact the model’s outcome negatively are represented in another contrasting color. The horizontal violin plot for each feature shows the distribution of the Shapley values of the features for each data instance.
The following screenshot illustrates a SHAP summary plot:
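In the shap package, this summary view is produced by the beeswarm plot (the older shap.summary_plot API draws the same kind of visualization); a sketch with our hypothetical Explanation object:

```python
shap.plots.beeswarm(shap_values)  # one dot per instance per feature
```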
In the next section, we will discuss SHAP dependence plots.
e. Global interpretability with dependence plots
In certain scenarios, it is important to analyze interactions between features and how these interactions influence the model outcome. SHAP feature dependence plots show how the model outcome varies with the value of a specific feature. This plot can help to pick up interesting interaction patterns or trends between feature values. The feature used for the color map is automatically picked by the algorithm, based on its interaction with the selected feature.
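As a sketch, a dependence (scatter) plot for the pH feature of our hypothetical wine-quality model could be produced as follows; passing color=shap_values lets the library pick the strongest interacting feature for the colormap:

```python
shap.plots.scatter(shap_values[:, "pH"], color=shap_values)
```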
In this example, the selected feature is pH, and the feature picked for the colormap is alcohol. The plot tells us that as pH increases, the alcohol value also tends to increase. In the next section, let’s explore the SHAP visualization methods used for local explainability.
f. Local interpretability with bar plots
So far, we have covered various visualization techniques offered by SHAP for providing a global overview of the model. However, similar to LIME, SHAP is also a model-agnostic framework designed to provide local explainability, and it offers certain visualization methods that can be applied to inference data for this purpose. Local feature importance using SHAP bar plots is one such local explainability method. This plot can help us analyze the positive and negative impacts of the features present in the data. The features that impact the model’s outcome positively are highlighted in one color (pinkish-red by default), and the features having a negative impact on the model outcome are represented using another color (blue by default). Also, as we have discussed before, if the Shapley value is zero for any feature, this indicates that the feature does not influence the model outcome at all. Additionally, the bar plot is centered at zero to show the contributions of the features present in the data.
The following diagram shows a SHAP feature importance bar plot for local interpretability:
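For a single inference instance (here, the first row of the hypothetical test set), the local bar plot is simply:

```python
shap.plots.bar(shap_values[0])  # positive and negative contributions for one prediction
```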
Next, let’s cover another SHAP visualization that is used for local interpretability.
g. Local interpretability with waterfall plots
Bar charts are not the only visualization provided by SHAP for local interpretability. The same information can be displayed using a waterfall plot, which might look more attractive. Perhaps the only difference is that waterfall plots are not centered at zero, whereas bar plots are. Otherwise, we get the same feature importance based on Shapley values and the positive or negative impact of the specific features on the model outcome.
The figure below illustrates a SHAP waterfall plot for local interpretability:
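The equivalent waterfall plot for the same hypothetical instance:

```python
shap.plots.waterfall(shap_values[0])  # walks from the base value E[f(X)] to f(x)
```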
Next, we will discuss force plot visualization in SHAP.
h. Local interpretability with force plot
We can also use force plots instead of waterfall or bar plots to explain local inference data. With force plots, we can see the model prediction, which is denoted by f(x). The base value in the following diagram represents the average predicted outcome of the model, that is, the value used when the features present in the local data instance are not considered. So, using the force plot, we can also see how far the predicted outcome is from the base value. Additionally, we can see the feature impacts: the visual highlights the features that push the model prediction higher, along with the features that have a negative influence and push the prediction lower.
The figure below illustrates a sample force plot visualization in SHAP:
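A force plot for the same hypothetical instance can be drawn as follows; in a notebook, shap.initjs() enables the interactive JavaScript rendering, and a matplotlib=True argument can be passed for a static image:

```python
shap.initjs()                     # enable JS visualizations in a notebook
shap.plots.force(shap_values[0])  # or: shap.plots.force(shap_values[0], matplotlib=True)
```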
Although force plots might visually look very interesting, we would recommend using bar plots or waterfall plots if the dataset contains many features affecting the model outcome in a positive or a negative way.
i. Local interpretability with decision plots
The easiest way to explain something is by comparing it with a reference value. So far, in bar plots, waterfall plots, and even force plots, we have not seen any reference values for the underlying features used. However, in order to find out whether the feature values are positively or negatively influencing the model outcome, the algorithm is effectively comparing the feature values of the inference data with the mean feature values seen during training. This is the reference value that is not displayed in the three local explainability visualizations we have covered. SHAP decision plots, however, help us to compare the feature values of the local data instance against this reference. Additionally, decision plots show the model prediction and the deviation of each feature from the reference value, along with the direction of that deviation. If the direction of deviation is toward the right, the feature is positively influencing the model outcome; if it is toward the left, the feature is negatively influencing the model outcome. Different colors are used to highlight positive or negative influences, and if there is no deviation, the feature is not influencing the model outcome at all.
The following diagram illustrates the use of decision plots to compare two different data instances for providing local explainability:
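Decision plots use the older array-based API of the library; a sketch comparing two hypothetical test instances from our earlier setup might look like this:

```python
# Decision plot comparing two instances (array-based API)
tree_explainer = shap.TreeExplainer(model)
sv = tree_explainer.shap_values(X_test.iloc[:2])
shap.decision_plot(tree_explainer.expected_value, sv, X_test.iloc[:2])
```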
So far, you have seen the variety of visualization methods provided in SHAP for the global and local explainability of ML models. Now, let’s discuss the various types of explainers in SHAP.
2. Explainers in SHAP
In the previous section, we looked at how the data visualization techniques that are available in SHAP can be used to provide explainability. But the choice of the visualization method might also depend on the choice of the explainer algorithm. As we discussed earlier, SHAP provides both model-specific and model-agnostic explainability. But the framework has multiple explainer algorithms that can be applied with different models and with different types of datasets. In this section, we will cover the various explainer algorithms provided in SHAP.
a. TreeExplainer
TreeExplainer is a fast implementation of the Tree SHAP algorithm for computing Shapley values for trees and tree-based ensemble learning algorithms. The algorithm supports several possible assumptions about the dependence between the features present in the dataset. Only tree-based algorithms are supported, such as random forest, XGBoost, LightGBM, and CatBoost. The algorithm relies on fast C++ implementations, either in the locally compiled C extension or inside an external model package, and it is much faster than conventional Shapley value-based explainers. Generally, it is used for tree-based models trained on structured data for both classification and regression problems.
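A minimal TreeExplainer sketch, assuming the same kind of tabular training data as before and a hypothetical XGBoost model:

```python
import xgboost

xgb_model = xgboost.XGBRegressor().fit(X_train, y_train)
explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer(X_test)  # fast Tree SHAP values for the test set
```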
b. DeepExplainer
Similar to LIME, SHAP can also be applied to deep learning models trained on unstructured data, such as images and text. SHAP uses DeepExplainer, which is based on the Deep SHAP algorithm, for explaining deep learning models. Deep SHAP approximates SHAP values for deep learning models and is a modified version of the DeepLIFT algorithm. The developers of the framework have mentioned that the implementation of Deep SHAP differs slightly from the original DeepLIFT algorithm: it uses a distribution of background samples rather than a single reference value, and it uses Shapley equations to linearize computations such as products, division, max, softmax, and more. The framework mostly supports deep learning libraries such as TensorFlow, Keras, and PyTorch.
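A rough DeepExplainer sketch for a hypothetical Keras image classifier (model, x_train, and x_test are assumed to already exist as a trained network and NumPy arrays of preprocessed images):

```python
import numpy as np

# A sample of training images serves as the background distribution
background = x_train[np.random.choice(len(x_train), 100, replace=False)]
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(x_test[:5])  # attributions for five test images
```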
c. GradientExplainer
DeepExplainer is not the only explainer in SHAP that can be used with deep learning models. GradientExplainer can also work with deep learning models. The algorithm explains models using the concept of expected gradients . The expected gradient is an extension of Integrated Gradients, SHAP, and SmoothGrad, which combines the ideas of these algorithms into a single expected value equation. Consequently, similar to DeepExplainer, the entire dataset can be used as the background distribution sample instead of a single reference sample. This allows the model to be approximated with a linear function between individual samples of the data and the current input data instance that is to be explained. Since the input features are assumed to be independent, the expected gradients will calculate the approximate SHAP values.
For model explainability, the feature values with higher SHAP values are highlighted, as these features have a positive contribution toward the model’s outcome. For unstructured data such as images, pixel positions that have the maximum contribution toward the model prediction are highlighted. Usually, GradientExplainer is slower than DeepExplainer as it makes different approximation assumptions.
The following diagram shows a sample GradientExplainer visualization for the local explainability of a classification model trained on images:
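A comparable GradientExplainer sketch for the same hypothetical image classifier; shap.image_plot overlays the pixel attributions on the input images:

```python
explainer = shap.GradientExplainer(model, x_train[:100])  # background sample
shap_values = explainer.shap_values(x_test[:3])
shap.image_plot(shap_values, x_test[:3])  # highlights pixels that raise or lower the prediction
```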
Next, let’s discuss SHAP KernelExplainers.
d. KernelExplainer
KernelExplainer in SHAP uses the Kernel SHAP method to provide model-agnostic explainability. To estimate the SHAP values for any model, the Kernel SHAP algorithm fits a specially weighted local linear regression model to compute feature importance. The major difference between Kernel SHAP and LIME is the approach adopted to assign weights to the instances in this regression. In LIME, the weights are assigned based on how close the perturbed data instances are to the original instance, whereas in Kernel SHAP, the weights are assigned based on the Shapley kernel weight of the coalition of features used. In simple words, LIME assigns weights based on isolated features, whereas SHAP considers the combined effect of the features when assigning the weights. KernelExplainer is slower than the model-specific algorithms as it does not make any assumptions about the model type.
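A model-agnostic KernelExplainer sketch for our hypothetical tabular model; summarizing the background data (for example with shap.kmeans) is the usual way to keep it tractable:

```python
background = shap.kmeans(X_train, 50)  # compress the background data to 50 centroids
kernel_explainer = shap.KernelExplainer(model.predict, background)
kernel_shap_values = kernel_explainer.shap_values(X_test.iloc[:10])  # noticeably slower
```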
e. LinearExplainer
SHAP LinearExplainer is designed for computing SHAP values for linear models and can account for inter-feature correlations. LinearExplainer also supports estimating the feature covariance matrix for coalition feature importance. However, finding feature correlations for a high-dimensional dataset can be computationally expensive, so LinearExplainer uses sampling to estimate a transformation, which is then used for explaining any outcome of a linear model. This keeps it fast and efficient.
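A minimal LinearExplainer sketch for a linear model fitted on the same hypothetical tabular data:

```python
from sklearn.linear_model import LinearRegression

linear_model = LinearRegression().fit(X_train, y_train)
linear_explainer = shap.LinearExplainer(linear_model, X_train)  # independent features assumed by default
linear_shap_values = linear_explainer.shap_values(X_test)
```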
So far, we have discussed the theoretical aspects of the various explainers in SHAP. For more information about them, I do recommend checking out the official SHAP documentation. In the next chapter, we will cover the practical implementation of these explainers using the code tutorials on GitHub, in which we will apply them to models trained on different types of datasets. In the next section, we will walk through a short practical example of using SHAP to explain a regression model, to give you a glimpse of how to apply SHAP for model explainability.
III. Using SHAP to explain regression models
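As a condensed, hypothetical sketch (again using the UCI red wine quality data and a random forest regressor, both purely illustrative), explaining a regression model with SHAP takes only a few lines:

```python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical regression data: the UCI red wine quality CSV downloaded locally
data = pd.read_csv("winequality-red.csv", sep=";")
X, y = data.drop(columns="quality"), data["quality"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestRegressor(random_state=42).fit(X_train, y_train)

explainer = shap.Explainer(model, X_train)  # auto-selects Tree SHAP for this model
shap_values = explainer(X_test)

shap.plots.beeswarm(shap_values)      # global explanation of the regression model
shap.plots.waterfall(shap_values[0])  # local explanation of one predicted quality score
```

The same pattern, fit a model, build an explainer, compute an Explanation object, and then plot, carries over to the other explainers discussed earlier.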
IV. Advantages and limitations of SHAP
In the previous section, we discussed the practical application of SHAP for explaining a regression model with just a few lines of code. However, since SHAP is not the only explainability framework, we should be aware of the specific advantages and disadvantages of SHAP, too.
1. Advantages
The following is a list of some of the advantages of SHAP:
Local explainability : Since SHAP provides local explainability to inference data, it enables users to analyze key factors that are positively or negatively affecting the model’s decision-making process. This also makes it useful for production-level ML systems.
Global explainability : Global explainability provided in SHAP helps to extract key information about the model and the training data, especially from the collective feature importance plots. I think SHAP is better than LIME for getting a global perspective on the model. SP-LIME in LIME is good for getting an example-driven global perspective of the model, but SHAP provides a more generalized global understanding of trained models.
Model-agnostic and model-specific : SHAP can be model-agnostic and model-specific. So, it can work with black-box models and also work with complex deep learning models to provide explainability.
Theoretical robustness : The concept of using Shapley values for model explainability, which is based on the principles of coalition game theory, captures feature interactions very well. Also, the properties of SHAP regarding efficiency, symmetry, dummy, and additivity are formulated on a robust theoretical foundation. Unlike SHAP, LIME is not based on a solid theory, as it assumes that ML models behave linearly around local data points, and there is not much theoretical evidence that proves this assumption holds in all cases. That is why I would say SHAP is based on ideas that are theoretically more robust than LIME.
These advantages make SHAP one of the most popular choices of XAI framework. Unfortunately, applying SHAP can be really challenging for high-dimensional datasets, and it does not provide actionable explanations. Let’s look at some of the limitations of SHAP.
2. Limitations
Here is a list of some of the limitations of SHAP:
SHAP is not the preferred choice for high-dimensional data : Computing Shapley values on high-dimensional data can be computationally very challenging, as the time complexity of the algorithm is $O(2^n)$, where n is the total number of features in the dataset.
Shapley values are ineffective for selective explanation : Shapley values try to consider all the features for providing explainability. The explanations can be incorrect for sparse explanations, in which only selected features are considered. But usually, human-friendly explanations consider selective features. So, I would say that LIME is better than SHAP when you seek a selective explanation. However, more recent versions of the SHAP framework do include the same ideas as LIME and can be almost equally effective for sparse explanations.
SHAP cannot be used for prescriptive insights : SHAP computes the Shapley values for each feature and does not build a prediction model like LIME. So, it cannot be used for analyzing any what-if scenario or for providing any counterfactual example for suggesting actionable insights.
KernelSHAP can be slow : Although KernelSHAP is model-agnostic, it can be very slow and, thus, might not be suitable for production-level ML systems for models trained on high-dimensional data.
Not extremely human-friendly : Apart from analyzing feature importance through feature interactions, SHAP visualizations can be complicated to interpret for any non-technical user. Often, non-technical users prefer simple selective actionable insights, recommendations, or justifications from ML models. Unfortunately, SHAP requires another layer of abstraction for human-friendly explanations when used in production systems.
As we can see from the points discussed in this section, SHAP might not be the most ideal framework for explainability, and there is a lot of room for improvement to make it more human-friendly. However, it is indeed an important and very useful framework for explaining black-box algorithms, especially for technical users. This brings us to the end of this part.