A Beginner’s Brief Guide to Machine Learning Explainability

Shreyan M Mitra
8 min read · Aug 29, 2023


When I first started studying machine learning explainability, I found it incredibly difficult to find a single resource I could use without wading through complicated jargon or sitting through hours' worth of data science courses. It would have been so helpful to have a simple guide to basic machine learning explainability concepts, how they work, and most importantly, how to apply them.

So today, I'm going to try to flatten that steep learning curve and offer the simple guide I wish I had to one of the hottest topics in artificial intelligence today.

What is Machine Learning Explainability?

This section deals with a conceptual understanding of machine learning explainability. If you would like to jump into the code, go to the next section.

Let's start with what machine learning is. According to the Oxford dictionary, machine learning is, in short:

“the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data.”

Before machine learning, programmers would identify a problem, devise an algorithm to solve that problem, and then run a program that implemented that algorithm. Machine learning’s power is in the ability of the program to deduce the algorithm without programmer input.

I like to think of machine learning models as mystery functions.

Consider the following function that squares its input:

f(x) = x^2

There is complete transparency into this function. We know that for every input x, the output of f will be the square of x.

Machine learning models, however, are opaque. Using our function analogy, that means that the function definition is not known. As an example, consider the following function:

g(x) = ?

Because the internal mechanisms of a machine learning model are similarly unknown, some researchers refer to it as a “black box.” We only know the output for a given input, but not how the output is being generated.

Imagine that we know:

g(1) = 1

g(2) = 2

g(3) = 5

g(4) = 18

We have no idea what g(x) is in terms of x.

The purpose of explainability is to remove the cover around machine learning models and shed light on why machine learning models behave the way they do — that is, to demystify the mystery function.

Now, it should be noted that most machine learning models deal with more than one variable. So, instead of g(x), we can have g(x1, x2, … xn), where g is a function of n different variables, called “features”, and the output of g is called the “target”.

Rather than recovering the exact rule a machine learning model follows, many explainers instead detect the features that affect the target the most and assign them importance scores. There are many ways to assign feature importance scores; one of the most common families is the removal-based class of explanations, sketched briefly below, though I won't go into the formal details here.
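To give a flavor of the removal-based idea: retrain the model with one feature removed at a time and see how much the score drops. The snippet below is a minimal illustration using scikit-learn with an arbitrary dataset and model, not how any particular explainer is actually implemented.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
import pandas as pd

# Toy data: 5 features and a target
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
X = pd.DataFrame(X, columns=["x1", "x2", "x3", "x4", "x5"])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline score with every feature available
baseline = DecisionTreeRegressor(random_state=0).fit(X_train, y_train).score(X_test, y_test)

# Removal-based importance: drop one feature, retrain, and measure the drop in score
for feature in X.columns:
    model = DecisionTreeRegressor(random_state=0).fit(X_train.drop(columns=feature), y_train)
    score = model.score(X_test.drop(columns=feature), y_test)
    print(feature, "importance ~", round(baseline - score, 3))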

The utility of machine learning explainability is tremendous. Consider the following scenario.

Scientists at a hospital use a machine learning model to help diagnose diabetes in patients. The model takes into account factors like age, gender, BMI, blood pressure, and cholesterol. The model assigns a diagnosis to each patient: diabetic or non-diabetic. However, there are some cases in which the model fails and falsely diagnoses someone as being diabetic. The scientists wonder what causes the model’s unexpected behavior in those cases.

An explanatory system reveals that the most important factor causing the false positives is age. The scientists therefore remove age from the model’s consideration.

The above is an example of machine learning and machine learning explainability in action.

Explaining Machine Learning Models

That’s it for the theory. Now, for those of you eager to start programming, let’s get started.

Machine learning is closely related to data science. Without data, there is no machine learning. Machine learning models need data to learn from. This data will likely include sample input/feature values and the corresponding output/target value. The machine learning model will “learn” from this training dataset so that it can subsequently predict the target given only the features.
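As a tiny concrete example of that learn-then-predict loop (the numbers here are made up purely for illustration):

from sklearn.linear_model import LinearRegression

# Training data: feature values and the corresponding target values
features = [[1], [2], [3], [4]]
targets = [2, 4, 6, 8]

# The model "learns" the relationship from the training data...
model = LinearRegression().fit(features, targets)

# ...and can then predict the target for feature values it has never seen
print(model.predict([[5]]))  # approximately 10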

For this tutorial, you will need to download the Python XAISuite package. You can do this by typing

pip install XAISuite

on the command line.

Open a .py file and import the XAISuite library.

from xaisuite import *

Note that it is generally bad practice to do from _ import *, since wildcard imports clutter the namespace and make it unclear where names come from. Here, we are importing everything because we'll need it :)

I like to divide the machine learning process into four steps:

  1. Data Loading
  2. Data Processing
  3. Model Training
  4. Explanation Generation

Let’s go through them one by one.

  1. Data Loading

First, we need to find our data and specify the feature and target variables. In the following example, I'm asking XAISuite to create a random regression dataset with 5 features (don't worry if you don't know what regression is). But you can also import data from your local filesystem, a numpy array, or a pandas DataFrame.

from sklearn.datasets import make_regression  # make_regression comes from scikit-learn; import it if the wildcard import above doesn't already provide it

load_data = DataLoader(make_regression)

Now that we’ve loaded the data, let’s see how it looks

load_data.content.head()

Above, we see the first several instances of the dataset. The variable names are the column headers: 0, 1, 2, 3, 4, and “target”. We’ll be using the values of 0, 1, 2, 3, and 4 to predict the value of “target”. Because the variable names can be a bit confusing, we are going to rename them.

load_data = DataLoader(make_regression, variable_names = ["x1", "x2", "x3", "x4", "x5", "x6"], n_features = 5)
load_data.content.head()

You should see something like this. (My values are different from last time because I regenerated the data.)

For the sake of simplicity, I’m not going to introduce any categorical variables here.

Now, you might have noticed that the “target” column is gone because we renamed the variables. It is up to us what variable we want to choose as our target. I’m arbitrarily choosing x5 as our target variable. We need to specify this choice to the DataLoader. Our final code is therefore:

load_data = DataLoader(make_regression, variable_names = ["x1", "x2", "x3", "x4", "x5", "x6"], target_names = "x5", n_features = 5)

You can now visualize our data with the seaborn library:

import seaborn as sns
import matplotlib.pyplot as plt
sns.kdeplot(load_data.content)
plt.xlim(-5, 5)
plt.show()

2. Data Processing

For machine learning models to perform optimally, data scientists often use “transforms” to make data more machine learning-friendly. Examples include sklearn.preprocessing’s StandardScaler, which rescales each feature to zero mean and unit variance, and OneHotEncoder, which converts categorical labels into binary indicator columns of 0s and 1s.
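To see what those two transforms actually do, here is a small standalone example (the column names and values are made up; this is separate from the XAISuite workflow):

import pandas as pd
from sklearn.preprocessing import StandardScaler, OneHotEncoder

data = pd.DataFrame({"age": [25, 40, 60], "city": ["Paris", "Tokyo", "Paris"]})

# StandardScaler rescales a numeric column to zero mean and unit variance
print(StandardScaler().fit_transform(data[["age"]]))

# OneHotEncoder turns each category into its own 0/1 indicator column
print(OneHotEncoder().fit_transform(data[["city"]]).toarray())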

The XAISuite package applies sensible processing automatically by default, so we don’t need to worry about that (although, if you wanted to, you could plug custom processing functions into XAISuite).

One thing we need to keep in mind, however, is the train-test-split. Data scientists tend to divide machine learning datasets into two parts: the training part, which the model uses to learn, and the testing part which helps evaluate the model’s performance by checking if model predictions are in line with the correct values. By default, the train_test_split in XAISuite is 0.2, that is, 80% of the data will be used for training and 20% for testing.
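In plain scikit-learn, that split looks something like this (a sketch of the general idea rather than XAISuite’s exact internals):

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

# Hold back 20% of the rows for testing; the model never sees them during training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print(X_train.shape, X_test.shape)  # (80, 5) (20, 5)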

Let’s get XAISuite to process our data!

process_data = DataProcessor(load_data)

3. Model Training

Now, the fun part! We get to train our model. There are so many models to choose from. I’m going to go with a DecisionTreeRegressor (you don’t need to know how it works for this example).

from sklearn.tree import DecisionTreeRegressor
train_model = ModelTrainer(DecisionTreeRegressor, process_data, explainers = [])

The output is

Model score is 0.7790724402568139

Not bad, but not great either. Generally we want the model score (for regressors, typically the R² value) to be between 0.9 and 1. We can try to improve it by passing tuning parameters to the model:

from sklearn.tree import DecisionTreeRegressor
train_model = ModelTrainer(DecisionTreeRegressor, process_data, explainers = [], ccp_alpha = 0.5)

Now, the output is:

Model score is 0.815467624921841

Machine learning optimization merits a whole other article, so I am not going to delve into that here.
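That said, if you want to experiment on your own, one common approach is a grid search over candidate parameter values with plain scikit-learn. The sketch below tries a few ccp_alpha values on a stand-in dataset (the grid and data are arbitrary, just to show the pattern):

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Try several ccp_alpha values and keep the one with the best cross-validated score
search = GridSearchCV(DecisionTreeRegressor(random_state=0), param_grid={"ccp_alpha": [0.0, 0.1, 0.5, 1.0]}, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)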

Note that you don’t need to import sklearn.tree.DecisionTreeRegressor. Just passing in the string “DecisionTreeRegressor” should suffice in XAISuite version 2.0+.

4. Explanation generation:

Notice that, in the previous step, we left the explainers list blank. Now we can add some explainers to gain some insight into our DecisionTreeRegressor model.

Several explanation algorithms exist. Two of the most well known are LIME (Local Interpretable Model-Agnostic Explanations) and SHAP (based on Shapley values in game theory).
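For context, this is roughly what calling SHAP directly on a scikit-learn model looks like (a minimal reference sketch of the underlying library; XAISuite runs the explainers for us, so you won't need to write this yourself):

import shap
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = DecisionTreeRegressor(random_state=0).fit(X, y)

# TreeExplainer computes Shapley values for tree-based models
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
print(shap_values[0])  # per-feature contributions for the first instance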

Let’s see what LIME and SHAP give us!

from sklearn.tree import DecisionTreeRegressor
train_model = ModelTrainer(DecisionTreeRegressor, process_data, explainers = {"lime": {"feature_selection": "none"}, "shap": {}}, ccp_alpha = 0.5)

Here, we're passing in feature_selection = "none" to the lime explainer so that we get the importance of all features in the dataset.

Now, to retrieve the explanations, we can do something like:

explanations = train_model.getExplanationsFor([])

to get all the explanations or

explanations = train_model.getExplanationsFor([0])

to get the explanations only for the first instance.

To view the explanations, we simply do

explanations["lime"].ipython_plot(index)

and

explanations["shap"].ipython_plot(index)

where index is the number of the instance for which explanations are desired. So, for example,

explanations["lime"].ipython_plot(10)

creates the following visual:

This image shows us that x4 is the most important feature for Instance 10, and that it reduces the value of the model's prediction.

Bonus: We can also compare explanations with XAISuite to see where different explainers disagree on the model’s internal mechanisms. Check this out:

insights = InsightGenerator(explanations)
similarity = insights.calculateExplainerSimilarity("lime", "shap")
print(similarity)

The output is 0.715. An explainer similarity of 1 indicates that the two explainers are identical and 0 indicates no similarity, so this is actually pretty good.
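If you're curious how such a similarity number could be computed, one simple option is the cosine similarity between the two explainers' importance vectors. The sketch below uses made-up importance scores and is not necessarily the exact metric XAISuite uses.

import numpy as np

# Hypothetical importance scores for the same five features from two explainers
lime_importances = np.array([0.10, 0.45, 0.05, 0.30, 0.10])
shap_importances = np.array([0.15, 0.40, 0.10, 0.25, 0.10])

# Cosine similarity: 1 means the importance patterns match exactly, 0 means no similarity
similarity = np.dot(lime_importances, shap_importances) / (np.linalg.norm(lime_importances) * np.linalg.norm(shap_importances))
print(similarity)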

Conclusion

In this post, I have endeavored to cram an entire introductory course on machine learning explainability into a few pages. If something is unclear or incorrect, please let me know in the comments. Otherwise, I wish you good luck on your machine learning journey. I hope you now have all the information you need to dive right into harnessing machine learning's incredible power.
