Responsible AI: Interpret-Text with the Introspective Rationale Explainer

May 19, 2022 | Artificial Intelligence

In the previous post you got an overview of interpretability and of the different explainers available in the Interpret-Text tool.
In this post you will learn how to use one of those explainers: the Introspective Rationale Explainer.

To extract a concise text fragment containing the features that matter most when training a classification model, the Introspective Rationale Explainer uses a generator-predictor framework. The tool predicts the labels and sorts the words into two groups: those that are useful (rationales) and those that should not be used for training (anti-rationales).
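
The split into rationales and anti-rationales can be pictured with a toy sketch. The code below is not the library's implementation: the toy_generator function and its hand-written word scores are hypothetical stand-ins for the learned generator, used only to show how a selection mask divides a sentence into the two groups.

```python
# Toy illustration of the rationale / anti-rationale split.
# All names and scores are hypothetical; the real generator is a trained model.

def toy_generator(tokens, word_scores, threshold=0.5):
    """Return a 0/1 mask: 1 marks a token selected as a rationale."""
    return [1 if word_scores.get(tok.lower(), 0.0) >= threshold else 0 for tok in tokens]

def split_by_mask(tokens, mask):
    """Split tokens into rationales (mask == 1) and anti-rationales (mask == 0)."""
    rationales = [tok for tok, m in zip(tokens, mask) if m == 1]
    anti_rationales = [tok for tok, m in zip(tokens, mask) if m == 0]
    return rationales, anti_rationales

# Hand-written importance scores, standing in for what a generator would learn.
word_scores = {"beautiful": 0.9, "good": 0.8, "bad": 0.7}

tokens = "Beautiful movie ; really good , the popcorn was bad".split()
mask = toy_generator(tokens, word_scores)
rationales, anti_rationales = split_by_mask(tokens, mask)

print("rationales:     ", rationales)
print("anti-rationales:", anti_rationales)
```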

The API is designed to be modular and extensible. It can be used when a Bidirectional Encoder Representations from Transformers (BERT) model or a Recurrent Neural Network (RNN) model needs to be explained. A developer who wants to explain a custom model must provide the pre-processor, the predictor, and the generator modules.

By the end of this post you will know how to use the explainer, how the model works, and what the resulting measures mean.

Setting up the environment

To follow the tutorial, install Anaconda with Python 3.7 and open Anaconda Prompt. This can be done locally on your computer or on a virtual machine. Clone the repository and move into the folder that it creates.

git clone
cd interpret-text

The next step is to prepare the environment to include all the necessary packages.

For CPU:

python tools/
conda env create -n interpret_cpu --file=interpret_cpu.yaml
conda activate interpret_cpu

For GPU:

python tools/ --gpu
conda env create -n interpret_gpu --file=interpret_gpu.yaml
conda activate interpret_gpu

The Python package needs to be installed as well, together with a widget that is required for the dashboard.

cd python
pip install -e .
jupyter nbextension install interpret_text.experimental.widget --py --sys-prefix
jupyter nbextension enable interpret_text.experimental.widget --py --sys-prefix
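
Before moving on, it can help to confirm that the package is visible to the active environment. The check below is a small sketch that uses only the Python standard library. The module name interpret_text is taken from the widget commands above, and the is_installed helper is a hypothetical name introduced for illustration.

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if the top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None

print("Python version:", sys.version.split()[0])
print("interpret_text installed:", is_installed("interpret_text"))
```

If the second line prints False, the most likely cause is that the conda environment created earlier is not the active one.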

Finally, install Jupyter Notebook, where the Python code will run, and start it in your favorite browser.

pip install notebook
jupyter notebook

The environment now should look something like this:


Create a new notebook with Python 3 by using the New button in the top right corner of the screen. You should see an environment like the one in the picture below.

The following code snippet defines the notebook environment, the configuration, and the working directory. Do not forget the matplotlib line, otherwise some graphics might not render correctly. To decrease the runtime, set QUICK_RUN to True. Note that doing so affects the performance of the model: it skips over the embedding and most of the evaluation. The Introspective Rationale Explainer supports RNN, BERT, or a combination of the two. This configuration specifies that BERT is not going to be used.

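%matplotlib inline
import sys
import os
# assumption: the notebook sits two levels below the repository root,
# so that the "notebooks" package can be imported
sys.path.append(os.path.join("..", ".."))
from notebooks.test_utils.utils_data_shared import load_glove_embeddings
# assumed value: True shortens the runtime, set to False for a full run
QUICK_RUN = True
# assumed value: "RNN" means that BERT is not used
MODEL_TYPE = "RNN"
# training procedure parameters
load_pretrained_model = False
pretrained_model_path = "../models/rnn.pth"
MODEL_SAVE_DIR = os.path.join("..", "models")
model_prefix = "sst2rnpmodel"
CUDA = False
model_config = {
    "cuda": CUDA,
    "model_save_dir": MODEL_SAVE_DIR,
    "model_prefix": model_prefix,
    "lr": 2e-4
}
model_config["save_best_model"] = False
model_config["pretrain_cls"] = True
model_config["num_epochs"] = 1
DATA_FOLDER = "../../../data/sst2"
if not QUICK_RUN:
    # load the GloVe embeddings (skipped in a quick run)
    model_config["embedding_path"] = load_glove_embeddings(DATA_FOLDER)
else:
    model_config["embedding_path"] = os.path.join(DATA_FOLDER, "")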

The next step is to import the dataset for training, using a predefined function that extracts the data and builds a pandas DataFrame that is ready to use.

from notebooks.test_utils.utils_sst2 import load_sst2_pandas_df
import numpy as np
import pandas as pd
train_data = load_sst2_pandas_df('train')
test_data = load_sst2_pandas_df('test')
all_data = pd.concat([train_data, test_data])

Some variables are defined here that will be used for training and testing, including the number of labels.

batch_size = 50
train_data = train_data.head(batch_size)
test_data = test_data.head(batch_size)
X_train = train_data["sentences"]
X_test = test_data["sentences"]
# get all unique labels
y_labels = all_data["labels"].unique()
model_config["labels"] = np.array(sorted(y_labels))
model_config["num_labels"] = len(y_labels)

Here is a fragment of the dataset, showing the labels (sentiment) and the sentences.


It is a good idea to have a balanced dataset, with roughly the same number of examples for each label, before training. Otherwise the training result will also be skewed.
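
One way to check the balance is to count the examples per label. The sketch below uses a small, made-up list of labels so that it runs on its own. With a pandas DataFrame such as train_data, train_data["labels"].value_counts() gives the same kind of count. The label_shares helper is a hypothetical name introduced here for illustration.

```python
from collections import Counter

def label_shares(labels):
    """Return the share of examples for each label, as a dict of label -> fraction."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Made-up labels for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
shares = label_shares(labels)
print(shares)
```

Shares that are far apart suggest resampling or collecting more examples for the under-represented label before training.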


The data turns out to be roughly balanced, so let's prepare it for training. The GloVe (Global Vectors for Word Representation) pre-processor tokenizes and embeds the data. The labels are then appended to the output of the tokenizer.

from interpret_text.experimental.common.preprocessor.glove_preprocessor import GlovePreprocessor
from interpret_text.experimental.common.preprocessor.bert_preprocessor import BertPreprocessor
# data processing parameters
token_count_thresh = 1
max_sentence_token_count = 70
# choose the pre-processor that matches the model type
if MODEL_TYPE == "BERT":
    preprocessor = BertPreprocessor()
else:
    preprocessor = GlovePreprocessor(token_count_thresh, max_sentence_token_count)
    # assumption: the vocabulary has to be built before pre-processing
    preprocessor.build_vocab(all_data["sentences"])

# append the labels to the tokenizer output
df_train = pd.concat([train_data["labels"], preprocessor.preprocess(X_train)], axis=1)
df_test = pd.concat([test_data["labels"], preprocessor.preprocess(X_test)], axis=1)
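
To make the two data processing parameters more concrete, here is a simplified sketch of what a tokenizing pre-processor might do with them. It is not the GlovePreprocessor implementation. It assumes that token_count_thresh is the minimum number of occurrences a word needs to enter the vocabulary, and that max_sentence_token_count is the length every sentence is padded or truncated to. The helper names and the <PAD> and <UNK> tokens are hypothetical.

```python
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"

def toy_build_vocab(sentences, token_count_thresh):
    """Map every word seen at least token_count_thresh times to an integer id."""
    counts = Counter(tok for s in sentences for tok in s.lower().split())
    vocab = {PAD: 0, UNK: 1}
    for word, count in sorted(counts.items()):
        if count >= token_count_thresh:
            vocab[word] = len(vocab)
    return vocab

def toy_encode(sentence, vocab, max_sentence_token_count):
    """Turn a sentence into a fixed-length list of ids plus a mask of real tokens."""
    tokens = sentence.lower().split()[:max_sentence_token_count]
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens]
    mask = [1] * len(ids)
    padding = max_sentence_token_count - len(ids)
    return ids + [vocab[PAD]] * padding, mask + [0] * padding

sentences = ["a good movie", "a bad movie", "good popcorn"]
vocab = toy_build_vocab(sentences, token_count_thresh=2)
ids, mask = toy_encode("a good film", vocab, max_sentence_token_count=5)

print(vocab)
print(ids, mask)
```

In this sketch the unseen word "film" maps to the unknown token, and the mask marks which positions hold real tokens rather than padding.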

As a next step, the explainer is initialized and the pre-processor is set up. The model configuration is also passed to the explainer, and then the training can start. The aim of this classifier is to identify the sentiment of each sentence.

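from interpret_text.experimental.introspective_rationale import IntrospectiveRationaleExplainer
explainer = IntrospectiveRationaleExplainer(classifier_type=MODEL_TYPE, cuda=CUDA)
explainer.build_model_config(model_config)  # pass the model configuration
explainer.set_preprocessor(preprocessor)  # set up the pre-processor
explainer.load()
explainer.fit(df_train, df_test)  # train the model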

As a result of this code, the details of the model can be reviewed: the output lists the configuration of each layer of the classification model.

The model is now ready for testing. Running the following code scores the model, which yields the sparsity, the accuracy, and the anti-accuracy.

explainer.score(df_test)  # evaluate the model on the test set
accuracy = explainer.model.avg_accuracy
print("Test accuracy: ", accuracy, "%")
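
Of the three measures, sparsity is probably the least familiar. The sketch below assumes that sparsity means the fraction of tokens that are selected as rationales. That reading is an assumption about the library's measure rather than a statement of its implementation, and the function name and masks are made up for illustration.

```python
def sparsity(masks):
    """Fraction of tokens selected as rationales across all sentences.

    masks: list of 0/1 lists, one per sentence (1 = token kept as a rationale).
    """
    selected = sum(sum(mask) for mask in masks)
    total = sum(len(mask) for mask in masks)
    return selected / total if total else 0.0

# Made-up selection masks for two sentences.
masks = [[1, 0, 0, 1], [0, 1, 0, 0, 0, 0]]
print("sparsity:", sparsity(masks))
```

Under this reading, a lower value means that the explanation relies on fewer words.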

To understand accuracy, it helps to know how the confusion matrix works, since accuracy can be calculated from the values in the matrix. Accuracy is the share of correctly predicted observations out of all observations. It is just as important to understand what happens behind the scenes: how the model decides, based on the individual words, which label to assign to a sentence.
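
For a binary classifier, the relationship between the confusion matrix and accuracy can be written out in a few lines. The counts below are made up for illustration, and the function name is hypothetical.

```python
def accuracy_from_confusion(tp, tn, fp, fn):
    """Accuracy = correctly predicted observations / all observations."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

# Made-up counts: true positives, true negatives, false positives, false negatives.
tp, tn, fp, fn = 20, 18, 7, 5
acc = accuracy_from_confusion(tp, tn, fp, fn)
print(f"accuracy: {acc:.2%}")
```

Here 38 of the 50 observations sit on the diagonal of the matrix (true positives and true negatives), which gives an accuracy of 76%.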


It is now clear how the model is built and how its layers are configured. Let’s look at the data and see which features have an effect on the prediction.

The explanation dashboard is an interactive widget that shows whether a specific word has an effect on the prediction (positive features) or not (negative features). Let’s get an explanation of the following sentence and build a dashboard from the result by running the following code.

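from interpret_text.experimental.widget import ExplanationDashboard
s1 = "Beautiful movie ; really good , the popcorn was bad"
# explain the prediction for a single sentence
local_explanation = explainer.explain_local(s1)
# render the interactive dashboard
ExplanationDashboard(local_explanation)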

You should now see the dashboard. The slider at the top of the dashboard sets the number of important features shown. On the right side, you can see the label given to the provided document. Below that, you can choose to see all the features or only the positive or the negative ones. Hovering over the bars in the graph shows the importance values.
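
The controls described above amount to two simple operations on a list of word importances: filtering the features and keeping the most important ones. The sketch below is not the widget's code. It assumes that positive and negative features correspond to the sign of the importance value. The importance values are made up and the function name is hypothetical.

```python
def top_features(words, importances, k, sign="all"):
    """Return the k (word, importance) pairs with the largest absolute importance.

    sign: "all", "positive" (importance > 0) or "negative" (importance < 0).
    """
    pairs = list(zip(words, importances))
    if sign == "positive":
        pairs = [p for p in pairs if p[1] > 0]
    elif sign == "negative":
        pairs = [p for p in pairs if p[1] < 0]
    elif sign != "all":
        raise ValueError("sign must be 'all', 'positive' or 'negative'")
    return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)[:k]

words = ["Beautiful", "movie", "really", "good", "popcorn", "bad"]
importances = [0.42, 0.05, 0.10, 0.31, -0.02, -0.27]  # made-up values

print(top_features(words, importances, k=3))
print(top_features(words, importances, k=2, sign="negative"))
```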


In this post, you pre-processed the data, trained a model that predicts the sentiment of a sentence, and explained how the pipeline works behind the scenes.

You now have a good overview of how the model uses the data it is given, which makes it easier to improve the pipeline. In this post we learned about the explainer, built a training pipeline and, after evaluation, generated a dashboard to understand which words have an impact on the predicted label.

Continue the experimentation with other explainers:


GitHub: Interpret-Text
