Interpreting deep neural networks with sample sets
Abstract
Despite their impressive performance on a wide range of tasks, deep neural networks
(DNNs) are generally considered 'black box' models due to the lack of transparency
behind their decision-making processes. Researchers address this issue through the use of
interpretability techniques which, in the context of this study, use some set of rules to
map the output of the network back onto its inputs.
In recent works, sample set analysis has been proposed as a novel methodology for better
studying the generalisation capabilities of DNNs by analysing the natural sample
clusters formed by the network itself. By directly identifying the nodes that
process the largest number of class samples, this methodology offers potential
as a means for improving DNN interpretations.
In this exploratory study, we investigate the applicability of sample set analysis as a tool
for DNN interpretability. We do this by analysing the inner workings of networks
trained on the MNIST data set, using sample set analysis in conjunction with the
Layer-wise Relevance Propagation (LRP) interpretability technique, while verifying the
results on a custom-generated synthetic data set.
Our analysis led to the introduction of encoding sample sets, an additional sample set
category that groups class samples according to their binary node activation patterns in
a given layer. Through encoding sample sets, we further introduce the concepts of core
and variation nodes, which refer to the nodes that activate for all encoding sample sets
within a layer or for only a subset of them, respectively.
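To make these definitions concrete, the sketch below groups the samples of a class by their binary activation pattern in a single layer and separates core from variation nodes. It is a minimal illustration under stated assumptions: layer activations are assumed to be available as a NumPy array, and a node is assumed to count as active when its activation exceeds zero; the thesis may use a different activation criterion, and the function and parameter names are placeholders.

```python
import numpy as np

def encoding_sample_sets(activations, threshold=0.0):
    """Group one class's samples by their binary activation pattern in a layer.

    activations : (n_samples, n_nodes) array of a single layer's activations
                  for the samples of one class (assumed layout).
    Returns the encoding sample sets (pattern -> sample indices), the core
    node indices, and the variation node indices.
    """
    # Binarise: a node is treated as active when its activation exceeds the threshold.
    binary = (activations > threshold).astype(int)

    # Encoding sample sets: samples that share the same binary pattern form one set.
    sets = {}
    for idx, pattern in enumerate(map(tuple, binary)):
        sets.setdefault(pattern, []).append(idx)

    patterns = np.array(list(sets.keys()))
    # Core nodes activate in every encoding sample set of the layer;
    # variation nodes activate in some encoding sample sets but not all.
    core_nodes = np.where(patterns.all(axis=0))[0]
    variation_nodes = np.where(patterns.any(axis=0) & ~patterns.all(axis=0))[0]
    return sets, core_nodes, variation_nodes
```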
When used in conjunction with LRP, encoding sample sets are capable of generating interpretations
which represent groups of samples rather than representing them individually.
We term this approach set interpretations and found that it provides interpretations
highly similar to their individual counterparts while simplifying the interpretation process.
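One way to realise such a set interpretation is sketched below: per-sample LRP relevance maps for the members of one encoding sample set are aggregated into a single map. The abstract does not specify how the aggregation is performed, so the averaging step and the `lrp_relevance` callable are assumptions used purely for illustration, not the thesis's actual procedure.

```python
import numpy as np

def set_interpretation(model, samples, member_indices, lrp_relevance):
    """Aggregate per-sample LRP relevance maps into one set interpretation.

    samples        : array of inputs (e.g. MNIST images)
    member_indices : indices of the samples in one encoding sample set
    lrp_relevance  : placeholder callable (model, x) -> relevance map of x's shape,
                     standing in for whichever LRP implementation is used
    """
    relevances = [lrp_relevance(model, samples[i]) for i in member_indices]
    # Average relevance over the set so the result represents the group
    # rather than any single sample (aggregation choice assumed here).
    return np.mean(relevances, axis=0)
```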