Applications for machine learning and deep learning have become increasingly accessible. For example, Keras provides a high-level API on top of a TensorFlow backend that enables users to build neural networks without being fluent in TensorFlow. Despite the ease of building and testing models, deep learning has suffered from a lack of interpretability; deep learning models are considered black boxes by many users. In a talk at ODSC West in 2018, Pramit Choudhary explained the importance of model evaluation and interpretability in deep learning and presented some cutting-edge techniques for addressing it.
Predictive accuracy is not the only concern regarding a model’s performance. In many cases, it is critical for data scientists to understand why a model makes the predictions it does. Applications include describing model decisions to business executives, identifying blind spots to resist adversarial attacks, complying with data protection regulations, and providing justification for customer classification. Pramit Choudhary explained that there are two levels of interpretation: global and local. Global interpretation is understanding the conditional interaction of features and target variables with respect to the entire dataset, while local interpretation involves understanding the same relationship for a single data point.
Deep neural networks like the convolutional neural network Inception-v4 are highly parameterized, with many layers and various functions, so they are difficult to interpret. While individual parameter values in linear regression can offer suitable interpretability of model behavior, deep neural networks have far too many parameters for a similar approach. Despite the complexity, Pramit Choudhary presented a number of techniques to interrogate convolutional neural networks.
Pramit explained that one method for global visualization of neural networks is to feed images through the network for feature extraction, reduce the dimensionality of the features, and plot them in two-dimensional space. Dimensionality reduction can be achieved with principal component analysis (PCA), which finds linear relationships among the many features, but a preferred technique is t-distributed stochastic neighbor embedding (t-SNE), which offers non-linear dimensionality reduction. The figure below shows how t-SNE can help visualize model predictions relative to each other using the MNIST dataset of handwritten numbers.
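As a rough sketch of that pipeline, the snippet below uses a pretrained Keras application model (VGG16 here, chosen as an assumption) as the feature extractor and scikit-learn's t-SNE for the embedding; the `images` and `labels` arrays are placeholders standing in for a real dataset.

```python
# Sketch: extract features with a pretrained CNN, embed with t-SNE, and plot.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.applications import VGG16
from sklearn.manifold import TSNE

# Pretrained network without its classification head acts as a feature extractor.
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

# Placeholder data: in practice these would be preprocessed images and their labels.
images = np.random.rand(200, 224, 224, 3).astype("float32")
labels = np.random.randint(0, 10, size=200)

# Forward pass to extract high-dimensional features, then reduce them to 2-D.
features = extractor.predict(images, batch_size=32)
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

# Scatter the 2-D embedding, colored by class, to inspect how examples cluster.
plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10", s=8)
plt.colorbar(label="class")
plt.show()
```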
The downside to visualizing predictions in PCA or t-SNE space is that one has to infer how the model is deciphering various shapes. A more direct way is to visualize the activation layers during a forward pass, enabling one to see the kinds of features each layer identifies. Early layers pick up pixel-scale patterns, while layers closer to the terminal end identify more spatial information, as is evident below.
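One way this could look in Keras is to build a second model that exposes the intermediate convolutional outputs; the sketch below assumes a trained functional `model` and a single preprocessed `image`, and the choice of which layers and feature maps to plot is purely illustrative.

```python
# Sketch: visualize convolutional activations during a forward pass.
import matplotlib.pyplot as plt
from tensorflow.keras import Model

# Build a model that returns the output of every convolutional layer.
conv_outputs = [layer.output for layer in model.layers if "conv" in layer.name]
activation_model = Model(inputs=model.input, outputs=conv_outputs)

# One forward pass yields a list of activation tensors, one per conv layer.
activations = activation_model.predict(image[None, ...])  # add batch dimension

# Plot the first 8 feature maps of an early and a late layer for comparison.
for name, act in [("early", activations[0]), ("late", activations[-1])]:
    fig, axes = plt.subplots(1, 8, figsize=(16, 2))
    for i, ax in enumerate(axes):
        ax.imshow(act[0, :, :, i], cmap="viridis")
        ax.axis("off")
    fig.suptitle(f"{name} layer activations")
plt.show()
```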
Additionally, we may be interested in which pixels in a given image are associated with a certain prediction. Occlusion is a perturbation-based inference algorithm that occludes rolling segments of the image and measures the difference between the original and new output predictions. In the example below, red indicates pixels that are important to the positive prediction (the predicted class), while blue indicates pixels key to a negative prediction (any other class). The background is key to the predicted class in this example (not a great sign if the model is trying to predict animal type).
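A minimal sketch of an occlusion pass is below, assuming a trained Keras `model`, a preprocessed `image` of shape (H, W, 3), and a `target_class` index; the gray patch value, patch size, and stride are illustrative choices.

```python
# Sketch: slide an occluding patch over the image and measure the score drop.
import numpy as np

def occlusion_map(model, image, target_class, patch=16, stride=8):
    h, w, _ = image.shape
    baseline = model.predict(image[None, ...])[0, target_class]
    heatmap = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))

    # Occlude each region with a gray patch and record the change in the
    # target-class score; a large drop means the region was important.
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch, :] = 0.5
            score = model.predict(occluded[None, ...])[0, target_class]
            heatmap[i, j] = baseline - score
    return heatmap
```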
In a similar fashion, saliency maps show pixel importance by perturbing pixel values slightly and measuring the gradient of the output with respect to the input. Pixels that exhibit a large gradient are key to the prediction. To learn how to implement saliency maps in Python, check out this blog post. One key drawback of both saliency maps and occlusion algorithms is that they are computationally expensive, so they are more useful for local interpretation.
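As a rough sketch, the gradient computation for a vanilla saliency map might look like the following in TensorFlow 2, assuming a trained Keras `model` and a preprocessed `image`; using the top predicted class's score is an assumption made here for simplicity.

```python
# Sketch: gradient of the top class score with respect to the input pixels.
import tensorflow as tf

def saliency_map(model, image):
    image = tf.convert_to_tensor(image[None, ...])  # add batch dimension
    with tf.GradientTape() as tape:
        tape.watch(image)
        predictions = model(image)
        score = tf.reduce_max(predictions[0])  # score of the top predicted class

    # Gradient of the class score w.r.t. the input; take the max over color
    # channels so each pixel gets a single importance value.
    grads = tape.gradient(score, image)
    return tf.reduce_max(tf.abs(grads), axis=-1)[0].numpy()
```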
Pramit Choudhary introduced a few additional techniques for interpreting convolutional neural networks built on the same concept of measuring the change in output given changes to the input. Using the original model, one can evaluate how the model makes decisions on a local scale and refine one's approach to building an effective architecture. In his summary, Pramit stated that interpretation of deep learning models is highly context-dependent; not all techniques will be relevant for every domain. He concluded that more research is necessary to evaluate different aspects of interpretability and to scale interpretation algorithms efficiently.