Educational Resource

Explainability & Interpretability
in Medical Imaging AI

Understanding how AI systems make decisions is essential for anyone working with AI in healthcare — whether as a researcher, clinician, regulator, or student.

Introduction

As artificial intelligence becomes increasingly integrated into medical imaging workflows, two terms appear constantly in the conversation: explainability and interpretability. While they are often used interchangeably, they describe fundamentally different approaches to understanding how AI systems make decisions.

Interpretability

Interpretability means you can directly understand how a model works internally. The model's structure is simple enough that a human can trace the path from input to output without any extra tools.

Interpretable Models

Linear Regression

Each feature has a visible weight; the output is a weighted sum

Decision Trees

A series of if/then branching rules you can follow

Rule-Based Scoring

Explicit point systems like CHA₂DS₂-VASc stroke risk score

Trade-off: Interpretable models are transparent, but they often cannot capture the complex patterns in medical images that deep learning can. They work well for structured clinical data but struggle with raw image analysis.
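The rule-based scoring mentioned above is what full interpretability looks like in practice: every point the model awards can be traced to a single explicit rule. A minimal sketch of the CHA₂DS₂-VASc stroke-risk score (the function and argument names are illustrative, not from any clinical software library):

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes,
                 prior_stroke_tia, vascular_disease):
    """Transparent rule-based score: each criterion adds explicit points."""
    score = 0
    score += 1 if chf else 0                              # C: congestive heart failure
    score += 1 if hypertension else 0                     # H: hypertension
    score += 2 if age >= 75 else (1 if age >= 65 else 0)  # A2 / A: age bands
    score += 1 if diabetes else 0                         # D: diabetes
    score += 2 if prior_stroke_tia else 0                 # S2: prior stroke/TIA
    score += 1 if vascular_disease else 0                 # V: vascular disease
    score += 1 if female else 0                           # Sc: sex category
    return score

# Every contribution to the output is auditable by reading the rules:
print(cha2ds2_vasc(age=76, female=True, chf=False, hypertension=True,
                   diabetes=False, prior_stroke_tia=True,
                   vascular_disease=False))  # → 6
```

A clinician can verify each rule against the published score definition, which is exactly the property deep image models lack.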

Explainability

Explainability means you can provide a human-understandable justification for why a model made a specific prediction — even when the model itself is too complex to understand directly. These techniques are applied after the model has already made its prediction (called "post-hoc" methods).

The Black Box Problem

Medical Image → Deep Learning ("Black Box") → Prediction → XAI Tools → Post-hoc Explanation

This is essential for deep learning models like convolutional neural networks (CNNs) and transformers, which are widely used in medical imaging. These models can process thousands of image features simultaneously and often outperform simpler models — but their internal decision-making is opaque.

Comparison at a Glance

What is it?
Interpretability: The model itself is transparent — you can understand its internal mechanics directly.
Explainability: Post-hoc tools that justify a complex model's output in human terms.

When?
Interpretability: Built into the model from the start.
Explainability: Applied after the model makes a prediction.

Which models?
Interpretability: Simple models: linear regression, decision trees, rule-based systems.
Explainability: Complex models: CNNs, deep neural networks, transformers.

What it shows
Interpretability: How the decision was made — the exact path from input to output.
Explainability: Why this decision was made — which features or regions influenced it.

Trade-off
Interpretability: High transparency, but often lower accuracy on complex tasks.
Explainability: Enables use of powerful models, but explanations are approximations.

Clinical analogy
Interpretability: "Your risk score is 7 because of factors A, B, and C."
Explainability: "The AI focused on this region of the scan when making its diagnosis."

Key Point: Interpretability and explainability are complementary, not competing. For medical imaging, deep learning models deliver the best diagnostic performance but operate as black boxes. Explainability methods bridge the gap between that performance and the clinical trust required to use them safely.

Explainability Methods (Post-Hoc Tools)

The methods below are all explainability techniques applied to complex models after a prediction has been made. None of them make the model itself interpretable — they provide external explanations of the model's behaviour.
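One of the simplest post-hoc techniques is occlusion sensitivity: slide a blanking patch across the image and record how much the model's output drops, revealing which regions the prediction depends on. A minimal sketch using a toy stand-in for the model (the 8×8 "scan", the "lesion" region, and the scoring rule are all invented for illustration):

```python
import numpy as np

def toy_model(img):
    # Hypothetical classifier: scores the image by the brightness of a
    # fixed region (rows 2:5, cols 2:5). Stands in for a trained CNN.
    return img[2:5, 2:5].mean()

img = np.zeros((8, 8))
img[2:5, 2:5] = 1.0           # bright "lesion"

base = toy_model(img)          # unoccluded score
heat = np.zeros_like(img)
patch = 2                      # occlusion patch size

# Slide a blank patch over the image; the drop in score at each position
# is credited to the pixels the patch covered.
for r in range(0, img.shape[0] - patch + 1):
    for c in range(0, img.shape[1] - patch + 1):
        occluded = img.copy()
        occluded[r:r + patch, c:c + patch] = 0.0
        heat[r:r + patch, c:c + patch] += base - toy_model(occluded)

print(np.round(heat, 2))       # large values only where the model "looks"
```

The same loop works against any black-box model that maps an image to a score, which is why occlusion is often the first sanity check applied to a medical imaging classifier.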

Why Explainability and Interpretability Matter

Clinical Trust

Clinicians are trained to understand the reasoning behind a diagnosis. Research shows physicians are more likely to adopt AI tools when they can evaluate the reasoning behind the output.

Patient Safety

During COVID-19, XAI analysis revealed some models were basing predictions on text annotations or scanner markers rather than actual lung pathology — preventing flawed models from reaching patients.

Bias Detection

Explainability methods can reveal when a model's decisions are influenced by demographic features or scanner type rather than actual pathology, allowing correction before deployment.

Regulatory Requirements

The EU AI Act classifies medical AI as "high-risk," subjecting it to transparency requirements. The FDA has authorized over 950 AI/ML devices and is developing frameworks for ongoing monitoring.

Scientific Discovery

XAI methods can surface the patterns AI finds in medical images, potentially identifying new imaging biomarkers and generating research hypotheses. In this way, AI becomes a scientific collaborator.

Informed Consent

Patients have a right to understand how decisions about their care are made. Explainability ensures healthcare providers can communicate the AI's role in meaningful terms.

Real-World Examples

Brain Tumour Detection

Modality: Brain MRI
Methods: Grad-CAM, SHAP, LIME

CNN classifiers trained on the BRATS dataset achieve over 97% accuracy for tumour detection. Grad-CAM heatmaps confirm the model focuses on actual tumour locations. SHAP validates reliance on tissue intensity and lesion boundaries rather than artifacts.

COVID-19 Chest X-Ray

Cautionary Tale
Modality: Chest X-ray
Methods: Grad-CAM, Saliency Maps

XAI analysis revealed that some highly accurate COVID-19 classifiers were relying on text labels and hospital-specific characteristics rather than lung pathology. This became one of the most cited examples of why XAI is essential.

Parkinson's Diagnosis

Modality: DaTscan SPECT
Methods: LRP

LRP applied to 3D CNNs processing DaTscan SPECT images shows the model focusing on striatal uptake patterns — the primary biomarker for Parkinson's disease — demonstrating clinically meaningful feature learning.

Bias Detection in Imaging

Modality: Musculoskeletal Ultrasound
Methods: Explainability Analysis

Research examined the impact of sex bias on AI models for knee ultrasound interpretation. Explainability methods identified when demographic factors inappropriately influenced predictions.

Key Terms Glossary

Black Box: A model whose internal decision-making is too complex for humans to understand directly. Most deep learning models are black boxes.
CNN: Convolutional Neural Network. A deep learning architecture designed for processing images, learning features from edges up to complex patterns.
Explainability (XAI): The ability to provide human-understandable justifications for a model's predictions, typically through post-hoc methods applied to complex models.
Grad-CAM: Gradient-weighted Class Activation Mapping. Generates heatmaps showing which image regions influenced a CNN's prediction.
Heatmap: A colour-coded overlay showing the relative importance of different image regions to a model's prediction.
Interpretability: The degree to which a human can directly understand a model's internal mechanics without additional tools. A property of the model itself.
LIME: Local Interpretable Model-Agnostic Explanations. Explains predictions by perturbing inputs and building a local surrogate model.
LRP: Layer-wise Relevance Propagation. Decomposes a prediction backward through the network to assign relevance to each input pixel.
Post-Hoc: Applied after a model has made a prediction, rather than built into the model.
SHAP: SHapley Additive exPlanations. Uses game theory to compute each feature's contribution to a prediction.
Transformer: A deep learning architecture using self-attention mechanisms. Vision Transformers (ViTs) process images as sequences of patches.
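The LIME entry above can be made concrete in a few lines: perturb the input, weight the samples by proximity to the original, and fit a weighted linear surrogate whose coefficients act as local feature importances. A minimal sketch (the black-box function and all constants are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(x):
    # Hypothetical "complex" model: a nonlinear function of 4 features.
    return 1.0 / (1.0 + np.exp(-(2.0 * x[..., 0] - x[..., 1]
                                 + 0.5 * x[..., 2] * x[..., 3])))

x0 = np.array([1.0, 0.5, -0.2, 0.8])      # instance to explain

# 1. Sample perturbations around the instance.
X = x0 + rng.normal(scale=0.3, size=(500, 4))
y = black_box(X)

# 2. Weight samples by proximity to x0 (RBF kernel).
w = np.exp(-np.sum((X - x0) ** 2, axis=1) / 0.5)

# 3. Fit a weighted linear surrogate by least squares.
A = np.hstack([X, np.ones((len(X), 1))])   # features + intercept column
sw = np.sqrt(w)[:, None]
coef, *_ = np.linalg.lstsq(sw * A, np.sqrt(w) * y, rcond=None)

print("local feature weights:", coef[:4])
```

The surrogate's weights only describe the model's behaviour near x0; that locality is both LIME's strength (it stays faithful to one prediction) and its main caveat (it says nothing about the model globally).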