Abstract

Surgical phase recognition is central to workflow analysis, enabling applications such as monitoring, skill assessment, and process optimization. However, the underlying deep models are often black boxes, limiting interpretability and trust. SurgX is a concept-driven explanation framework that associates neurons with human-interpretable surgical concepts. We construct cholecystectomy-specific concept sets, identify representative neuron activation sequences, and annotate neurons with concepts. Evaluated on TeCNO and Causal ASFormer using Cholec80, SurgX yields meaningful explanations and improves transparency in surgical AI.

Main Contributions

Introduce SurgX, a concept-based explanation framework for surgical phase recognition.
Develop cholecystectomy-tailored concept sets and conduct ablation studies on best practices for concept selection, including how to choose representative frames and representative sequences for neuron–concept annotation.
Validate SurgX on Causal ASFormer and TeCNO, showing consistent concept–neuron associations that enhance interpretability.

SurgX STEP 1. Neuron–Concept Annotation

The pipeline for annotating neurons with concepts proceeds in three stages:

A. Neuron Representative Sequence Selection

B. Concept Set Selection

ChoLec-270

C. Neuron–Concept Association

Details of each stage are provided below.

A. Neuron Representative Sequence Selection

Given a trained temporal phase recognizer (e.g., Causal ASFormer or TeCNO), we first select frames that yield high activations in the penultimate layer. Because temporal models respond to sequences rather than single frames, we extend each selected frame with its preceding frames to form a representative sequence. Ablation studies are summarized below.
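The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the `top_k` and `seq_len` defaults, and the random activation matrix are all assumptions made for the example.

```python
import numpy as np

def select_representative_sequences(activations, neuron, top_k=3, seq_len=4):
    """Return half-open (start, end) frame ranges for one neuron.

    activations: array of shape (num_frames, num_neurons) holding
    penultimate-layer outputs for one video (hypothetical layout).
    top_k and seq_len are illustrative values, not taken from the paper.
    """
    scores = activations[:, neuron]
    # Frame indices sorted by activation, strongest first.
    top_frames = np.argsort(scores)[::-1][:top_k]
    sequences = []
    for t in top_frames:
        # Extend the frame with its preceding frames, clipped at frame 0.
        start = max(0, int(t) - seq_len + 1)
        sequences.append((start, int(t) + 1))
    return sequences

# Toy data standing in for real penultimate-layer activations.
rng = np.random.default_rng(0)
acts = rng.random((100, 8))
acts[42, 5] = 10.0  # force frame 42 to be the strongest for neuron 5
seqs = select_representative_sequences(acts, neuron=5)
```

Each returned range ends at a highly activating frame and covers at most `seq_len` frames, which mirrors the idea that a temporal model responds to the frames leading up to the peak.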

Ablation Study: Frame Selection

Concept Alignment: cosine similarity between the key neurons' concepts and the ground-truth class.
Prediction Interpretability: cosine similarity between the key neurons' concepts and the predicted class.
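As a rough illustration of the two metrics, the sketch below averages the cosine similarity between a set of concept embeddings and a class embedding. Averaging over concepts is an assumption made here, and the two-dimensional vectors are toy stand-ins for text embeddings from a surgical VLM.

```python
import numpy as np

def mean_cosine_to_class(concept_emb, class_emb):
    """Mean cosine similarity between concept embeddings and one class embedding.

    concept_emb: (num_concepts, d); class_emb: (d,). Hypothetical helper.
    """
    c = concept_emb / np.linalg.norm(concept_emb, axis=1, keepdims=True)
    k = class_emb / np.linalg.norm(class_emb)
    return float(np.mean(c @ k))

# Toy embeddings for the concepts of the key neurons.
concept_emb = np.array([[1.0, 0.0], [1.0, 1.0]])
gt_class_emb = np.array([1.0, 0.0])    # ground-truth phase text (toy)
pred_class_emb = np.array([0.0, 1.0])  # predicted phase text (toy)

concept_alignment = mean_cosine_to_class(concept_emb, gt_class_emb)
prediction_interpretability = mean_cosine_to_class(concept_emb, pred_class_emb)
```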

Ablation Study: Sequence Length

B. Concept Set Selection

Appropriate concept coverage is critical: if a neuron’s behavior is not representable by the concept set, reliable annotation is impossible. We therefore construct three cholecystectomy-related concept sets and compare them empirically.

Ablation Study: Concept Sets

C. Neuron–Concept Association

Using the selected sequences and concept set, we compute the cosine similarity, in the embedding space of a surgical VLM (e.g., SurgVLP, PeskaVLP), between each neuron’s representative sequence and each concept text, and assign to each neuron the concepts with the highest similarity.
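The association step can be sketched as a similarity matrix followed by a top-k lookup. The code assumes that the VLM encoders have already produced one embedding per neuron (for its representative sequence) and one per concept; the hand-written vectors below replace those encoder outputs, and `annotate_neurons` and `top_k` are hypothetical names.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def annotate_neurons(seq_emb, concept_emb, concepts, top_k=2):
    """Return the top_k most similar concept strings for each neuron.

    seq_emb: (num_neurons, d) embeddings of representative sequences.
    concept_emb: (num_concepts, d) embeddings of concept texts.
    """
    sim = l2_normalize(seq_emb) @ l2_normalize(concept_emb).T
    # Sort concepts by descending similarity for every neuron.
    order = np.argsort(-sim, axis=1)[:, :top_k]
    return [[concepts[j] for j in row] for row in order]

concepts = ["gallbladder", "specimen bag", "extract the gallbladder into the bag"]
# Toy embeddings; real ones would come from the VLM text and video encoders.
concept_emb = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])
seq_emb = np.array([[0.9, 0.3, 0.0],
                    [0.1, 0.2, 0.95]])
annotations = annotate_neurons(seq_emb, concept_emb, concepts)
```

Normalizing both sides once keeps the whole comparison to a single matrix product, which matters when the concept set contains hundreds of entries.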

SurgX STEP 2. Model's Prediction Explanation

The explanation process at test time proceeds as follows:

Input surgical frames are processed by the phase recognition model.
Neuron contributions to the final prediction are computed using the contribution formula.
Important neurons (e.g., Neuron 2, Neuron 128) are identified, and their annotated concepts (“extract the gallbladder into the bag”, “specimen bag”, “gallbladder”) explain the model’s decision.
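The steps above can be illustrated with a small sketch. The contribution formula itself is not reproduced on this page, so the code assumes one common choice: for a final linear layer, a neuron's contribution to the predicted class is its activation multiplied by the corresponding weight. That assumption, the function name, and the toy numbers are all illustrative and may differ from the formula used in SurgX.

```python
import numpy as np

def top_contributing_neurons(h, W, b, top_k=2):
    """Rank penultimate neurons by their share of the predicted logit.

    h: (num_neurons,) penultimate activations for one frame.
    W: (num_classes, num_neurons), b: (num_classes,) final linear layer.
    Assumed contribution: W[pred, i] * h[i] (not confirmed by the source).
    """
    logits = W @ h + b
    pred = int(np.argmax(logits))
    contrib = W[pred] * h
    top = np.argsort(-contrib)[:top_k]
    return pred, [int(i) for i in top], contrib

# Toy activations and weights for a 2-class, 4-neuron head.
h = np.array([0.1, 2.0, 0.0, 1.5])
W = np.array([[0.2, -0.5, 0.3, 0.1],
              [0.1, 1.0, 0.2, 0.8]])
b = np.zeros(2)
pred, neurons, contrib = top_contributing_neurons(h, W, b)

# Look up the concepts annotated in Step 1 (hypothetical annotations).
neuron_concepts = {1: ["specimen bag"], 3: ["gallbladder"]}
explanation = [c for n in neurons for c in neuron_concepts.get(n, [])]
```

Under this assumption the contributions sum to the predicted logit minus the bias, so the ranking reflects how much each neuron pushed the model toward its decision.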

Qualitative Results – Explanation Examples


✨MICCAI 2025✨ SurgX: Neuron-Concept Association for Explainable Surgical Phase Recognition