MCPNet++: An Interpretable Classifier via Multi-Level Concept Prototypes

*Equal Advising
1National Yang Ming Chiao Tung University, 2Amazon
Teaser

ProtoPFormer generates explanations only from the final layer, typically capturing a single high-level concept from activated patches, whereas MCPNet++ provides multi-level explanations from low- to high-level features and further summarizes these concepts with LLMs for more human-friendly interpretation.

Abstract

Post-hoc and inherently interpretable methods have shown great success in uncovering the inner workings of black-box models, whether by examining them after training or by explicitly designing for interpretability. While these approaches effectively narrow the semantic gap between a model's latent space and human understanding, they typically extract only high-level semantics from the model's final feature map. As a result, they provide a limited perspective on the decision-making process. We argue that explanations lacking insight into both lower- and mid-level semantics cannot be considered fully faithful or genuinely useful. To address this issue, we introduce the Multi-Level Concept Prototypes Classifier (MCPNet), which offers a more holistic interpretation by drawing on information from multiple levels within the model. Rather than relying on predefined concept labels, MCPNet autonomously discovers meaningful concepts from feature maps. To increase versatility, we further propose MCPNet++, which can be seamlessly applied to both CNN and transformer backbones, allowing it to learn meaningful concepts from their respective features. Building on these learned concepts, we also introduce an LLM-based method to bridge the gap between these concepts and human perception. Experimental results show that MCPNet++ provides more comprehensive explanations without sacrificing model performance, with the discovered concepts aligning closely with human understanding.
|  | MCPNet | MCPNet++ | ProtoPNet | ProtoPFormer | BotCL | Concept Bottleneck Model | VCC | CRAFT*** | TCAV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Explanation Type | Inherently | Inherently | Inherently | Inherently | Inherently | Inherently | Post-hoc | Post-hoc | Post-hoc |
| Explanation Scale | Multi-level | Multi-level | Single-level | Single-level | Single-level | Single-level | Multi-level | Single-level | Single-level |
| w/o Concept Labels | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✗** |
| Available for CNN and Transformers | ✓* | ✓ | ✓* |  | ✓* | ✓* | ✓* | ✓* |  |

* Applicable to CNN backbones only (not transformers) in original work.   ** TCAV requires user-defined concept examples.   *** CRAFT is a post-hoc method applied after training.

Methods

Overview

MCPNet++ extracts multi-level concept features from different layers, enforces diverse and class-consistent concepts, and aggregates them for interpretable classification.

Overview

Centered Kernel Alignment (CKA) Loss

Leveraging the CKA similarity metric, the CKA loss reduces similarity between concept segments within the same layer, encouraging diverse representations at a shared level of abstraction. Although it does not explicitly control what each segment learns, it serves as a constraint that discourages redundancy. This promotes multi-perspective representations, offering a more comprehensive basis for interpretation.
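As a concrete sketch, linear CKA between two centered feature matrices can be computed as below, and the per-layer loss then averages the pairwise CKA between concept segments, so minimizing it pushes segments apart. This is an illustrative NumPy version under assumed segment shapes; the actual training loss runs inside the model's framework and may differ in detail.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA similarity between two (n_samples, dim) feature matrices."""
    x = x - x.mean(axis=0, keepdims=True)  # center each feature dimension
    y = y - y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(y.T @ x, "fro") ** 2
    denom = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
    return hsic / (denom + 1e-12)

def cka_loss(segments):
    """Average pairwise CKA between concept segments of one layer.

    segments: list of (n_samples, dim) arrays, one per concept segment.
    Minimizing this value discourages redundancy between segments.
    """
    pairs = [(i, j) for i in range(len(segments)) for j in range(i + 1, len(segments))]
    return sum(linear_cka(segments[i], segments[j]) for i, j in pairs) / len(pairs)
```

Since linear CKA lies in [0, 1] and equals 1 for identical representations, driving the pairwise average toward 0 yields mutually dissimilar segments at the same abstraction level.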

CKA Loss

Contrastive Class-wise Concept (CCC) Loss

Motivated by the intuition that images from the same class tend to share similar concept compositions, the CCC loss encourages MCP features within the same class to be more similar while separating those from different classes. Implemented with a contrastive learning objective, it helps organize the concept space according to class-level patterns. This improves the consistency of learned representations and provides a stronger basis for classification and interpretation.
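A minimal sketch of such a class-wise contrastive objective, in the spirit of supervised contrastive learning, is shown below; the paper's exact formulation may differ, and the temperature value is an assumption.

```python
import numpy as np

def ccc_loss(features, labels, temperature=0.1):
    """Contrastive class-wise concept loss over MCP feature vectors.

    features: (n, d) array of per-image MCP features.
    labels:   (n,) integer class labels.
    Pulls same-class features together and pushes different classes apart.
    """
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    sim = f @ f.T / temperature
    np.fill_diagonal(sim, -np.inf)             # exclude self-similarity
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    labels = np.asarray(labels)
    pos = labels[:, None] == labels[None, :]   # same-class (positive) pairs
    np.fill_diagonal(pos, False)
    mean_pos = np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    return -mean_pos[pos.any(axis=1)].mean()
```

The loss shrinks when same-class MCP features cluster tightly relative to other classes, which is exactly the class-level organization described above.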

CCC Loss

Layer-wise Dropout

We introduce layer-wise dropout to reduce over-reliance on MCP features from any single layer during training. By randomly dropping the features from one layer at a time, the model is encouraged to utilize information from multiple semantic levels rather than depending only on the most discriminative one. This helps the classifier learn more balanced representations across low-, mid-, and high-level concepts.
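The mechanism can be sketched as follows; the drop probability `p` and the per-layer representation are illustrative assumptions, the only property taken from the text being that at most one layer's MCP features are dropped at a time during training.

```python
import random
import numpy as np

def layerwise_dropout(layer_features, p=0.5, training=True):
    """Randomly zero out the MCP features of ONE layer during training.

    layer_features: list of per-layer feature arrays (low- to high-level).
    p: probability that a drop happens at all (assumed value).
    """
    if not training or random.random() >= p:
        return layer_features
    drop = random.randrange(len(layer_features))   # pick one layer to drop
    return [np.zeros_like(f) if i == drop else f
            for i, f in enumerate(layer_features)]
```

At evaluation time the features pass through unchanged, so the classifier always sees all semantic levels at inference.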

Concept Captions

To bridge the gap between learned concepts and human understanding, we introduce a captioning workflow using large language models (LLMs). For each concept, we collect high-response images and their corresponding activation patches, which implicitly represent its semantic meaning. These visual cues are then provided to the LLM to generate concise descriptions of the underlying concept. Instead of assigning a single fixed label, the model outputs multiple candidate descriptions, offering a more flexible and human-friendly interpretation of learned concepts.
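The captioning step can be sketched as a prompt-construction helper; in practice the high-response images and activation patches are attached as image inputs to a multimodal LLM alongside this text. The exact prompt wording and model are not specified here, so everything below is illustrative.

```python
def build_caption_prompt(num_images, num_candidates=3):
    """Build an illustrative text prompt asking an LLM to describe a concept.

    num_images:     how many high-response images/patch crops accompany the prompt.
    num_candidates: how many candidate descriptions to request.
    """
    return (
        f"You are given {num_images} image crops that strongly activate the same "
        "learned visual concept. Identify what these crops have in common "
        "(e.g. a texture, a part, or an object) and propose "
        f"{num_candidates} short candidate descriptions of the concept, "
        "ordered from most to least likely."
    )
```

Requesting several ranked candidates rather than a single label mirrors the paper's design choice of flexible, human-friendly concept descriptions.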

Concept Captions

Experiments

Main Quantitative Results

MCPNet++ achieves competitive accuracy across both CNN (ResNet50) and transformer (DeiT-B-16) backbones while providing multi-level explanations. Compared to existing interpretable methods, it maintains strong performance across datasets, demonstrating that richer multi-level representations can be learned without degrading classification accuracy.

Performance table

Explanation Samples

MCPNet employs multi-scale concept explanations as the foundation for accurate classification. In particular, both the grizzly bear and buffalo classes respond strongly to the same high-level concepts, so classifying solely from high-level responses would lead to confusion; incorporating low-level concept responses resolves it. Moreover, even without a direct concept match in the image, MCPNet interprets the image accurately using the constructed MCP distribution, which holistically aggregates concept responses across multiple scales.
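The decision rule described above can be sketched as matching an image's multi-level concept-response distribution against per-class prototype distributions. Cosine similarity is used here purely for illustration; the paper's actual similarity measure over MCP distributions may differ.

```python
import numpy as np

def classify_by_mcp(image_responses, class_prototypes):
    """Pick the class whose prototype MCP distribution best matches the image.

    image_responses:  (k,) concatenated concept responses across all levels.
    class_prototypes: dict mapping class name -> (k,) prototype responses.
    """
    def cosine(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max(class_prototypes,
               key=lambda c: cosine(image_responses, class_prototypes[c]))
```

Because the match is over the whole multi-level distribution, two classes that tie on high-level responses can still be separated by their low- and mid-level components.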

Explanation Samples

Relation between Caption and Visualization

We present the concept response difference between the original and edited images to illustrate how visual changes are reflected in the learned concept representations. Hover over each image to highlight its corresponding concept responses.

Original image · Concept responses (original vs. edited) · Edited image

Counterfactual Result

To evaluate counterfactual reasoning, we compare concept responses between an original misclassified image and its counterfactually edited counterpart. Hover over each image to highlight its corresponding concept responses.

Original image (✗ Prediction: Fox) · Concept responses (original vs. edited) · Edit: "Turn Fur Grey" · Edited image (✓ Prediction: Wolf)

BibTeX

@ARTICLE{wang2026MCPNetPP,
  author  = {Wang, Bor-Shiun and Wang, Chien-Yi and Chiu, Wei-Chen},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title   = {MCPNet++: Interpretable Classification Models via Multi-Level Concept Prototypes},
  year    = {2026},
  pages   = {1--18},
  doi     = {10.1109/TPAMI.2026.3680506}
}