Vorabbi, Sara
(2025)
Conceptograms generation for visual transformers through intermediate activation analysis.
[Laurea magistrale], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270], full-text document not available
The full text is not available at the author's request.
Abstract
The increasing prevalence of artificial intelligence techniques, particularly deep learning, has led to the emergence of a new research field, Explainable AI, devoted to interpreting the behavior of deep learning models. This thesis extends the corevector and peephole framework, originally designed for convolutional neural networks, to the analysis of feature representations in Visual Transformers (ViTs) in the context of image classification. By leveraging Singular Value Decomposition (SVD), low-dimensional corevectors are extracted from intermediate activations, enabling a structured interpretation of learned features. The framework is further enhanced by clustering the corevectors with a Gaussian Mixture Model (GMM) and computing the Empirical Posterior (EP), which establishes probabilistic relationships, the peepholes, between low-level feature clusters and output classes.
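The abstract does not give implementation details, but the pipeline it describes (SVD on intermediate activations, GMM clustering, empirical class posteriors per cluster) can be sketched as follows. This is a minimal illustration only, assuming activations have already been collected from a ViT layer (for example via forward hooks); the function names, array shapes, and hyperparameters such as the number of corevector dimensions and clusters are assumptions, not values taken from the thesis.

import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.mixture import GaussianMixture

def extract_corevectors(layer_activations, k=32):
    # Flatten each sample's activation (e.g. tokens x channels for a ViT layer)
    # and project onto the top-k singular directions: one corevector per sample.
    flat = layer_activations.reshape(layer_activations.shape[0], -1)
    svd = TruncatedSVD(n_components=k)
    return svd.fit_transform(flat)            # shape: (n_samples, k)

def build_peepholes(corevectors, labels, n_clusters=50, n_classes=10):
    # Cluster the corevectors with a GMM, then estimate the empirical
    # posterior P(class | cluster) from the labels of each cluster's members.
    gmm = GaussianMixture(n_components=n_clusters, covariance_type="diag")
    assignments = gmm.fit_predict(corevectors)
    peepholes = np.zeros((n_clusters, n_classes))
    for c in range(n_clusters):
        members = labels[assignments == c]
        if members.size:
            counts = np.bincount(members, minlength=n_classes)
            peepholes[c] = counts / counts.sum()
    return gmm, peepholes                     # peepholes: (n_clusters, n_classes)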
A central contribution of this work is the introduction of the conceptogram, a visual tool that tracks the evolution of concepts across layers. By rendering the peepholes of successive layers as heatmaps, the conceptogram reveals the hierarchical organization of features emerging from the early layers onward, providing deeper insight into the inner workings of ViTs. These findings improve the interpretability of attention-based models and offer novel validation tools for assessing their feature-learning dynamics.
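Given per-layer GMMs and peephole matrices as in the sketch above, a conceptogram for a single sample can be read as a layers-by-classes heatmap: each row is the class distribution obtained by passing that layer's corevector through its peepholes. Again, this is only an illustrative reading of the abstract; the names and the matplotlib rendering are assumptions, not the thesis's actual code.

import numpy as np
import matplotlib.pyplot as plt

def conceptogram(corevectors_per_layer, gmms, peepholes_per_layer):
    # For one sample: combine soft cluster responsibilities with each layer's
    # peephole matrix to get P(class | sample) per layer, stacked into a
    # (n_layers, n_classes) array ready to plot as a heatmap.
    rows = []
    for cv, gmm, peepholes in zip(corevectors_per_layer, gmms, peepholes_per_layer):
        resp = gmm.predict_proba(cv.reshape(1, -1))   # P(cluster | sample)
        rows.append(resp @ peepholes)                 # P(class | sample)
    return np.vstack(rows)

# Example rendering (inputs assumed to come from the sketch above):
# heatmap = conceptogram(sample_cvs, layer_gmms, layer_peepholes)
# plt.imshow(heatmap, aspect="auto", cmap="viridis")
# plt.xlabel("class"); plt.ylabel("layer"); plt.show()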
Document type
Degree thesis
(Laurea magistrale)
Thesis author
Vorabbi, Sara
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Transformer, Attention, Explainable AI, Intermediate Activation Analysis
Thesis defence date
25 March 2025
URI