Alessandroni, Filippo (2026). Dynamical Evolution of a Neural Network in the Online Learning Regime and the Effect of Low-Rank Adaptation on Catastrophic Forgetting. Master's thesis (Laurea magistrale), Università di Bologna, Corso di Studio in Matematica [LM-DM270].
Abstract
Modern neural architectures are rarely trained from scratch for each new task; instead, they leverage knowledge acquired from previously learned tasks. This paradigm, known as transfer learning, consists of initializing a model for a target task with a network previously trained on a related source task. If we also require retaining performance on the source task, the problem falls within the framework of continual learning. In this setting, forgetting quantifies the degradation in performance on the source task after training on the target task. A well-known challenge is catastrophic forgetting: during fine-tuning, performance on the original task can deteriorate dramatically. A central question is whether specific fine-tuning strategies can mitigate catastrophic forgetting. In this work, we investigate the impact of Low-Rank Adaptation (LoRA) on catastrophic forgetting. LoRA is a parameter-efficient fine-tuning method in which the pretrained weight matrix is kept frozen and the update is expressed as an additive low-rank perturbation. Since the original weights remain unchanged, this approach may have significant implications for memory retention. We address this question within the theoretical framework of online learning, which enables the study of learning dynamics by tracking the time evolution of macroscopic order parameters (overlaps). In online learning, the weights of a network are updated after each individual data point, which allows us to analyze neural networks as dynamical systems. Focusing on committee machines, we derive differential equations for the learning dynamics and study their equilibrium configurations, plateau phases, and generalization curves. Through this analytical approach, we identify the conditions under which LoRA mitigates catastrophic forgetting, highlighting the role of source–target similarity, architectural choices for teacher and student networks, and the interplay between the hyperparameters of LoRA.
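For reference, here is a minimal sketch, in the standard notation of the literature rather than an excerpt from the thesis (the thesis's exact conventions and scalings may differ), of the two ingredients the abstract combines. LoRA (Hu et al., 2021) freezes the pretrained weight matrix and trains only a low-rank additive correction:

% LoRA reparameterization: W_0 is frozen, only A and B receive gradient updates.
W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k),

with the rank r and the scaling \alpha as the main hyperparameters. In the Saad–Solla framework for soft committee machines, a student with hidden weights J_i learns online from a teacher with hidden weights B_n on i.i.d. Gaussian inputs \xi \in \mathbb{R}^N, and the dynamics close on the macroscopic overlaps:

% Order parameters: student-student, student-teacher, teacher-teacher overlaps.
Q_{ik} = \frac{J_i \cdot J_k}{N}, \qquad R_{in} = \frac{J_i \cdot B_n}{N}, \qquad T_{nm} = \frac{B_n \cdot B_m}{N}.

In the limit N \to \infty, with t = \mu / N examples per weight playing the role of continuous time, online SGD with learning rate \eta yields deterministic ODEs of the form

\frac{dR_{in}}{dt} = \eta\, \langle \delta_i\, y_n \rangle, \qquad \frac{dQ_{ik}}{dt} = \eta\, \langle \delta_i\, x_k + \delta_k\, x_i \rangle + \eta^2\, \langle \delta_i\, \delta_k \rangle,

where x_i = J_i \cdot \xi / \sqrt{N}, y_n = B_n \cdot \xi / \sqrt{N}, and \delta_i is the backpropagated error of student unit i. The generalization error, and hence any forgetting measure, depends on the microscopic weights only through (Q, R, T), which is what makes the plateau and equilibrium analysis tractable.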
Document type
Degree thesis (Laurea magistrale)
Thesis author
Alessandroni, Filippo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Specialization
CURRICULUM ADVANCED MATHEMATICS FOR APPLICATIONS
Degree programme regulations (Ordinamento CdS)
DM270
Keywords
Machine Learning, Statistical Mechanics, Low-Rank Adaptation, Catastrophic Forgetting, Online Learning, Transfer Learning, Continual Learning, Fine-Tuning
Thesis defense date
27 March 2026
URI