Alessandroni, Filippo (2026). Dynamical Evolution of a Neural Network in the Online Learning Regime and the Effect of Low-Rank Adaptation on Catastrophic Forgetting. Master's thesis (Laurea magistrale), Università di Bologna, Corso di Studio in Matematica [LM-DM270].
Abstract
Modern neural architectures are rarely trained from scratch for each new task; instead, they leverage knowledge acquired from previously learned tasks. This paradigm, known as transfer learning, consists of initializing a model for a target task with a network previously trained on a related source task. If we also require retaining performance on the source task, the problem falls within the framework of continual learning. In this setting, forgetting quantifies the degradation in performance on the source task after training on the target task. A well-known challenge is catastrophic forgetting: during fine-tuning, performance on the original task can deteriorate dramatically. A central question is whether specific fine-tuning strategies can mitigate catastrophic forgetting. In this work, we investigate the impact of Low-Rank Adaptation (LoRA) on catastrophic forgetting. LoRA is a parameter-efficient fine-tuning method in which the pretrained weight matrix is kept frozen and the update is expressed as an additive low-rank perturbation. Since the original weights remain unchanged, this approach may have significant implications for memory retention. We address this question within the theoretical framework of online learning, which enables the study of learning dynamics by tracking the time evolution of macroscopic order parameters (overlaps). In online learning, the weights of a network are updated after each individual data point, which allows us to analyze neural networks as dynamical systems. Focusing on committee machines, we derive differential equations for the learning dynamics and study their equilibrium configurations, plateau phases, and generalization curves. Through this analytical approach, we identify the conditions under which LoRA mitigates catastrophic forgetting, highlighting the role of source–target similarity, architectural choices for teacher and student networks, and the interplay between the hyperparameters of LoRA.
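For reference, here is a minimal sketch, in the standard notation of the literature rather than an excerpt from the thesis (the thesis's exact conventions and scalings may differ), of the two ingredients the abstract combines. LoRA (Hu et al., 2021) freezes the pretrained weight matrix and trains only a low-rank additive correction:

% LoRA reparameterization: W_0 is frozen, only A and B receive gradient updates.
W = W_0 + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k),

with the rank r and the scaling \alpha as the main hyperparameters. In the Saad–Solla framework for soft committee machines, a student with hidden weights J_i learns online from a teacher with hidden weights B_n on i.i.d. Gaussian inputs \xi \in \mathbb{R}^N, and the dynamics close on the macroscopic overlaps:

% Order parameters: student-student, student-teacher, teacher-teacher overlaps.
Q_{ik} = \frac{J_i \cdot J_k}{N}, \qquad R_{in} = \frac{J_i \cdot B_n}{N}, \qquad T_{nm} = \frac{B_n \cdot B_m}{N}.

In the limit N \to \infty, with t = \mu / N examples per weight playing the role of continuous time, online SGD with learning rate \eta yields deterministic ODEs of the form

\frac{dR_{in}}{dt} = \eta\, \langle \delta_i\, y_n \rangle, \qquad \frac{dQ_{ik}}{dt} = \eta\, \langle \delta_i\, x_k + \delta_k\, x_i \rangle + \eta^2\, \langle \delta_i\, \delta_k \rangle,

where x_i = J_i \cdot \xi / \sqrt{N}, y_n = B_n \cdot \xi / \sqrt{N}, and \delta_i is the backpropagated error of student unit i. The generalization error, and hence any forgetting measure, depends on the microscopic weights only through (Q, R, T), which is what makes the plateau and equilibrium analysis tractable.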
Document type
Degree thesis (Laurea magistrale)
Thesis author
Alessandroni, Filippo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Specialization
CURRICULUM ADVANCED MATHEMATICS FOR APPLICATIONS
Degree programme regulations (Ordinamento CdS)
DM270
Keywords
Machine Learning, Statistical Mechanics, Low-Rank Adaptation, Catastrophic Forgetting, Online Learning, Transfer Learning, Continual Learning, Fine-Tuning
Thesis defense date
27 March 2026
URI