Diaconu, Călin
(2024)
Latent Replay-Based On-Device Continual Learning using Transformers on Edge Ultra-Low-Power IoT Platforms.
[Laurea magistrale], Università di Bologna, Corso di Studio in Artificial intelligence [LM-DM270]
Abstract
Transformers have recently exploded in popularity thanks to their adoption in tools like ChatGPT. This places ever-growing pressure on computing centers, which are economically and ecologically expensive. Furthermore, sending private data from end users to these centers raises major security and privacy issues.
Among the possible solutions is continual learning (CL) on embedded devices, which enables lighter and faster retraining procedures and eases the deployment requirements on low-power platforms.
This work explores this alternative by applying CL methods such as Latent Replay, Copy Weight with Re-init (CWR*), and Architectural and Regularization 1 (AR1*) to transformer architectures designed for image processing, such as the Vision Transformer (ViT).
The thesis opens the way for efficient deployment of transformer architectures on PULP microcontrollers by implementing a highly flexible ViT golden-model test in TrainLib. On the CORe50 dataset, accuracy improves by up to 18% for the evaluated configurations, and a ViT setup with fewer transformer blocks yields models up to 40% lighter at the cost of less than 6% in accuracy.
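For illustration, the following is a minimal PyTorch sketch of the latent-replay idea summarized above, under the assumption that the ViT is split at a chosen replay layer into a frozen front-end and a trainable head. All names here (LatentReplayBuffer, train_task, frontend, head) are hypothetical and do not reflect the thesis' TrainLib/PULP implementation.

```python
# Minimal latent-replay sketch (PyTorch). Assumption: a ViT split at a chosen
# "replay layer" into a frozen front-end and a trainable head; names are
# illustrative and not taken from TrainLib or the thesis code.
import random
import torch
import torch.nn as nn

class LatentReplayBuffer:
    """Fixed-size reservoir of latent activations stored at the replay layer."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.latents, self.labels = [], []
        self.seen = 0

    def add(self, z: torch.Tensor, y: torch.Tensor):
        for zi, yi in zip(z.detach().cpu(), y.cpu()):
            self.seen += 1
            if len(self.latents) < self.capacity:
                self.latents.append(zi)
                self.labels.append(yi)
            else:
                # Reservoir sampling keeps a uniform sample of past latents.
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.latents[j], self.labels[j] = zi, yi

    def sample(self, n: int, device):
        idx = random.sample(range(len(self.latents)), min(n, len(self.latents)))
        z = torch.stack([self.latents[i] for i in idx]).to(device)
        y = torch.stack([self.labels[i] for i in idx]).to(device)
        return z, y

def train_task(frontend, head, loader, buffer, device, replay_bs=32):
    """One CL experience: frozen front-end, stored latents replayed into the head."""
    frontend.eval()  # layers below the replay point stay frozen
    head.train()
    opt = torch.optim.SGD(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        with torch.no_grad():
            z = frontend(x)  # latent activations at the replay layer
        if len(buffer.latents) > 0:
            # Mix stored latents from past experiences into the current batch.
            z_old, y_old = buffer.sample(replay_bs, device)
            z, y = torch.cat([z, z_old]), torch.cat([y, y_old])
        loss = loss_fn(head(z), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        buffer.add(z[: x.size(0)], y[: x.size(0)])  # store only the new latents
```

Because gradients only flow through the head, this keeps the retraining footprint small, which is the property that makes the approach attractive on ultra-low-power platforms.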
Document type
Degree thesis (Laurea magistrale)
Thesis author
Diaconu, Călin
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
artificial neural network, Continual Learning, self-attention, image classification, embedded system, microcontrollers, Visual Transformer, Transformers, rehearsal methods
Thesis defense date
5 December 2024
URI