Gandolfi, Riccardo
Design of a memory-to-memory tensor reshuffle unit for ultra-low-power deep learning accelerators.
[Master's thesis (Laurea magistrale)], Università di Bologna, Degree programme in
Electronic Engineering [LM-DM270]
Available under license: Subject to any broader permissions granted by the author, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly prohibited. All other rights to the material are reserved.
In IoT edge processing, deep learning applications, and near-sensor analytics, the area and power constraints on MCUs (Microcontroller Units) that perform computationally intensive tasks are more stringent than ever. A promising direction is to develop HWPEs (Hardware Processing Engines) that assist the end node in executing these tasks. This work concerns the design and testing of the Datamover, a small and easily configurable HWPE for tensor shuffling and data marshaling operations. The accelerator is intended for integration in the Darkside PULP chip and can perform reordering and transposition operations on data with different sub-byte widths. The focus is on the design of the internal buffering and transposition mechanism and on its performance compared with software execution on the same platform. Synthesis results are also reported in terms of area occupation and timing.
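To make the abstract's "transpositions on data with different sub-byte widths" concrete, the following is a minimal software reference sketch of such an operation. It is not taken from the thesis: the function names (`unpack`, `pack`, `transpose_packed`), the row-major layout, and the packing order (lowest-indexed element in the least significant bits of each byte) are all assumptions chosen for illustration, and the actual Datamover data format may differ.

```python
def unpack(data: bytes, bits: int, count: int) -> list[int]:
    """Extract `count` elements of `bits` bits each from packed bytes.

    Assumed packing: element 0 sits in the least significant bits of byte 0.
    """
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    return [
        (data[i // per_byte] >> ((i % per_byte) * bits)) & mask
        for i in range(count)
    ]


def pack(values: list[int], bits: int) -> bytes:
    """Pack elements of `bits` bits each into bytes (inverse of `unpack`)."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    out = bytearray((len(values) + per_byte - 1) // per_byte)
    for i, value in enumerate(values):
        out[i // per_byte] |= (value & mask) << ((i % per_byte) * bits)
    return bytes(out)


def transpose_packed(data: bytes, rows: int, cols: int, bits: int) -> bytes:
    """Transpose a row-major rows x cols matrix of packed sub-byte elements."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("element width must divide 8")
    elems = unpack(data, bits, rows * cols)
    # Walk the source column by column to produce the transposed row-major order.
    transposed = [elems[r * cols + c] for c in range(cols) for r in range(rows)]
    return pack(transposed, bits)


# Example: a 2 x 4 matrix of 4-bit values.
source = pack([1, 2, 3, 4, 5, 6, 7, 8], bits=4)
result = transpose_packed(source, rows=2, cols=4, bits=4)
```

A software version like this has to unpack, reorder, and repack every element with shifts and masks, which is the kind of per-element overhead a dedicated memory-to-memory unit is meant to take off the core.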
Document type
Degree thesis
(Master's degree, Laurea magistrale)
Thesis author
Gandolfi, Riccardo
Thesis supervisor
Thesis co-supervisor
Degree programme
Degree programme regulations (Ordinamento CdS)
Keywords
IoT Edge Processing, Near-sensor Analytics, MCU, PULP, Data Marshaling, Deep Learning
Thesis defence date
20 July 2021