Cecconi, Leonardo
(2017)
Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNs.
[Master's degree thesis], Università di Bologna, Degree Programme in
Electronic Engineering [LM-DM270], full-text document not available
The full text is not available by the author's choice.
Abstract
Convolutional Neural Networks (CNNs) are now present in many embedded solutions. One of the biggest problems in their execution is the memory bottleneck. In this work we propose an optimal double-buffering tiling strategy to reduce memory bandwidth in the execution of deep CNN architectures, and we test our model on one of the two cores of a Zynq-7020 embedded platform.
An optimal tiling strategy is found for each layer of the network by minimizing the bandwidth between external memory and On-Chip Memory (OCM). Performance tests show a 50% improvement in total execution time with the cache disabled (34% with the cache enabled) compared to a non-double-buffered implementation. Moreover, the double-buffered bandwidth between external memory and OCM is 5x lower than with naive tiling settings. Furthermore, we show that the tiling settings that maximize OCM usage do not generally yield the lowest bandwidth.
Document type
Degree thesis
(Master's degree)
Thesis author
Cecconi, Leonardo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Tiling strategy, DMA, CNN, neural networks, memory bandwidth, double buffering
Thesis defense date
25 July 2017
URI