DoGNeXt: Convolutional Neural Network with Difference of Gaussian kernels

Gardenal, Davide (2025) DoGNeXt: Convolutional Neural Network with Difference of Gaussian kernels. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270]

Full-text documents available:

PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed strictly for personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

Convolutional Neural Networks (CNNs) have advanced considerably over the years. Modern CNNs have moved from classical convolutions, which process the spatial and channel dimensions simultaneously, to depth-wise separable convolutions, which separate spatial aggregation from channel mixing. This separation has made it easier to see how kernels process images: it was recently discovered that many of them share a set of common patterns resembling the Difference of Gaussian (DoG) function. This led us to formulate a new DoG-parametrized convolutional kernel, called DoGConv, that aims to recreate the common kernel patterns observed in the wild. Using it, we created DoGNeXt, a CNN based on the ConvNeXt V2 architecture. We test DoGNeXt on ImageNet1K and MedMNIST, achieving remarkable results and beating ConvNeXt V2 when training data is scarce. Thanks to the DoGConv parametrization, we are able to resize the kernels without retraining. We use this property to improve the performance of the network on low-resolution images, a setting in which DoGNeXt performs remarkably well. Finally, we highlight a lack of augmentation for small images in the classic ImageNet1K training recipe. We propose an additional augmentation that outperforms the vanilla training across a wide range of image resolutions and performs on par with the DoGNeXt kernel-resizing technique.
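
As a rough illustration of why a DoG parametrization allows kernels to be resized without retraining, the sketch below samples a generic Difference of Gaussian function on a grid of any size. It is not the DoGConv formulation from the thesis, which the abstract does not specify: the function names (dog_kernel, resize_dog_kernel), the four parameters, and the rule of scaling both standard deviations in proportion to the kernel size are all assumptions made for this example.

```python
import numpy as np


def dog_kernel(size, sigma_center, sigma_surround,
               amp_center=1.0, amp_surround=1.0):
    """Sample a 2D Difference of Gaussian function on a size x size grid.

    Hypothetical parametrization (not taken from the thesis): two isotropic,
    normalized Gaussians centred on the grid, each with its own amplitude.
    """
    half = (size - 1) / 2.0
    coords = np.arange(size) - half              # e.g. [-1, 0, 1] for size 3
    yy, xx = np.meshgrid(coords, coords, indexing="ij")
    r2 = xx ** 2 + yy ** 2                       # squared distance from centre

    def gauss(sigma):
        return np.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

    return amp_center * gauss(sigma_center) - amp_surround * gauss(sigma_surround)


def resize_dog_kernel(new_size, old_size, sigma_center, sigma_surround, **amps):
    """Re-sample the same DoG at a different kernel size.

    Assumption: the standard deviations scale linearly with the kernel size,
    so the kernel covers the same relative pattern on a coarser or finer grid.
    """
    scale = new_size / old_size
    return dog_kernel(new_size, sigma_center * scale,
                      sigma_surround * scale, **amps)


# A 7x7 kernel and its 3x3 counterpart, built from the same parameters.
k7 = dog_kernel(7, sigma_center=1.0, sigma_surround=2.0)
k3 = resize_dog_kernel(3, 7, sigma_center=1.0, sigma_surround=2.0)
```

Because the kernel is a function of a handful of continuous parameters instead of a fixed grid of weights, changing its size only means sampling that function on a different grid. In a depth-wise convolution, each channel would carry its own set of such parameters.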

Document type

Degree thesis (Master's degree, Laurea magistrale)

Thesis author

Gardenal, Davide

Thesis supervisor

Salti, Samuele

Thesis co-supervisor

Sugimoto, Akihiro

School

Ingegneria e Architettura (Engineering and Architecture)

Degree programme