DoGNeXt: Convolutional Neural Network with Difference of Gaussian kernels

Gardenal, Davide (2025) DoGNeXt: Convolutional Neural Network with Difference of Gaussian kernels. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly forbidden. All other rights to the material are reserved.


Abstract

Convolutional Neural Networks (CNNs) have advanced considerably over the years. Modern CNNs have largely moved from classical convolutions, which process the spatial and channel dimensions simultaneously, to depth-wise separable convolutions, which separate spatial aggregation from channel mixing. This separation gives a clearer picture of how kernels process images: it was recently discovered that many of them share a set of common patterns resembling the Difference of Gaussian (DoG) function. This led us to formulate a new DoG-parametrized convolutional kernel, called DoGConv, that aims to recreate the common kernel patterns observed in the wild. With it we created DoGNeXt, a CNN based on the ConvNeXt V2 architecture. We test DoGNeXt on ImageNet1K and MedMNIST, achieving remarkable results and beating ConvNeXt V2 when training data is scarce. Thanks to the DoGConv parametrization, kernels can be resized without retraining. We use this property to improve the performance of the network on low-resolution images, a setting in which DoGNeXt performs remarkably well. Finally, we highlight a lack of augmentation for small images in the classic ImageNet1K training recipe. We propose an additional augmentation that outperforms vanilla training across a wide range of image resolutions and performs on par with the DoGNeXt kernel resizing technique.
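The abstract does not give the exact DoGConv parametrization, so the following is only a minimal sketch of the general idea: a kernel defined by a few continuous Difference of Gaussian parameters can be sampled on a grid of any size, which is what makes resizing without retraining plausible. The function name `dog_kernel`, its parameters (`sigma_center`, `sigma_surround`, the amplitudes, `ref_size`) and the proportional scaling rule are all assumptions made for illustration, not the thesis's method.

```python
import math

def dog_kernel(size, sigma_center, sigma_surround,
               amp_center=1.0, amp_surround=1.0, ref_size=7):
    """Sample an isotropic Difference of Gaussian on a size x size grid.

    Hypothetical parametrization: the sigmas are expressed in pixels at
    ref_size and scaled proportionally with the requested size, so the same
    parameters give the same spatial shape (up to overall amplitude).
    """
    scale = size / ref_size
    s_c = sigma_center * scale      # narrow "center" Gaussian
    s_s = sigma_surround * scale    # wide "surround" Gaussian
    half = (size - 1) / 2.0         # grid coordinates are centered on the kernel
    kernel = []
    for y in range(size):
        row = []
        for x in range(size):
            r2 = (x - half) ** 2 + (y - half) ** 2
            center = amp_center * math.exp(-r2 / (2 * s_c ** 2)) / (2 * math.pi * s_c ** 2)
            surround = amp_surround * math.exp(-r2 / (2 * s_s ** 2)) / (2 * math.pi * s_s ** 2)
            row.append(center - surround)
        kernel.append(row)
    return kernel

# The same parameters sampled at two kernel sizes (assumed sizes, for illustration).
k7 = dog_kernel(7, sigma_center=1.0, sigma_surround=2.0)
k3 = dog_kernel(3, sigma_center=1.0, sigma_surround=2.0)
```

In a depth-wise convolution, one such kernel would be generated per channel from that channel's learned parameters. Switching to a smaller kernel for low-resolution inputs then only means re-sampling the same function on a coarser grid.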

Document type
Degree thesis (Master's degree)
Thesis author
Gardenal, Davide
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Computer Vision, Deep Learning, CNN, DoGNeXt, ImageNet1K, Image Classification
Thesis defence date
25 March 2025
URI
