Saturno, Edoardo
(2025)
3D reconstruction from uncalibrated collections of normal maps.
[Master's degree thesis], Università di Bologna, Degree Programme in
Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal study, research, and teaching purposes; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.
Download (6MB)
Abstract
3D reconstruction is a fundamental task in computer vision that aims to generate accurate digital representations of real-world objects from 2D images. Traditional approaches, such as the DUSt3R method, take RGB images as input and rely on massive training datasets to reconstruct a wide variety of objects without requiring camera parameters. Although they achieve excellent coverage of the original object's surface, these models still struggle to capture fine details.
The proposed solution addresses this weakness by using a different kind of data for training the architecture. Normal maps are images that encode surface orientation vectors using colors, and this work explores their potential as an alternative to traditional RGB-based datasets. The goal is to use the additional geometric information contained in normal maps to improve the performance of the DUSt3R method, enhancing 3D reconstruction accuracy for small surface details. While synthetic data is used for training, evaluation is conducted using real data, specifically from the DiLiGenT-MV dataset.
Results indicate that when the model is trained with normal maps, the reconstruction accuracy of facial features, robe folds, and other small surface details improves, while good shape coverage of the entire object is maintained. These findings highlight the potential to overcome the limitations of previous approaches by incorporating richer geometric cues during model training. At the same time, there is room for further improvement, for example by adding normal-specific losses to the training objective or by training on a combination of RGB and normal data.
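The abstract describes normal maps as images that encode surface orientation vectors using colors. As a minimal sketch of that idea, the Python functions below map a unit normal to an 8-bit RGB triple and back. The linear mapping from [-1, 1] to [0, 255] is a widely used convention and is an assumption here; the thesis may use a different encoding or axis orientation, and the function names `encode_normal` and `decode_normal` are hypothetical.

```python
import math


def encode_normal(n):
    """Map a surface normal (x, y, z) to an 8-bit RGB triple.

    Assumed convention (not taken from the thesis): each component of the
    normalized vector is mapped linearly from [-1, 1] to [0, 255].
    """
    x, y, z = n
    length = math.sqrt(x * x + y * y + z * z)
    if length == 0.0:
        raise ValueError("a normal must have non-zero length")
    return tuple(round((c / length * 0.5 + 0.5) * 255) for c in (x, y, z))


def decode_normal(rgb):
    """Invert encode_normal for integer RGB values in [0, 255].

    The result is renormalized because 8-bit quantization slightly
    changes the vector length.
    """
    x, y, z = (c / 255.0 * 2.0 - 1.0 for c in rgb)
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)


# A normal pointing straight along +z encodes to a bluish pixel.
pixel = encode_normal((0.0, 0.0, 1.0))
recovered = decode_normal(pixel)
```

Under this convention, a surface facing the viewer appears predominantly blue, and quantization to 8 bits means the decoded vector only approximates the original.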
Document type
Degree thesis
(Master's degree)
Thesis author
Saturno, Edoardo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Computer Vision, 3D reconstruction, MVS, DUSt3R, normal maps
Thesis defence date
25 March 2025
URI