Deploying deep learning for 3D reconstruction from monocular video sequences

Bartoli, Simone (2021) Deploying deep learning for 3D reconstruction from monocular video sequences. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Ingegneria informatica [LM-DM270], restricted-access document.
Full-text documents available:
PDF document (Thesis)
Full text accessible only to institutional users of the University
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed strictly for personal study, research and teaching purposes. Any directly or indirectly commercial use is expressly forbidden. All other rights to the material are reserved.


Abstract

3D reconstruction from monocular video sequences has attracted increasing interest in recent years. Before the growth of deep learning, retrieving depth information from single images was possible only with RGBD sensors or algorithmic approaches. However, the availability of more and more data has allowed the training of monocular depth estimation neural networks, introducing innovative data-driven techniques. Since recovering ground-truth labels for depth estimation is very challenging, most of the research has focused on unsupervised or semi-supervised training approaches. The current state of the art for 3D reconstruction is defined by an algorithmic method which exploits a Structure from Motion and Multi-View Stereo pipeline. Nevertheless, the whole approach is based on keypoint extraction, which has well-known limitations when it comes to texture-less, reflective and/or transparent surfaces. Consequently, a possible way to predict dense depth maps even in the absence of keypoints is to employ neural networks. This work proposes a novel data-driven pipeline for 3D reconstruction from monocular video sequences. It exploits a fine-tuning technique to adjust the weights of a pre-trained depth estimation neural network depending on the input scene. In doing so, the network can learn the features of a particular object and can provide semi-real-time depth predictions for 3D reconstruction. Furthermore, the project provides a comparison with a custom implementation of the current state-of-the-art approach and shows the potential of this innovative data-driven pipeline.
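
As an illustration of how a dense depth map relates to 3D geometry, the following is a minimal sketch of back-projecting per-pixel depth into camera-frame 3D points with a pinhole camera model. It is not taken from the thesis: the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy` and the toy depth values are all assumptions made for this example, and it presumes the network output has already been converted to consistently scaled depth.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an H x W depth map to an (H*W) x 3 array of camera-frame points.

    Hypothetical helper for illustration only. Assumes a pinhole camera with
    focal lengths (fx, fy) and principal point (cx, cy), all in pixels, and
    depth measured along the optical axis.
    """
    h, w = depth.shape
    # Pixel coordinate grids: u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Invert the pinhole projection u = fx * X / Z + cx, v = fy * Y / Z + cy.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy input: a flat 2 x 2 depth map one unit away from the camera.
depth = np.ones((2, 2))
points = backproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Point clouds obtained this way from several frames could then be fused, for instance with a Truncated Signed Distance Function as listed among the keywords, once the camera pose of each frame is known.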

Document type
Degree thesis (Laurea magistrale)
Thesis author
Bartoli, Simone
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Deep Learning, Computer Vision, Neural Network, Depth Estimation, Colmap, Consistent Video Depth Estimation, 3D Reconstruction, Truncated Signed Distance Function, OpenMVS, Structure from Motion, Multi View Stereo, MiDaS2, FlowNet2, Depth Maps
Thesis defence date
11 March 2021
URI
