An experimental study on connecting Neural Radiance Fields to images and text

Torzi, Luca (2025) An experimental study on connecting Neural Radiance Fields to images and text. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)


Abstract

Being able to process 3D data is an important capability for a deep learning architecture, because it can unlock a deeper understanding of the world around us. It is a hard goal to achieve, however, because 3D data are stored in many different formats, which has led to the development of separate techniques specialized for each of them. In recent years, computer vision research has focused on learning 3D data implicitly, so that a neural network learns a continuous function describing the properties of the object of interest; this representation is called an Implicit Neural Representation or Neural Field. Among these, NeRF (Neural Radiance Field) has been one of the most promising methodologies for learning functions that represent 3D objects and scenes. Consequently, recent work has proposed methodologies to process neural network weights directly, in order to generate a compact embedding that can be used to perform deep learning tasks efficiently. Building on this, other work has explored how to link this embedding space to the embedding spaces of images and text. The goal of this thesis is to extend that analysis by investigating the use of contrastive losses during the training phase. Moreover, we study which embedding space (the NeRF one or the image/text one) is more effective for retrieving NeRFs. Finally, the NeRF embedding model is fine-tuned to explore how the whole architecture behaves once it is influenced by the joint image/text embeddings during training.
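
As an illustration of the two ideas named in the abstract (a contrastive loss that pulls paired embeddings together, and retrieval by similarity in a shared embedding space), here is a minimal NumPy sketch. It is not taken from the thesis: the abstract does not say which contrastive loss is used, so a CLIP-style symmetric InfoNCE loss is assumed, and the function names, the temperature value, and the use of cosine similarity for retrieval are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(nerf_emb, clip_emb, temperature=0.07):
    """Assumed CLIP-style symmetric InfoNCE loss (not confirmed by the thesis).

    Row i of nerf_emb and row i of clip_emb are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    a = l2_normalize(nerf_emb)
    b = l2_normalize(clip_emb)
    logits = a @ b.T / temperature            # (batch, batch) similarity matrix
    idx = np.arange(len(a))
    loss_a2b = -log_softmax(logits, axis=1)[idx, idx].mean()  # NeRF -> image/text
    loss_b2a = -log_softmax(logits, axis=0)[idx, idx].mean()  # image/text -> NeRF
    return 0.5 * (loss_a2b + loss_b2a)


def retrieve(query_emb, gallery_emb, k=1):
    """Return, for each query row, the indices of the k most similar gallery rows."""
    sims = l2_normalize(query_emb) @ l2_normalize(gallery_emb).T
    return np.argsort(-sims, axis=1)[:, :k]
```

With such a sketch, correctly paired batches should give a lower loss than mismatched ones, and calling retrieve with queries from one modality against a gallery of NeRF embeddings returns ranked candidate indices. Which space the queries and gallery should live in is exactly the question the thesis studies, so the sketch takes no position on it.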

Document type
Master's degree thesis (Laurea magistrale)
Thesis author
Torzi, Luca
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations (Ordinamento CdS)
DM270
Keywords
Neural Radiance Field, NeRF, Implicit Neural Representation, Multi-Modal Model, CLIP, Computer Vision, Deep Learning, Artificial Intelligence
Thesis defense date
7 February 2025
URI
