Representation Learning for Implicit Neural Representation: a Comparative Analysis

Curti, Iacopo (2023) Representation Learning for Implicit Neural Representation: a Comparative Analysis. [Laurea magistrale], Università di Bologna, Degree programme in Automation Engineering / ingegneria dell'automazione [LM-DM270], full-text document not available
The full text is not available by choice of the author. (Contact the author)


Implicit Neural Representations (INRs) are a widely discussed topic today. An INR is a neural network trained to produce the appropriate measurement value for any input spatial location. In deep learning, a measurement is usually represented on a discrete grid, e.g. a 2D grid of pixels. In contrast, an implicit representation encodes a continuous function (RGB values for images, the Unsigned Distance Function for 3D point clouds) whose inputs are spatial coordinates. As this notion has spread, the idea of exploiting INRs to perform deep learning tasks has recently begun to be investigated. The main problem with INRs is that a single representation, being produced by a network (an MLP), consists of thousands of parameters, and it is not clear how to use such data as input to a new deep learning architecture. Many works therefore try to find an efficient way to encode the content of the continuous datum. Consequently, two main schools of thought for encoding INRs have developed. The first (inr2vec) feeds individual INRs, produced by MLPs, to an embedder that yields a compact latent representation of the continuous function based only on the weights of the input networks. The second (Functa) builds a shared base network that produces compressed INRs directly from discrete data. The resulting latent codes can then be used as input to neural networks.

The main purpose of this thesis is to compare the two approaches, in terms of both reconstruction quality and the semantics of the produced latent codes. The dataset under examination is ModelNet40, so the input data type for both networks is a 3D point cloud. To complete the evaluation, a simple classifier trained on the latent codes produced by the two methods was developed, to assess how well these representations encode class information.
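The idea of an INR described above can be made concrete with a minimal sketch: a small coordinate MLP that maps 3D spatial locations to a scalar measurement (e.g. an Unsigned Distance Function value). This is only an illustrative toy in numpy, not the architecture used in the thesis; the layer sizes, initialization, and activation are assumptions, and the parameter count shown is exactly the kind of flat weight vector that inr2vec-style embedders take as input.

```python
import numpy as np

def init_inr(rng, layer_sizes=(3, 64, 64, 1)):
    """Random weights for a small coordinate MLP (sizes are illustrative)."""
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def inr_forward(params, coords):
    """Map (N, 3) spatial coordinates to (N, 1) measurement values, e.g. UDF."""
    h = coords
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers
    w, b = params[-1]
    return h @ w + b                    # linear output layer

rng = np.random.default_rng(0)
params = init_inr(rng)

# Query the continuous representation at arbitrary 3D locations.
coords = rng.uniform(-1.0, 1.0, size=(5, 3))
values = inr_forward(params, coords)    # one measurement per query point

# Even this tiny network holds thousands of parameters; this flat vector is
# what weight-space approaches like inr2vec must compress into a latent code.
n_params = sum(w.size + b.size for w, b in params)
```

Note that each shape gets its own set of weights under this scheme, which is precisely why a shared-network approach like Functa, producing a small per-shape latent vector instead, is an attractive alternative.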

Document type
Thesis (Laurea magistrale)
Thesis author
Curti, Iacopo
Thesis supervisor
Thesis co-supervisor
Degree programme
Degree programme regulations
Keywords
Deep Learning, Representation Learning, INR, Functa, inr2vec, ModelNet40, Point cloud
Thesis defense date
22 March 2023

Other metadata
