Enhancing a Text-Shape Coherence Metric by Leveraging Contrastive Losses and a Large-Scale Dataset

Mercanti, Davide (2024) Enhancing a Text-Shape Coherence Metric by Leveraging Contrastive Losses and a Large-Scale Dataset. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]

Full-text documents available:

PDF document (Thesis)
Available under licence: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)
Download (7MB)

Abstract

This work analyses CrossCoherence (CC), a recently proposed text-shape coherence metric trained only on chairs and tables, and extends it to handle (almost) any kind of shape. The metric operates directly on point clouds, which removes the dependence on rendering processes, and leverages cross-attention to assess the coherence between text and shape. The analysis shows that (a) contrastive training losses can improve CC results and that (b) increasing data volume and diversity significantly improves the model's generalization and the quality of its results, as is often the case with deep architectures, enabling CC to be used as a general coherence metric.

Document type

Degree thesis (Master's degree)

Thesis author

Mercanti, Davide

Thesis supervisor

Salti, Samuele

Thesis co-supervisor

Amaduzzi, Andrea

School

Engineering and Architecture

Degree programme