Enhancing a Text-Shape Coherence Metric by Leveraging Contrastive Losses and a Large-Scale Dataset

Mercanti, Davide (2024) Enhancing a Text-Shape Coherence Metric by Leveraging Contrastive Losses and a Large-Scale Dataset. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Creative Commons: Attribution - NonCommercial - ShareAlike 4.0 (CC BY-NC-SA 4.0)

Abstract

This work analyses CrossCoherence (CC), a recently proposed text-shape coherence metric that was trained on chairs and tables only, and extends it to handle (almost) any kind of shape. The metric operates directly on point clouds, which removes the dependence on rendering processes, and uses cross-attention to assess the coherence between text and shape. The analysis shows that (a) contrastive training losses can improve CC's results and (b) increasing data volume and diversity significantly improves the model's generalization and the quality of its results, as is often the case with deep architectures, which enables CC to be used as a general coherence metric.
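
The abstract does not state which contrastive losses were used or how they were attached to CC. As an illustration only, the sketch below shows one common choice, a symmetric InfoNCE-style loss computed over a batch of pairwise text-shape scores. The function names, the temperature value, and the toy score matrices are assumptions made for this example and are not taken from the thesis.

```python
import math


def diagonal_cross_entropy(logits):
    """Mean of -log softmax(row i)[i]: the correct match for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(scores, temperature=0.07):
    """InfoNCE-style loss over a square matrix of pairwise scores.

    scores[i][j] is a hypothetical coherence score between text i and
    shape j; text i is assumed to describe shape i, and all other shapes
    in the batch act as negatives (and vice versa for the shapes).
    """
    logits = [[x / temperature for x in row] for row in scores]
    transposed = [list(col) for col in zip(*logits)]
    text_to_shape = diagonal_cross_entropy(logits)
    shape_to_text = diagonal_cross_entropy(transposed)
    return 0.5 * (text_to_shape + shape_to_text)


# Toy values (made up): matching pairs score highest on the diagonal.
good_scores = [
    [0.9, 0.1, 0.2],
    [0.0, 0.8, 0.1],
    [0.2, 0.1, 0.7],
]
# Same rows in a different order, so the diagonal no longer holds the matches.
bad_scores = [good_scores[1], good_scores[2], good_scores[0]]

good_loss = symmetric_contrastive_loss(good_scores)
bad_loss = symmetric_contrastive_loss(bad_scores)
print(good_loss < bad_loss)
```

The loss is low when each text scores highest against its own shape and grows when mismatched pairs outscore the true ones, which is the behaviour a coherence metric trained this way would be encouraged to learn.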

Document type
Degree thesis (Master's degree)
Thesis author
Mercanti, Davide
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
computer vision, text-shape coherence, coherence, contrastive learning, deep learning
Thesis defence date
19 March 2024
URI
