Conti, Matteo
(2024)
A Comprehensive Benchmark of Efficient Text-Driven 3D Generative Models.
[Master's thesis], Università di Bologna, Degree programme in
Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal study, research, and teaching purposes; any directly or indirectly commercial use is expressly forbidden. All other rights to the material are reserved.
Abstract
This work presents an in-depth analysis of recent text-driven 3D generation models, focusing on evaluating their performance under strict efficiency constraints, including hardware requirements and generation time.
The study compared the performance of these models under efficiency constraints against their original, unconstrained results, and incorporated a new benchmark that evaluates text-driven 3D generation models on the quality and coherence of the generated 3D shapes.
The research involved selecting promising models and testing the impact of efficiency constraints on their performance.
The results revealed that the quality of generated 3D shapes heavily depends on the input prompt and the specific model used, with some models performing significantly better on certain prompts due to their unique 3D shape parametrization techniques. Efficiency constraints generally resulted in coarser 3D shapes with more artifacts, especially in models requiring substantial computational resources. However, some models still managed to generate detailed and semantically accurate 3D shapes within these constraints.
The study also evaluated model performance on the T3Bench benchmark, observing lower Quality scores, attributable to the coarser 3D shapes, but higher Alignment scores. This raises an interesting question about the relationship between efficient 3D generation and prompt coherence when the latter is evaluated on less detailed 3D shapes.
In conclusion, the research highlights significant potential for future exploration in efficient text-driven 3D generation models, particularly in improving detail precision and prompt semantic capture within efficiency constraints. It suggests the importance of developing a suitable prompt dataset for evaluating constrained models and emphasizes the challenge of balancing 3D quality with text-prompt coherence as key areas for future research.
Document type
Degree thesis
(Master's degree)
Thesis author
Conti, Matteo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
3D Generation, Efficient Generation, Text-Driven Generation, 3D Benchmark, 3D Generative Models, Text-To-3D Generation
Thesis defence date
19 March 2024
URI