Abstract
We study diffusion models and causal transformers under the same lens by treating both architectures as discrete approximations of continuous stochastic processes. To do so, we introduce Continuous Causal Transformers (CCTs), a time- and space-continuous generalization of causal transformers, and provide qualitative evidence showing that vanilla causal transformers implicitly approximate CCTs. We then introduce Structured Autoregressivity, a collection of five properties that are shared by diffusion models and causal transformers, and show how they emerge naturally from our analysis. Finally, we describe the implications of our framework, identifying research directions for the design of both generative and non-generative models.
Document type
Degree thesis
(Master's degree)
Thesis author
Marro, Samuele
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
diffusion models, language models, transformers, causal language modelling, stochastic processes
Thesis defence date
23 July 2024
URI