Mixing Pruning and Distillation for Lighter Diffusion Models

Dell'Olio, Domenico (2024). Mixing Pruning and Distillation for Lighter Diffusion Models. Master's thesis (Laurea magistrale), Università di Bologna, Corso di Studio in Artificial Intelligence [LM-DM270].
Full-text PDF available (7 MB), released under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 (CC BY-NC-ND 4.0) license.

Abstract

Diffusion Models (DMs) represent the state of the art in image generation in terms of training stability and sample quality, but their sampling procedure is highly resource-intensive. This thesis addresses the efficiency problem of DMs by proposing a novel method that combines Progressive Distillation with structured pruning to reduce computational and memory overhead without severely compromising image quality. Progressive Distillation, introduced in "Progressive Distillation for Fast Sampling of Diffusion Models" by Salimans and Ho, reduces the number of required sampling steps by iteratively training a student DM to match the teacher model's output in half the steps. However, the method requires the student to retain the same network architecture as the teacher, which limits further compression. For this reason, we introduce a structured pruning technique into the distillation process, incorporating concepts such as pruning-ratio differentiation based on layer location and normalization-layer-led pruning. We also introduce Flexible Group Normalization (FGN), a variation of the Group Normalization layer that handles the uneven channel groups left after pruning. We validate our method with experiments on the CIFAR-10 dataset, conducting pruning-sensitivity and weight-magnitude-variation analyses and comparing different pruning scoring criteria to refine our approach. Although they sacrifice some sample quality and are not particularly optimized, the pruned models achieve significant reductions in computational requirements, with the best quality/compression trade-offs observed in the 8-step and 4-step models. Our method offers a possible solution to DM efficiency and provides cues for further research into this family of combined complexity-reduction techniques.
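
The abstract does not give implementation details for FGN, so the following is only a minimal sketch of the general idea: standard Group Normalization requires the channel count to be divisible by the number of groups, whereas a "flexible" variant can split an arbitrary, pruned channel count into nearly equal (uneven) groups and normalize each group independently. The class name FlexibleGroupNorm and every design choice below are hypothetical and may differ from the layer actually used in the thesis.

import torch
import torch.nn as nn

class FlexibleGroupNorm(nn.Module):
    # Group normalization that tolerates channel counts not divisible by the
    # number of groups, e.g. after structured pruning. Sketch only: the
    # thesis's actual FGN layer may be defined differently.
    def __init__(self, num_groups: int, num_channels: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Split the channels into contiguous groups whose sizes differ by at
        # most one channel, so uneven groups are allowed.
        base, rem = divmod(num_channels, num_groups)
        sizes = [base + (1 if i < rem else 0) for i in range(num_groups)]
        self.group_sizes = [s for s in sizes if s > 0]
        self.weight = nn.Parameter(torch.ones(num_channels))
        self.bias = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (N, C, H, W); normalize each channel group independently.
        outs, start = [], 0
        for size in self.group_sizes:
            g = x[:, start:start + size]
            mean = g.mean(dim=(1, 2, 3), keepdim=True)
            var = g.var(dim=(1, 2, 3), unbiased=False, keepdim=True)
            outs.append((g - mean) / torch.sqrt(var + self.eps))
            start += size
        out = torch.cat(outs, dim=1)
        return out * self.weight.view(1, -1, 1, 1) + self.bias.view(1, -1, 1, 1)

if __name__ == "__main__":
    # Example: 37 channels left after pruning cannot be split into 8 equal
    # groups (nn.GroupNorm(8, 37) would raise an error), but this layer splits
    # them into five groups of 5 and three groups of 4.
    layer = FlexibleGroupNorm(num_groups=8, num_channels=37)
    y = layer(torch.randn(2, 37, 16, 16))
    print(y.shape)  # torch.Size([2, 37, 16, 16])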

Document type: Master's thesis (Tesi di laurea, Laurea magistrale)
Thesis author: Dell'Olio, Domenico
Degree programme: Artificial Intelligence [LM-DM270]
Degree programme regulation (Ordinamento CdS): DM270
Keywords: Diffusion Models, Image Generation, Progressive Distillation, Model Distillation, Model Pruning, Complexity-Reduction, Flexible Group Normalization, CIFAR-10, Magnitude Pruning
Thesis defence date: 23 July 2024