Reducing Memorization in Latent Diffusion Models for 3D Medical Images Generation

Melacini, Giacomo (2024) Reducing Memorization in Latent Diffusion Models for 3D Medical Images Generation. [Master's degree thesis], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Creative Commons: Attribution - NonCommercial - ShareAlike 4.0 (CC BY-NC-SA 4.0)

Abstract

In the medical domain, machine learning techniques for diagnosis, treatment planning, and medical imaging interpretation are becoming increasingly important. However, these approaches require large amounts of data, which are hard to access because the data are sensitive and raise privacy concerns. Synthetic data generation, enabled by advances in generative techniques, offers a way to create large anonymized datasets for training models without compromising patient privacy. Nonetheless, many studies have reported memorization in such datasets, that is, the exact replication of training images. This dissertation explores the use of Latent Diffusion Models (LDMs) for generating medical data, focusing on head CT scans, and investigates the phenomenon of memorization in synthetic datasets together with methodologies to detect and mitigate it. The study proposes an adaptation of Lowe's ratio test to detect potential copies and evaluates two approaches, Privacy Distillation and Latent Filtering, for their effectiveness in addressing memorization. The findings contribute to understanding the potential of LDMs to generate realistic medical data while reducing concerns about sharing such data. The results validate Lowe's ratio test as a metric for assessing memorization and demonstrate the efficacy of the investigated memorization-countering techniques.

Document type
Degree thesis (Master's degree)
Thesis author
Melacini, Giacomo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Latent Diffusion Models, Machine Learning, Computer Vision, Lowe's ratio, DDPM, VQ-VAE, LDM, Medical data, 3D, Memorization, Generation, 3D medical images, Privacy distillation, Latent Filtering, Generative models, Correlation
Thesis defence date
19 March 2024
URI
