Automatic generation of synthetic datasets for digital pathology image analysis

D'Agostino, Alessandro (2020) Automatic generation of synthetic datasets for digital pathology image analysis. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Physics [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Subject to any broader authorization granted by the author, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching. Any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Download (28MB)


This project is motivated by a practical problem of timing and accessibility in the analysis of histological samples within the health-care system. I address the generation of synthetic histological images for training neural networks (NNs) to segment real histological images. Collecting real, human-labeled histological samples is time-consuming and expensive, and, because of the intrinsic nature of medical analysis, the collected samples are often not representative of healthy tissue. The method I propose replicates the traditional specimen preparation technique in a virtual environment. The first step is to build a 3D virtual model of a region of the target human tissue. The model should represent all the key features of the tissue: the richer the model, the better the result. The second step is to sample the model through a virtual tomography process, which produces a first, completely labeled image of the section. This image is then processed with several tools to give it a histological-like appearance. The most significant aesthetic post-processing step is a style-transfer neural network, which transfers the typical histological visual texture onto the synthetic image. The procedure is presented in detail for two specific models, one of pancreatic tissue and one of dermal tissue. The labeled section image and its stylized counterpart form a pair of images suitable for supervised learning. The generation process is fully automated and requires no human operator, so it can be used to produce arbitrarily large datasets. The synthetic images are inevitably less complex than real samples and present the NN with an easier segmentation task. However, synthetic images are abundant, and NN training can exploit this abundance by following the so-called curriculum learning strategy.
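The two-step idea described in the abstract (build a 3D model, then cut it virtually to obtain a labeled section) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the "tissue" is a set of random spheres, the cut is an axis-aligned plane at height `z`, and the function names (`make_model`, `section`, `render`, `make_pair`) are hypothetical. The `render` step is only a noisy grayscale placeholder standing in for the post-processing and style-transfer stage, which it does not implement.

```python
import random


def make_model(n_cells=30, size=64.0, seed=0):
    """Toy 3D 'tissue': spheres labeled 1..n_cells scattered in a cube.

    Each entry is (cx, cy, cz, radius, label); label 0 is background.
    """
    rng = random.Random(seed)
    return [
        (rng.uniform(0, size), rng.uniform(0, size), rng.uniform(0, size),
         rng.uniform(3.0, 8.0), label)
        for label in range(1, n_cells + 1)
    ]


def section(model, z, size=64.0, resolution=64):
    """Virtual cut at height z: return a label mask as a list of rows."""
    step = size / resolution
    mask = [[0] * resolution for _ in range(resolution)]
    for row in range(resolution):
        for col in range(resolution):
            # Sample the model at the centre of each pixel on the cutting plane.
            x, y = (col + 0.5) * step, (row + 0.5) * step
            for cx, cy, cz, r, label in model:
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r * r:
                    mask[row][col] = label  # first sphere containing the point wins
                    break
    return mask


def render(mask, seed=0):
    """Placeholder appearance step: dark noisy cells on a bright noisy background.

    This is NOT style transfer; it only gives the mask an image-like counterpart.
    """
    rng = random.Random(seed)
    return [
        [min(255, max(0, int((90 if v else 220) + rng.gauss(0, 10)))) for v in row]
        for row in mask
    ]


def make_pair(z, seed=0):
    """Return (image, mask): one synthetic training pair, labeled by construction."""
    mask = section(make_model(seed=seed), z)
    return render(mask, seed), mask
```

Because the labels come from the model itself, every generated image is paired with an exact ground-truth mask at no annotation cost; calling `make_pair` with different `z` and `seed` values yields as many pairs as needed.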

Document type
Degree thesis (Laurea magistrale)
Thesis author
D'Agostino, Alessandro
Thesis supervisor
Thesis co-supervisor
Degree programme
Applied Physics
Degree programme regulations
Keywords
DPIA, Digital Pathology, Deep Learning, Histological Images, Segmentation
Thesis defense date
23 October 2020
