CVAT meets transformers: accelerating semantic segmentation labeling in industrial applications

Procino, Edoardo (2024) CVAT meets transformers: accelerating semantic segmentation labeling in industrial applications. [Master's thesis], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270]


Full-text documents available:

PDF document (Thesis, 30MB)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed strictly for personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

Semantic segmentation plays a crucial role in various industrial applications by enabling detailed analysis and understanding of images at the pixel level. This thesis presents a novel approach to streamlining the semantic segmentation labeling process by integrating the Computer Vision Annotation Tool (CVAT) with advanced Transformer models, with a specific focus on industrial applications. Traditional methods for preparing semantic segmentation datasets rely on manual annotation, which is time-consuming and economically burdensome. To address these challenges, we explore the integration of CVAT, an online tool designed for efficient image labeling, with Transformer-based models. Our methodology involves a semi-automatic pipeline that leverages the Segment Anything Model (SAM) within CVAT for initial annotations, followed by fine-tuning of YOLO, SegFormer, and the ViT-Adapter, which can then be used to label new images and assist the labeler in their work. We detail the process of fine-tuning these models and discuss the resulting improvements in terms of time saved.


Document type

Master's thesis (Laurea magistrale)

Thesis author

Procino, Edoardo

Thesis supervisor

Di Stefano, Luigi

Thesis co-supervisor

Casadio, Giuseppe

School

Engineering and Architecture

Degree programme

Artificial intelligence [LM-DM270]

Degree programme regulations

DM270

Keywords

cvat,computer vision,transformers,vision transformers,ViT,segformer,yolo,YOLOv8,ViT-Adapter,Adapter,labeling,annotations,deep learning,artificial intelligence

Thesis defence date

19 March 2024

URI

https://amslaurea.unibo.it/id/eprint/31684
