Efficient dataset labeling for robotic object perception exploiting eye-in-hand RGB-D camera

Yazdizadeh Baghini, Amineh (2025) Efficient dataset labeling for robotic object perception exploiting eye-in-hand RGB-D camera. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree programme in Automation engineering / Ingegneria dell'automazione [LM-DM270]. Full text not available.
The full text is not available at the author's request. (Contact the author)

Abstract

Robotic perception faces significant challenges in accurately handling diverse objects with complex geometries, particularly in dynamic environments. Traditional dataset labeling methods cope poorly with varying object geometries and occlusions, and manual annotation itself is slow and labour-intensive. To address these issues, this study proposes an eye-in-hand RGB-D camera setup for efficient dataset labeling, integrating stereo vision and data fusion techniques to enhance perception accuracy. The methodology employs a Universal Robots UR5 robotic arm equipped with a Luxonis OAK-D RGB-D camera to capture multi-view data along an ellipsoidal trajectory. A semi-automated annotation strategy is introduced, leveraging the High-Quality Segment Anything Model (HQ-SAM) for adaptive segmentation. The process begins with sparse manual annotations in the first frame, which are then propagated iteratively across subsequent frames using a randomized point-selection method. This method refines key points based on depth-covariance constraints, ensuring robust adaptation to occlusions and object variations. The proposed approach significantly improves dataset annotation efficiency, reduces manual labeling effort, and enhances segmentation accuracy. By dynamically generating annotation points across frames, the system mitigates occlusion-related errors and maintains consistent labeling across complex object geometries. Experimental results demonstrate the effectiveness of this approach in capturing high-quality RGB-D datasets for robotic perception, particularly for diverse objects with complex structures.
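The propagation step described in the abstract can be pictured with a short sketch. This is only an illustration, not the thesis's implementation: it assumes a SAM-style predictor interface (set_image / predict with point prompts), which HQ-SAM also exposes, and the helper names (sample_prompts_from_mask, propagate_labels) and the depth-variance threshold are hypothetical stand-ins for the depth-covariance constraint mentioned above.

    # Minimal sketch of frame-to-frame prompt propagation (illustrative only).
    # Assumptions: per-frame RGB-D pairs, a SAM-style predictor, and an
    # arbitrary depth-variance threshold standing in for the covariance check.
    import numpy as np

    def sample_prompts_from_mask(mask, depth, n_points=10, max_depth_var=1e-3, rng=None):
        """Randomly sample point prompts inside `mask`, keeping only points whose
        local depth neighbourhood has low variance (stand-in for the thesis's
        depth-covariance constraint); the threshold value is illustrative."""
        rng = rng or np.random.default_rng()
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            return np.empty((0, 2))
        keep = []
        for _ in range(n_points * 5):            # oversample, then filter
            i = rng.integers(len(xs))
            y, x = ys[i], xs[i]
            patch = depth[max(0, y - 2):y + 3, max(0, x - 2):x + 3]
            patch = patch[patch > 0]             # ignore invalid depth readings
            if patch.size and np.var(patch) < max_depth_var:
                keep.append((x, y))
            if len(keep) == n_points:
                break
        return np.array(keep, dtype=float)

    def propagate_labels(predictor, frames, depths, init_points):
        """Segment the first frame from sparse manual prompts, then re-sample
        prompts from each predicted mask to label the next frame. `predictor`
        is assumed to expose the SAM predictor API (set_image / predict)."""
        masks, points = [], np.asarray(init_points, dtype=float)
        for rgb, depth in zip(frames, depths):
            predictor.set_image(rgb)
            mask_set, scores, _ = predictor.predict(
                point_coords=points,
                point_labels=np.ones(len(points)),   # all prompts are foreground
                multimask_output=False,
            )
            mask = mask_set[0].astype(bool)
            masks.append(mask)
            # Re-sample prompts from the new mask so the next frame adapts to
            # occlusions and viewpoint changes instead of reusing stale points.
            new_points = sample_prompts_from_mask(mask, depth)
            if len(new_points):
                points = new_points
        return masks

Re-sampling prompts from each predicted mask, rather than reusing the first frame's manual clicks, is what lets the labels follow the object as the eye-in-hand camera moves along its trajectory.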

Document type
Master's thesis (Laurea magistrale)
Thesis author
Yazdizadeh Baghini, Amineh
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation (Ordinamento CdS)
DM270
Keywords
Robotic Perception, RGB-D Cameras, Dataset Labeling, Complex Object Geometries, Semi-Automated Annotation, Annotation Propagation, Occlusion Handling, HQ-SAM (High-Quality Segment Anything Model)
Thesis defense date
24 March 2025
URI
