A deep learning solution for industrial OCR applications

Lamberti, Lorenzo (2019) A deep learning solution for industrial OCR applications. [Master's thesis], Università di Bologna, Degree Programme in Electronic Engineering [LM-DM270], Restricted-access document.

Full-text documents available:

PDF document (Thesis)
Full text accessible only to institutional users of the University
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Abstract

This thesis describes a project developed during a six-month internship at the Machine Vision Laboratory of Datalogic in Pasadena, California. The project aims to develop a deep learning system as a possible solution for industrial optical character recognition (OCR) applications. In particular, it focuses on You Only Look Once (YOLO), a general-purpose object detector based on convolutional neural networks that currently offers a state-of-the-art trade-off between speed and accuracy. The algorithm is well known for its impressive processing speed, but its intrinsic structure makes it struggle to detect small objects clustered together, which unfortunately matches our scenario: we read alphanumeric codes by detecting each individual character and then reconstructing the final string. The goal of this thesis is to overcome this drawback and push the accuracy of a general-purpose object detection convolutional neural network to its limits, in order to meet the demanding requirements of industrial OCR applications. To accomplish this, YOLO's detection approach was first studied in depth in its original framework, Darknet, written in C and CUDA; all the code was then ported to Python for greater flexibility, which also allowed the deployment of a custom architecture. Four datasets of increasing complexity were used as case studies, and the final results were surprisingly good: accuracy ranges between 99.75% and 99.97%, with a processing time of 15 ms for $1000\times1000$ pixel images, considerably faster than the deep learning solution currently deployed by Datalogic. On the downside, training usually requires a very large amount of data and time, and YOLO also showed some memorization behaviour when the training data did not provide enough variability.
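The abstract describes reading a code by detecting each character separately and then reconstructing the final string. A minimal sketch of that reconstruction step is given below. It is not taken from the thesis: the `Detection` structure, the `reconstruct_string` helper, and the `min_confidence` threshold are hypothetical names chosen for illustration, and the sketch assumes a single horizontal line of text with one surviving box per character.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected character (hypothetical structure, not from the thesis)."""
    label: str         # predicted character class, e.g. "A" or "7"
    x_center: float    # horizontal box centre in pixels
    y_center: float    # vertical box centre in pixels
    confidence: float  # detector score in [0, 1]


def reconstruct_string(detections: List[Detection],
                       min_confidence: float = 0.5) -> str:
    """Drop low-confidence boxes, then read the remaining ones left to right.

    Assumes a single horizontal line of text (an assumption of this sketch).
    """
    kept = [d for d in detections if d.confidence >= min_confidence]
    kept.sort(key=lambda d: d.x_center)
    return "".join(d.label for d in kept)


if __name__ == "__main__":
    detections = [
        Detection("B", 130.0, 52.0, 0.98),
        Detection("A", 100.0, 50.0, 0.99),
        Detection("7", 190.0, 51.0, 0.97),
        Detection("1", 160.0, 50.0, 0.30),  # below threshold, discarded
        Detection("3", 161.0, 50.0, 0.95),
    ]
    print(reconstruct_string(detections))  # AB37
```

Multi-line codes or rotated text would need an extra grouping step (for example, clustering boxes by `y_center` before sorting), which this sketch deliberately leaves out.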

Document type

Degree thesis (Master's degree)

Thesis author

Lamberti, Lorenzo

Thesis supervisor

Di Stefano, Luigi

Thesis co-supervisors

Goncalves, Luis; Mambelli, Filippo

School

Engineering and Architecture

Degree programme

Electronic Engineering [LM-DM270]

Degree programme regulations

DM270

Keywords

Deep Learning, Convolutional Neural Networks, Optical Character Recognition, Object Detection, You Only Look Once, YOLO, Image Processing, Computer Vision, Industrial OCR, Artificial Intelligence

Thesis defence date

19 December 2019

URI

https://amslaurea.unibo.it/id/eprint/19777
