Fast Neural Network Technique for Industrial OCR

Corsi, Giacomo (2018) Fast Neural Network Technique for Industrial OCR. [Master's degree thesis], Università di Bologna, Degree Programme in Electronic Engineering [LM-DM270], Restricted-access document.

Full-text documents available:

PDF document (Thesis)
Full text not accessible
Available under license: Creative Commons Attribution-ShareAlike 3.0 (CC BY-SA 3.0)

Abstract

This thesis describes the work I carried out during my internship at Datalogic in Pasadena. The project improves the performance of Datalogic's Optical Character Recognition (OCR) solution by means of Deep Learning (DL) techniques. It enhances the previously developed character detection process, which relies on template matching performed on Histogram of Oriented Gradients (HOG) features. That approach had already been validated with good performance, but it detects only characters whose appearance does not vary across the dataset. First, this document gives an introduction to OCR and DL, then describes the pipeline of the Datalogic OCR product. It then explains the technique used to raise the accuracy of the previous solution, which consists in applying DL to improve robustness and maintain a good detection rate even when character variations (scale and rotation) are considerable. The first phase focused on speeding up the process: the function used to score the match against the templates, the Zero-mean Normalized Cross-Correlation, was replaced by a modified version called Squared Normalization. Second, the original system was cast as a Convolutional Neural Network (CNN) by turning the HOG templates into convolutional kernels. Its training process had to be rethought, because training with standard target values yielded no gain. A novel way of computing the targets, named Graceful Improvement, was developed. Analysis of the results of this new solution showed that, although it detects characters that deviate from the original templates, the false positive rate across the image was also higher. To reduce this side effect, a fast Region Of Interest (ROI) filter acting on the detections was implemented. Finally, throughout these development steps, performance in terms of accuracy and time was evaluated on real Datalogic customer datasets.
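
The link between template matching and convolutional kernels mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it works on a single-channel 2D feature map rather than real HOG features, the function names (template_to_kernel, cross_correlate, zncc_map) are hypothetical, and the Squared Normalization and Graceful Improvement methods are not reproduced because the abstract does not define them. The sketch only shows the standard Zero-mean Normalized Cross-Correlation (ZNCC) and the fact that, once a template is made zero-mean and unit-norm, the ZNCC numerator is a plain sliding cross-correlation, which is the operation a CNN convolution layer computes.

```python
import math


def template_to_kernel(template):
    """Return a zero-mean, unit-norm copy of a 2D template (hypothetical helper)."""
    flat = [v for row in template for v in row]
    mean = sum(flat) / len(flat)
    norm = math.sqrt(sum((v - mean) ** 2 for v in flat))
    if norm == 0:
        raise ValueError("template has no contrast")
    return [[(v - mean) / norm for v in row] for row in template]


def cross_correlate(feature_map, kernel):
    """Valid-mode 2D cross-correlation, as computed by a CNN convolution layer."""
    kh, kw = len(kernel), len(kernel[0])
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            sum(
                feature_map[y + i][x + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for x in range(width - kw + 1)
        ]
        for y in range(height - kh + 1)
    ]


def zncc_map(feature_map, template):
    """ZNCC score of the template at every valid position of the feature map."""
    kernel = template_to_kernel(template)
    kh, kw = len(kernel), len(kernel[0])
    # Because the kernel is zero-mean, correlating it with the raw patch gives
    # the same value as correlating it with the mean-subtracted patch.
    responses = cross_correlate(feature_map, kernel)
    scores = []
    for y, row in enumerate(responses):
        score_row = []
        for x, response in enumerate(row):
            patch = [
                feature_map[y + i][x + j] for i in range(kh) for j in range(kw)
            ]
            mean = sum(patch) / len(patch)
            norm = math.sqrt(sum((p - mean) ** 2 for p in patch))
            # Flat patches carry no shape information, so score them as 0.
            score_row.append(response / norm if norm > 0 else 0.0)
        scores.append(score_row)
    return scores
```

In this sketch only the per-patch normalization remains outside the cross-correlation, so the template itself can be stored as a fixed kernel. A template cut from the feature map scores 1 at its own location, and the score is unchanged if the template is brightened or rescaled by a positive factor.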

Document type

Degree thesis (Master's degree)

Thesis author

Corsi, Giacomo

Thesis supervisor

Di Stefano, Luigi

Thesis co-supervisor

Goncalves, Luis

School

Engineering and Architecture

Degree programme