A study on the application of generative adversarial networks to industrial OCR

Albertazzi, Riccardo (2018) A study on the application of generative adversarial networks to industrial OCR. [Laurea magistrale (master's degree thesis)], Università di Bologna, Degree Programme in Ingegneria informatica (Computer Engineering) [LM-DM270], full text not available
The full text is not available, by the author's choice.


High performance and nearly perfect accuracy are the standards required of OCR algorithms in industrial applications. In recent years, research on Deep Learning has shown that Convolutional Neural Networks (CNNs) are a powerful and robust tool for image analysis and classification; when applied to OCR tasks, CNNs perform much better than previously adopted techniques and easily reach 99% accuracy. However, the effectiveness of Deep Learning models depends on the quality of the data used to train them. This can become a problem because OCR tools can run for months without interruption, and during that time unpredictable variations (printer errors, background changes, lighting conditions) can degrade the accuracy of the trained system. We cannot expect the end user who trains the tool to take thousands of training pictures under different conditions until every imaginable variation has been captured; we must therefore be able to generate these variations programmatically. Generative Adversarial Networks (GANs) are a recent breakthrough in machine learning: these networks learn the distribution of the input data and can therefore generate realistic samples belonging to that distribution. The objective of this thesis is to study in detail how GANs work and to run experiments on generative models that can create unseen variations of OCR training characters, making the whole OCR system more robust to future character variations.

Document type
Degree thesis (Laurea magistrale)
Thesis author
Albertazzi, Riccardo
Thesis supervisor
Thesis co-supervisor
Degree programme
Degree programme regulations
Keywords
OCR, Optical Character Recognition, Machine Learning, Deep Learning, Convolutional Neural Networks, CNN, GAN, Generative Adversarial Networks, Image Processing, Computer Vision, Conditional Generative Adversarial Networks, CGAN, SimGAN
Thesis defence date
23 July 2018
