Corsi, Giacomo
 
(2018)
Fast Neural Network Technique for Industrial OCR.
[Master's thesis], Università di Bologna, Degree Programme in
Electronic Engineering [LM-DM270], restricted-access document.
  
 
  
  
        
        
	
  
  
  
  
  
  
  
    
  
    
      
    
  
  
    
      Abstract
      This thesis describes the work done during my internship at Datalogic in Pasadena. The project improves the performance of the Optical Character Recognition (OCR) solution using Deep Learning (DL) techniques. It enhances the previously developed character detection process, which relies on template matching on Histogram of Gradients (HOG) features. That approach had already been validated with good performance, but it detects only characters that do not vary across the dataset.
First, this document gives an introduction to OCR and DL topics, then describes the pipeline of the Datalogic OCR product.
It then explains the technique used to raise the accuracy of the previous solution: applying DL to improve robustness and keep a good detection rate even when the character variations (scale and rotation) are considerable.
The first phase focused on speeding up the process: the function used to gauge the match with the templates, the Zero-mean Normalized Cross-Correlation, was replaced by a modified version called Squared Normalization.
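For reference, the standard Zero-mean Normalized Cross-Correlation score between a feature patch and a template can be sketched as follows. This is a minimal NumPy illustration of the textbook definition, not the thesis code; the function name `zncc` and the `eps` guard are assumptions made for the example. The Squared Normalization variant is not specified in this abstract, so it is not reproduced here.

```python
import numpy as np


def zncc(patch: np.ndarray, template: np.ndarray, eps: float = 1e-12) -> float:
    """Textbook ZNCC between two arrays of identical shape.

    Both inputs are mean-centred, then their dot product is divided by the
    product of their L2 norms, giving a score in [-1, 1].
    """
    if patch.shape != template.shape:
        raise ValueError("patch and template must have the same shape")
    p = patch.astype(np.float64).ravel()
    t = template.astype(np.float64).ravel()
    p = p - p.mean()  # remove the mean of the patch
    t = t - t.mean()  # remove the mean of the template
    denom = np.linalg.norm(p) * np.linalg.norm(t)
    if denom < eps:
        # A constant patch or template has no structure to correlate with.
        return 0.0
    return float(np.dot(p, t) / denom)
```

The score is invariant to a positive gain and an offset applied to either input, which is why it is commonly used for matching. The mean subtraction and the two norms have to be recomputed for every window, which is one plausible reason for wanting a cheaper normalization.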
Second, the original system was cast as a Convolutional Neural Network (CNN) by turning the HOG templates into convolutional kernels. Its training process had to be rethought, because training with standard target values yielded no gain. A novel way of computing the targets, named Graceful Improvement, was developed.
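The idea of treating a template as a convolutional kernel can be illustrated with a small sketch: sliding a template over a feature map and taking a dot product at each position is exactly what a single-filter convolutional layer (stride 1, no padding, no bias) computes. The code below is an assumption-laden toy, not the thesis implementation: the helper `correlate_valid`, the 9-channel feature map (standing in for HOG orientation bins), and the L2 normalization of the kernel are all hypothetical choices made for illustration.

```python
import numpy as np


def correlate_valid(features: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` (kh, kw, C) over `features` (H, W, C) with stride 1.

    Returns an (H - kh + 1, W - kw + 1) response map, i.e. the output of a
    one-filter convolutional layer without padding or bias.
    """
    H, W, C = features.shape
    kh, kw, kc = kernel.shape
    if C != kc:
        raise ValueError("channel counts must match")
    out = np.empty((H - kh + 1, W - kw + 1), dtype=np.float64)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(features[y:y + kh, x:x + kw, :] * kernel)
    return out


# Toy data: a strictly positive random "template" planted in an empty map.
rng = np.random.default_rng(0)
template = rng.random((3, 4, 9)) + 0.1
kernel = template / np.linalg.norm(template)  # template used as the kernel

features = np.zeros((10, 12, 9))
features[3:6, 4:8, :] = template  # plant the template at row 3, column 4

response = correlate_valid(features, kernel)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
```

Here `peak` is the position where the template was planted. Once the templates are kernels, they become trainable weights, which is what makes the question of target values during training relevant.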
Analysis of the results of this new solution then showed that, even though it detects characters that vary with respect to the original templates, the false positive rate across the image was also higher. To reduce this side effect, a fast ROI (Region Of Interest) filter acting on the detections was implemented.
Finally, throughout the development steps above, performance in terms of accuracy and time was evaluated on real datasets from Datalogic customers.
     
    
     
  
  
    
    
      Document type
      Degree thesis (Master's degree)

          Thesis author
          Corsi, Giacomo

          Track
          Curriculum: Electronics and communication science and technology

          Degree programme regulations
          DM270

          Keywords
          Deep Learning, Neural Networks, CNN, Computer Vision, OCR, Histogram Of Gradients

          Thesis defence date
          16 March 2018
      
      
     
   
  
      
      
     
   
  
  
  
  
  
  
    
      
        