Deana, Alessia
(2024)
Form extractor: information extraction from digital documents.
[Master's degree thesis (Laurea magistrale)], Università di Bologna, Degree Programme in
Artificial intelligence [LM-DM270]. Full-text document not available.
The full text is not available by the author's choice.
Abstract
In the digital age, businesses are increasingly focused on automating the processing of unstructured documents, such as invoices, contracts, and other text-heavy records. This thesis, completed during my internship at Efficiento, explores the development of an advanced solution for intelligent information extraction from digital documents.
The project applies state-of-the-art techniques in Computer Vision (CV) and Natural Language Processing (NLP) to automate the extraction of key data that would traditionally require manual input. Built on MMOCR, the OpenMMLab toolbox for text detection, text recognition, and key information extraction, the solution handles various document formats and detects, recognizes, and extracts critical information from them.
The system integrates DBNet for text detection, ABINet for text recognition, and the Spatial Dual-Modality Graph Reasoning (SDMG-R) model to enhance information extraction by combining both visual and textual data.
Tested on a combination of public and proprietary datasets, the solution achieved high accuracy in extracting essential information. The results demonstrate a significant improvement in both efficiency and precision compared to manual methods, offering a scalable solution adaptable across a wide range of industries.
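As an illustration of how an information-extraction model of this kind can combine layout with text, the sketch below builds a fully connected graph over recognized text boxes and attaches pairwise spatial features to each edge. It is a minimal sketch under stated assumptions, not the thesis's implementation: the `TextBox` type, the function names, the choice of five features, and the normalization constant `norm=10.0` are all hypothetical and only loosely follow the kind of relative-position encoding that graph-based models such as SDMG-R rely on.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextBox:
    """A recognized text region (hypothetical type): axis-aligned box plus its text."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height
    text: str


def relation_features(a: TextBox, b: TextBox, norm: float = 10.0) -> List[float]:
    """Spatial relation of box b relative to box a.

    Returns [dx, dy, aspect ratio of a, height of b / height of a,
    width of b / height of a]. The scaling by `norm` is an assumed constant.
    """
    return [
        (b.x - a.x) / norm,  # horizontal offset, scaled
        (b.y - a.y) / norm,  # vertical offset, scaled
        a.w / a.h,           # aspect ratio of the reference box
        b.h / a.h,           # relative height
        b.w / a.h,           # width of b relative to height of a
    ]


def build_graph(boxes: List[TextBox]) -> List[List[List[float]]]:
    """Fully connected graph: edges[i][j] holds features of box j relative to box i."""
    return [[relation_features(a, b) for b in boxes] for a in boxes]


# Two boxes on the same line, e.g. a field label and its value on an invoice.
boxes = [
    TextBox(x=10, y=20, w=80, h=20, text="Invoice No."),
    TextBox(x=110, y=20, w=60, h=20, text="2024-001"),
]
edges = build_graph(boxes)
print(edges[0][1])  # → [10.0, 0.0, 4.0, 1.0, 3.0]
```

In a full system, edge features like these would be fed to a graph network together with embeddings of each box's text, so that a value such as "2024-001" can be labeled from both its content and its position next to "Invoice No.".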
Document type
Degree thesis
(Master's degree)
Thesis author
Deana, Alessia
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation (Ordinamento CdS)
DM270
Keywords
Key Information Extraction (KIE), Document Processing, Computer Vision, Text Detection, Text Recognition, MMOCR Framework, Automated Extraction
Thesis defence date
8 October 2024
URI