LLM-Assisted ESG Data Extraction from Corporate Reports

Imanbayeva, Leilya (2025) LLM-Assisted ESG Data Extraction from Corporate Reports. [Master's degree thesis], Università di Bologna, Degree programme in Digital transformation management [LM-DM270] - Cesena, full-text document not available
The full text is not available by the author's choice.

Abstract

This thesis focuses on the development and assessment of a comprehensive pipeline aimed at extracting Environmental, Social, and Governance (ESG) key performance indicators (KPIs) from corporate websites and sustainability reports. The proposed pipeline integrates web discovery, a Large Language Model (LLM) for information extraction, and a deterministic post-processing mechanism to standardize noisy outputs into structured, analysis-ready tables. A case study carried out at Illuminem is used to evaluate field-level accuracy, with an exploration of challenges stemming from unstructured data formats, variability in model outputs, and inconsistencies in reporting practices.
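The deterministic post-processing step and the field-level accuracy measure mentioned above can be illustrated with a minimal sketch. The full text is not available, so everything here is an assumption made for illustration rather than the thesis's actual implementation: the names `normalize_kpi`, `field_accuracy`, and `UNIT_ALIASES`, the unit mapping, the exact-match scoring rule, and the assumption that numbers use commas as thousands separators.

```python
import re
from typing import Optional

# Hypothetical unit aliases; the thesis does not publish its actual mapping.
UNIT_ALIASES = {
    "tco2e": "tCO2e",
    "t co2e": "tCO2e",
    "tonnes co2e": "tCO2e",
    "mwh": "MWh",
    "%": "%",
}

# A leading number (commas assumed to be thousands separators), then free text.
_VALUE_RE = re.compile(r"^\s*([-+]?\d[\d,]*(?:\.\d+)?)\s*(.*?)\s*$")


def normalize_kpi(raw: str) -> Optional[dict]:
    """Parse a noisy string such as '1,234.5 tonnes CO2e' into value and unit."""
    match = _VALUE_RE.match(raw)
    if match is None:
        return None  # unparseable output is left for manual review
    number, unit = match.groups()
    value = float(number.replace(",", ""))
    unit_key = " ".join(unit.lower().split())  # lowercase, collapse whitespace
    return {"value": value, "unit": UNIT_ALIASES.get(unit_key, unit)}


def field_accuracy(predicted: list[dict], gold: list[dict], field: str) -> float:
    """Share of reference records whose `field` is matched exactly."""
    if not gold:
        raise ValueError("gold must not be empty")
    hits = sum(p.get(field) == g.get(field) for p, g in zip(predicted, gold))
    return hits / len(gold)
```

Because the normalization is rule-based, the same raw string always yields the same row, which keeps the variability of the LLM output out of the final table. Exact-match scoring is the strictest option; a tolerance on numeric fields would be a reasonable alternative.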

Document type
Degree thesis (Master's degree)
Thesis author
Imanbayeva, Leilya
Thesis supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
ESG, sustainability reporting, data extraction, large language models, NLP, information retrieval, automation, data validation, hybrid pipeline, corporate disclosures
Thesis defence date
27 October 2025
URI
