Ruberg, Nicolaas
(2021)
Bert goes sustainable: an NLP approach to ESG financing.
[Master's degree thesis], Università di Bologna, Degree programme in
Artificial Intelligence [LM-DM270]
Abstract
Environmental, Social, and Governance (ESG) factors are a strategic topic for investors and financing institutions like the Brazilian Development Bank (BNDES). Currently, the bank's experts are developing a framework based on those factors to assess sustainable financing for companies. We identify an opportunity to use Natural Language Processing (NLP) in this development. This opportunity arises from the observation that a critical document for the ESG analysis is a company's annual activity report. This document is first screened manually, then decomposed, and its parts are routed to specialists for analysis. The screening process would therefore benefit greatly from using NLP to automate the classification of text excerpts from the annual report.
The proposed solution is based on different Bidirectional Encoder Representations from Transformers (BERT) architectures, which rely on the attention mechanism to achieve strong results on sentence-level analysis tasks. We devised a text classification task that assigns excerpts from companies' annual activity reports to one of three categories, defined according to the ESG reference standard, the Global Reporting Initiative (GRI).
To establish a benchmark, we implemented a baseline solution using a classic NLP approach, Naïve Bayes, which reached 51% accuracy and a 50.33% F1-score. RoBERTa and BERT-large achieved 88% accuracy and an F1-score of almost 85%, the best results obtained in our experiments with different BERT architectures. ALBERT also proved to be a possible alternative for memory-limited devices, with 85% accuracy and a 78.5924% F1-score.
Finally, we experimented with a multilingual setup, which is of interest for a scenario where BNDES wants a more generic model that can analyze annual reports in either English or Portuguese. The multilingual BERT model reached almost 86% accuracy and an 81.18% F1-score.
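The results above report accuracy and F1-score side by side, and the two can diverge noticeably on a three-class task. The following is a minimal sketch of how both metrics can be computed from gold and predicted labels. It assumes macro-averaged F1 (the abstract does not say which averaging was used), and the function names and the placeholder class labels "A", "B", "C" are hypothetical, not taken from the thesis.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of excerpts whose predicted class equals the gold class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, see lead-in)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels for three placeholder classes (not thesis data).
gold = ["A", "A", "A", "A", "B", "B", "C", "C"]
pred = ["A", "A", "A", "A", "B", "A", "C", "B"]
print(f"accuracy = {accuracy(gold, pred):.4f}")
print(f"macro-F1 = {macro_f1(gold, pred):.4f}")
```

On this toy input, accuracy is 0.75 while macro-F1 is lower (about 0.685), because errors on the two smaller classes weigh as much as the majority class in the average. A gap of this kind is one possible reading of why the reported F1-scores sit below the reported accuracies.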
Document type
Degree thesis
(Master's degree)
Thesis author
Ruberg, Nicolaas
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
Natural Language Processing, Sustainability, Environment, Social, Governance, Financing, BERT
Thesis defense date
3 December 2021
URI