Contarino, Antonio Giovanni
(2021)
Neural machine translation adaptation and automatic terminology evaluation: a case study on Italian and South Tyrolean German legal texts.
[Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in
Specialized translation [LM-DM270] - Forlì
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.
Abstract
Following the implementation of South Tyrol’s Statute of Autonomy, the public administrations of the Autonomous Province of Bozen/Bolzano are legally bound to publish laws and administrative acts bilingually. This results in a strong demand for translation of legal-administrative texts, usually from Italian into German, which could be met, to some extent, by integrating machine translation (MT) into the institutional translation workflow. A crucial aspect in this setting is the local South Tyrolean legal-administrative terminology, which is of central importance in institutional translation, exhibits features that distinguish it from the terminology of other German-speaking countries, and has emerged as the main issue when machine-translating Italian legal-administrative texts into South Tyrolean German.
The purpose of the present study is to adapt an MT system (ModernMT) by means of a parallel corpus of legal-administrative texts and to evaluate it in terms of both overall MT performance and legal terminology, by automatically matching the legal terms produced by the MT engine and categorising them within a fine-grained taxonomy.
Results showed that the domain-adapted engine achieved a substantial and promising improvement in MT performance (+9 BLEU), reaching a relatively good score of 35 BLEU. As for legal term translation, the proposed automatic evaluation approach provided insights into terminology improvements on both a quantitative and a qualitative level. The domain-adapted engine correctly translated 2746 out of 3503 legal terms of the test set (term accuracy: 78.39%), significantly outperforming the generic ModernMT system by 3.31%. The most substantial improvements were observed with regard to standardised/recommended legal terms. Despite the significant improvements in term translation accuracy, however, the adopted domain-adaptation approach did not achieve a systematic enhancement in legal terminology translation.
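The term accuracy figure reported above can be checked with a short calculation. The following Python sketch is illustrative only and is not the evaluation pipeline used in the thesis: the function names `term_accuracy` and `count_matched_terms` are hypothetical, the case-insensitive substring match is a deliberately simplified stand-in for the automatic matching and taxonomy-based categorisation described in the abstract, and the two German sentences are invented toy data.

```python
from typing import List, Tuple


def term_accuracy(correct: int, total: int) -> float:
    """Return term accuracy as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total, 2)


def count_matched_terms(pairs: List[Tuple[str, str]]) -> int:
    """Count MT outputs that contain the expected target term.

    Assumption: a case-insensitive substring match, which is far cruder
    than a taxonomy-based categorisation of term translations.
    """
    return sum(1 for output, term in pairs if term.lower() in output.lower())


# Invented toy data: (MT output, expected German term).
toy_pairs = [
    ("Die Landesregierung genehmigt den Beschluss.", "Landesregierung"),
    ("Der Antrag wird abgelehnt.", "Gesuch"),
]

toy_matched = count_matched_terms(toy_pairs)
print(term_accuracy(toy_matched, len(toy_pairs)))

# Counts taken from the abstract: 2746 correct terms out of 3503.
print(term_accuracy(2746, 3503))
```

With the counts from the abstract, the last call reproduces the reported 78.39%. A substring match would treat inflected or compounded variants inconsistently, which is one reason a finer-grained categorisation is useful in practice.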
Document type
Degree thesis
(Master's degree, Laurea magistrale)
Thesis author
Contarino, Antonio Giovanni
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
neural machine translation, terminology evaluation, legal terminology, traduzione automatica, terminologia giuridica
Thesis defence date
16 December 2021
URI