Bardi, Alessandra
(2024)
Terminology and Neural Machine Translation: Integrating Textile Terminology into WIPO Translate.
[Master's degree thesis], Università di Bologna, Degree Programme in
Specialized translation [LM-DM270] - Forlì, Full-text document not available
The full text is not available by the author's choice.
Abstract
Although neural machine translation (NMT) systems have shown impressive results in translating general language compared to previous MT paradigms, the overall fluency of their output often hides serious terminological errors. This hinders the systematic integration of NMT into translation workflows centred on specialized domains: it is often more efficient to translate a terminologically dense text from scratch than to spend significant post-editing effort replacing the wrong terminology in raw MT output. At the same time, many companies and organizations have structured terminology resources at their disposal, which is why research has focused on ways to integrate them into NMT systems in order to improve the translation of terminology. The present work explores how terminology and MT are combined at the World Intellectual Property Organization (WIPO), a specialized agency of the United Nations which includes a Division entrusted with the translation of patent documents. To test the WIPO NMT system on the translation of specialized terminology, a bilingual list of textile terms was obtained by adding Portuguese equivalents to 50 English records already featured in the WIPO termbase. The validated list was added to the training data of the WIPO EN>PT translation model, and the outputs of the model before and after retraining on these data were compared. After illustrating the results of the evaluation, the work suggests possible experiments, drawn from the reviewed literature, for a better integration of terminology into the WIPO NMT system.
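The before/after comparison described in the abstract can be illustrated with a simple term-accuracy check. The sketch below is not the evaluation procedure used in the thesis, which this record does not describe; it only assumes that one counts how often the expected Portuguese term appears in the output whenever the English term occurs in the source. The function name `term_accuracy`, the two glossary entries, and the sample sentences are all invented for illustration and are not taken from the WIPO termbase or from the thesis results.

```python
from typing import Dict, List


def term_accuracy(sources: List[str], outputs: List[str],
                  glossary: Dict[str, str]) -> float:
    """Share of glossary-term occurrences in the sources whose expected
    target term also appears in the corresponding output."""
    hits = 0
    total = 0
    for src, out in zip(sources, outputs):
        src_l, out_l = src.lower(), out.lower()
        for en_term, pt_term in glossary.items():
            # Plain substring matching: an assumption that ignores inflection.
            if en_term.lower() in src_l:
                total += 1
                if pt_term.lower() in out_l:
                    hits += 1
    return hits / total if total else 0.0


# Invented toy data, for illustration only (not from the thesis).
glossary = {"warp yarn": "fio de urdume", "weft": "trama"}
sources = ["The warp yarn is tensioned.", "The weft is inserted."]
baseline = ["O fio de dobra é tensionado.", "A trama é inserida."]
retrained = ["O fio de urdume é tensionado.", "A trama é inserida."]

print(term_accuracy(sources, baseline, glossary))
print(term_accuracy(sources, retrained, glossary))
```

Substring matching of this kind misses inflected or reordered term variants, so a real evaluation would likely need lemmatization or manual review alongside any automatic count.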
Document type
Degree thesis
(Master's degree)
Thesis author
Bardi, Alessandra
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Curriculum
CURRICULUM SPECIALIZED TRANSLATION
Degree programme regulations (Ordinamento Cds)
DM270
Keywords
terminology, machine translation, nmt, patents, wipo
Thesis defence date
18 March 2024
URI