Estrazione terminologica automatica: sistemi a confronto (Automatic term extraction: a comparison of systems)

Ferri, Veronica (2015) Estrazione terminologica automatica: sistemi a confronto. [Master's thesis], Università di Bologna, degree programme in Specialized Translation [LM-DM270] - Forlì. Restricted-access document.
Full-text documents available:
PDF document
Full text accessible only to institutional users of the University


In any terminological study, candidate term extraction is a very time-consuming task. Corpus analysis tools have automated some of the processes involved: they detect relevant data within texts and thereby facilitate the selection of term candidates. Nevertheless, these tools are normally not designed specifically for terminology research, so the automatically extracted units need manual evaluation. Over the last few years, some software products have been developed specifically for automatic term extraction. They are based on corpus analysis but use linguistic and statistical information to filter data more precisely, which reduces the time needed for manual evaluation. Within this framework, we tried to understand whether and how these new tools can really be an advantage. To do so, we simulated a terminology study: we chose a domain (the legal framework for medicinal products for human use) and compiled a corpus from which we extracted terms and phraseologisms using AntConc, a corpus analysis tool. We then compared our list with the lists extracted automatically by three different tools (TermoStat Web, TaaS and Sketch Engine) in order to evaluate their performance. In the first chapter we describe some principles of terminology and phraseology in languages for special purposes and show the advantages offered by corpus linguistics. In the second chapter we illustrate some of the main concepts of the selected domain, as well as some of the main features of legal texts. In the third chapter we describe automatic term extraction and the main criteria used to evaluate it, and we introduce the term-extraction tools used for this project. In the fourth chapter we describe our research method, and in the fifth chapter we present our results and draw some preliminary conclusions on the performance and usefulness of term-extraction tools.
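
The comparison between a manually compiled term list and a tool's output can be illustrated with a small sketch. The abstract does not say which evaluation criteria the thesis applies; precision and recall are assumed here only because they are commonly used for this kind of comparison. The function name `evaluate_extraction` and the term lists are hypothetical and invented for illustration; they are not data from the thesis.

```python
def evaluate_extraction(reference_terms, extracted_terms):
    """Score one tool's candidate list against a manually validated list.

    Assumption: matching is exact after lowercasing and collapsing whitespace.
    A real study may also need lemmatisation or variant handling.
    """
    reference = {" ".join(t.lower().split()) for t in reference_terms}
    extracted = {" ".join(t.lower().split()) for t in extracted_terms}

    matched = reference & extracted
    # Precision: share of the tool's candidates that are in the reference list.
    precision = len(matched) / len(extracted) if extracted else 0.0
    # Recall: share of the reference terms that the tool found.
    recall = len(matched) / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "matched": sorted(matched),
    }


# Invented toy data (hypothetical, not taken from the thesis corpus).
reference = [
    "marketing authorisation",
    "medicinal product",
    "pharmacovigilance",
    "competent authority",
]
tool_output = [
    "Medicinal product",
    "marketing  authorisation",
    "member state",
    "pharmacovigilance",
    "article",
]

scores = evaluate_extraction(reference, tool_output)
print(scores)
```

Running the same function on each tool's list gives figures that can be set side by side, although exact string matching is a simplification and will undercount terms that differ only by inflection or spelling variant.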

Document type
Degree thesis (Master's degree)
Thesis author
Ferri, Veronica
Thesis supervisor
Degree programme
Degree programme regulations
Keywords
corpora, terminology, terms, automatic extraction
Thesis defence date
12 March 2015
