Implementing large language model-based machine translation in small and medium-sized enterprises

Esteban Muñoz, Álvaro (2024) Implementing large language model-based machine translation in small and medium-sized enterprises. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Program in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.

Download (856kB)

Abstract

With the release of Large Language Models (LLMs), notably the GPT models, many companies have integrated AI-based technologies to automate natural language tasks such as summarization, question answering, and translation. However, Small and Medium-sized Enterprises (SMEs) face a significant challenge in leveraging these advancements due to limited resources. Unlike large corporations (e.g., Google, Meta, or Amazon), SMEs often lack not only the computational power and financial capacity to train LLMs from scratch but also the vast amounts of data that such training requires, forcing them to rely on external models or services. This work addresses the problem of implementing a machine translation (MT) system tailored for an SME with limited resources, Medhiartis s.r.l. Our approach involved fine-tuning pre-existing LLMs on the company's proprietary data to create customized translation models. We systematically evaluated these models’ performance and developed an API to integrate them into a functional MT pipeline. The API was deployed in two applications: a plugin for a Computer-Assisted Translation (CAT) tool and a web-based translation interface, both designed to streamline translation tasks for the company. This study demonstrates how SMEs can effectively adapt LLMs to their specific needs, providing a practical solution for high-quality machine translation in resource-constrained settings.

Document type
Master's thesis (Laurea magistrale)
Thesis author
Esteban Muñoz, Álvaro
Thesis supervisor
Thesis co-supervisor
School
Degree program
Degree program regulations
DM270
Keywords
Natural Language Processing, Machine Translation, Large Language Models, Small and Medium-sized Enterprises, Low-Resource Languages
Thesis defense date
8 October 2024
URI
