Implementing large language model-based machine translation in small and medium-sized enterprises

Esteban Muñoz, Álvaro (2024) Implementing large language model-based machine translation in small and medium-sized enterprises. [Master's degree thesis], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270]

Full-text documents available:

PDF document (Thesis)
Available under license: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.
Download (856kB)

Abstract

With the release of Large Language Models (LLMs), namely the GPT models, many companies have integrated AI-based technologies to automate natural language tasks like summarization, question-answering, and translation. However, Small and Medium-sized Enterprises (SMEs) face a significant challenge in leveraging these advancements due to limited resources. Unlike large corporations (e.g., Google, Meta, or Amazon), SMEs often lack not only the computational power and financial capacity to train LLMs from scratch but also the vast amounts of data that they require for training, forcing them to rely on external models or services. This work addresses the problem of implementing a machine translation (MT) system tailored for an SME, Medhiartis s.r.l., with limited resources. Our approach involved fine-tuning pre-existing LLMs using the company's proprietary data to create customized translation models. We systematically evaluated these models’ performance and developed an API to integrate them into a functional MT pipeline. The API was deployed in two applications: a plugin for a Computer-Assisted Translation (CAT) tool and a web-based translation interface, both designed to streamline translation tasks for the company. This study demonstrates how SMEs can effectively adapt LLMs to their specific needs, providing a practical solution for high-quality machine translation in resource-constrained settings.

Document type

Degree thesis (Master's degree)

Thesis author

Esteban Muñoz, Álvaro

Thesis supervisor

Torroni, Paolo

Thesis co-supervisor

Pappacoda, Gianmarco

School

Engineering and Architecture

Degree programme