Generation of proprietary code: from the data extraction to the model finetuning and integration in multi agent system

Periti, Alex (2024) Generation of proprietary code: from the data extraction to the model finetuning and integration in multi agent system. [Master's degree thesis], Università di Bologna, Degree programme in Artificial Intelligence [LM-DM270], restricted-access document.
Available full-text documents:
PDF document (Thesis)
Full text accessible only to institutional users of the University
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research and teaching; any directly or indirectly commercial use is expressly forbidden. All other rights to the material are reserved.

Abstract

With the proliferation of natural language processing (NLP) models, the potential for automating code generation has attracted significant attention. However, tailoring these models to domain-specific requirements, especially in the context of proprietary software development, remains a complex and relatively unexplored area. This thesis begins with the motivations behind the project, highlighting the challenges involved and the potential benefits of an efficient code generation process. The core of the thesis focuses on the methodology used to fine-tune a state-of-the-art large language model, such as GPT-3.5, for the task at hand. The process involves the creation of a specialized dataset of Java code. The methodology also covers the problems and limitations encountered during development and the evaluation metrics used to measure the performance of the fine-tuned model. To validate the effectiveness of the fine-tuned model, a series of experiments is conducted, comparing its performance against baseline models and traditional code generation techniques. Tailored evaluation metrics are used to assess the model's ability to generate high-quality proprietary Java code. The findings contribute insights at the intersection of NLP and software development. The thesis concludes with a discussion of the practical implications of the research, potential applications in real-world scenarios, and future research on refining and extending the capabilities of LLMs for proprietary code generation.

Document type
Degree thesis (Master's degree)
Thesis author
Periti, Alex
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Large Language Model, Code Generation, GPT 3.5 turbo finetuning, Multi-Agent System, Llama2
Thesis defence date
19 March 2024
URI
