Mantineo, Giuseppe (2024) A Safety Framework For Large Language Models. [Laurea magistrale], Università di Bologna, Corso di Studio in Artificial intelligence [LM-DM270]. Full text not available: the full text is unavailable at the author's request. (Contact the author)
Abstract
Artificial Intelligence is revolutionizing the world, with Large Language Models (LLMs) leading the way by generating human-like text and addressing complex visual tasks. However, these advancements come with critical security risks. This thesis provides a comprehensive analysis of LLM vulnerabilities, focusing on threats like jailbreaks, prompt injections, and data poisoning. Beyond technical issues, it addresses ethical concerns, such as LLMs’ potential to reinforce biases and spread misinformation. To counter these risks, the thesis proposes a robust safety framework incorporating model filtering, behavioral alignment, and adversarial training. The goal is to ensure the responsible and secure deployment of LLMs, balancing innovation with safety.
Document type: Tesi di laurea (Laurea magistrale)
Thesis author: Mantineo, Giuseppe
Thesis advisor:
Thesis co-advisor:
School:
Degree programme:
Degree programme regulations (Ordinamento CdS): DM270
Keywords: Large Language Models, AI Safety, AI alignment, Vision Language Model, Transformers, AI Security, AI Ethics
Thesis defense date: 8 October 2024
URI: