Mantineo, Giuseppe (2024) A Safety Framework For Large Language Models. [Laurea magistrale], Università di Bologna, Corso di Studio in Artificial intelligence [LM-DM270]. Full text not available: the full text is unavailable at the author's request. (Contact the author)
Abstract
Artificial Intelligence is revolutionizing the world, with Large Language Models (LLMs) leading the way by generating human-like text and addressing complex visual tasks. However, these advancements come with critical security risks. This thesis provides a comprehensive analysis of LLM vulnerabilities, focusing on threats like jailbreaks, prompt injections, and data poisoning. Beyond technical issues, it addresses ethical concerns, such as LLMs’ potential to reinforce biases and spread misinformation. To counter these risks, the thesis proposes a robust safety framework incorporating model filtering, behavioral alignment, and adversarial training. The goal is to ensure the responsible and secure deployment of LLMs, balancing innovation with safety.
Document type: Tesi di laurea (Laurea magistrale)
Thesis author: Mantineo, Giuseppe
Thesis advisor:
Thesis co-advisor:
School:
Degree programme:
Degree programme regulations (Ordinamento CdS): DM270
Keywords: Large Language Models, AI Safety, AI alignment, Vision Language Model, Transformers, AI Security, AI Ethics
Thesis defense date: 8 October 2024
URI: