Brajevic, Enis
(2025)
Towards the Watermarking of Large Language Models: A Survey.
[Laurea thesis], Università di Bologna, Degree Programme in
Informatica [L-DM270]
Full-text documents available:
PDF document (Thesis), 821 kB
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed strictly for personal purposes of study, research, and teaching; any directly or indirectly commercial use is expressly prohibited. All other rights to the material are reserved.
Abstract
Intellectual property theft through the unlawful sharing of texts is a pressing issue, further intensified by the rapid development of Large Language Models (LLMs). These models, designed for Natural Language Processing tasks, have reached a level of performance and popularity that makes them very vulnerable to misuse, such as the generation of fake text or false authorship attribution of the generated content. Text watermarking is the best tool to safeguard LLMs from such misuse. Because Large Language Models and text watermarking exhibit such complementarity, in this comprehensive survey we describe the state of the art of these two fields and how they can be combined. Our contribution can be summarized in the following four main aspects: (1) an overview of the fundamental characteristics of LLMs and their state of the art; (2) an overview and analysis of the main types of text watermarking methods that target existing text, and how they compare to each other; (3) an overview and analysis of the main LLM watermarking methods; (4) identification of underexplored areas through the formulation of targeted research questions, which are then addressed after the comprehensive study on LLMs and text watermarking.
Document type
Degree thesis
(Laurea)
Thesis author
Brajevic, Enis
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Large Language Models, Text watermarking, Intellectual property, Copyright Protection
Thesis defense date
15 July 2025
URI