Sosto, Martina
(2023)
QueerBench: Quantifying Discrimination in Language Models towards Queer Identities.
[Master's degree thesis], Università di Bologna, Degree Programme in
Computer Science [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Except where the author has granted broader permissions, the thesis may be freely consulted, and a copy may be saved and printed strictly for personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly forbidden. All other rights to the material are reserved.
Download (2MB)
Abstract
This thesis explores the evolving landscape of Natural Language Processing (NLP) and its intersection with societal biases, focusing on the LGBTQIA+ community. With the rise of computers in language comprehension, interpretation, and generation, NLP has become integral to various applications, posing challenges related to bias and stereotype perpetuation. As technology advances, there is a parallel emphasis on fostering inclusivity, particularly in digital spaces critical for the safety of LGBTQIA+ individuals.
Acknowledging the transformative influence of language on identity, this research underscores NLP's role in countering hate speech and bias online. Despite existing studies on sexism and misogyny, issues such as homophobia and transphobia remain underexplored, and the work that does exist often adopts binary perspectives. This binary framing not only marginalizes gender-diverse individuals but also perpetuates harmful behaviours.
The primary focus of this study is to assess the potential harm caused by sentence completions generated by large language models (LLMs) concerning LGBTQIA+ individuals. Employing a template-based approach, the investigation centres on the Masked Language Modelling (MLM) task and categorizes subjects into queer and non-queer terms, as well as neo-pronouns, neutral pronouns, and binary pronouns. The analysis reveals similarities in the assessment of pronouns by LLMs, with harmfulness rates around 6.1% for binary pronouns and approximately 5.4% and 4.9% for neo- and neutral pronouns. Sentences with queer terms (words that refer to a queer identity) as subjects peak at 16.4% harmfulness, surpassing non-queer subjects by 7.4%. This research contributes valuable insights into mitigating harm in language model outputs and promoting equitable language processing for the queer community.
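The template-based setup described in the abstract can be sketched as follows. Note that the template strings, subject lists, and the toy lexicon-based harmfulness scorer below are illustrative assumptions for exposition only, not the thesis's actual QueerBench templates, term lists, or evaluation method.

```python
# Illustrative sketch of a template-based MLM probe (hypothetical data,
# not the actual QueerBench templates or lexicon).

MASK = "[MASK]"

# Hypothetical subject groups, mirroring the categories in the abstract.
SUBJECTS = {
    "binary_pronoun": ["he", "she"],
    "neutral_pronoun": ["they"],
    "neo_pronoun": ["xe", "ze"],
    "queer_term": ["the queer person", "the trans person"],
    "non_queer_term": ["the person", "the student"],
}

# Hypothetical sentence templates: the subject slot is filled with a term,
# while the MASK slot is left for the language model to complete.
TEMPLATES = [
    "{subject} is very {mask}.",
    "{subject} works as a {mask}.",
]

def build_probes(templates, subjects):
    """Expand every (group, subject, template) combination into a probe sentence."""
    probes = []
    for group, terms in subjects.items():
        for term in terms:
            for tpl in templates:
                sentence = tpl.format(subject=term.capitalize(), mask=MASK)
                probes.append((group, sentence))
    return probes

def harmfulness_rate(completions, harmful_lexicon):
    """Fraction of model completions flagged by a (toy) harmful-word lexicon."""
    if not completions:
        return 0.0
    flagged = sum(1 for word in completions if word.lower() in harmful_lexicon)
    return flagged / len(completions)

# Each group contributes len(terms) * len(TEMPLATES) probe sentences; the
# probes would then be fed to an MLM and the completions scored per group.
probes = build_probes(TEMPLATES, SUBJECTS)
```

In the full pipeline, each probe sentence would be passed to a masked language model (e.g. via a fill-mask interface), and the top-ranked completions for each subject group would be scored for harmfulness, allowing per-group rates like those reported in the abstract to be compared.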
Document type
Degree thesis
(Master's degree)
Thesis author
Sosto, Martina
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Specialisation
CURRICULUM A: TECNICHE DEL SOFTWARE
Degree programme regulations
DM270
Keywords
Language Models, Queer, LGBT, LGBTQIA+, NLP, LM, Natural Language Processing, Hate Speech, Harmfulness
Thesis defence date
14 December 2023
URI