Huang, Xuanqiang (2024). Theory of Mind in Large Language Models. [Laurea], Università di Bologna, Corso di Studio in Informatica [L-DM270].
Abstract
Theory of Mind (ToM) is essential to human interaction and has significant implications across many fields. This thesis investigates the level of ToM in five modern Large Language Models (LLMs) and proposes a framework for measuring the complexity of ToM tasks. We begin with a brief historical overview, tracing ToM's evolution from philosophy and psychology to its influence on computer science. We then introduce a complexity measure that distinguishes between necessary and spurious states, which affect the difficulty of specific ToM problems. Building on this framework, we draw on ideas from World Models to develop Discrete World Models (DWMs): descriptions of states generated by the models themselves. We conclude with an analysis of the effectiveness of the proposed techniques, which improve accuracy by up to 9.99% in the best case, and an investigation of the consistency of the proposed theoretical measure against public datasets.
Finally, we discuss future directions for benchmarking ToM abilities in LLMs and provide a memorization analysis of the datasets used.
Document type: Tesi di laurea (Laurea)
Thesis author: Huang, Xuanqiang
Thesis supervisor:
Thesis co-supervisor:
School:
Degree programme:
Degree programme regulations (Ordinamento CdS): DM270
Keywords: Large Language Models, Theory of Mind, Artificial Intelligence, Prompt Engineering, World Models, Commonsense Reasoning
Thesis defence date: 10 July 2024
URI: