Crastolla, Claudia
(2026)
New Machine Learning Algorithms to Study the Large-Scale Structure of the Universe.
[Master's degree thesis (Laurea magistrale)], Università di Bologna, Degree Programme in
Astrophysics and cosmology [LM-DM270]
Abstract
The primary objective of this thesis is to propose and evaluate artificial intelligence (AI)-enhanced methods for analyzing cosmological data. Specifically, we investigate the integration of large language models (LLMs) into the cosmological workflow to assess their capacity for automating complex research pipelines. By employing a multi-agent framework, we task general-purpose LLMs with executing an end-to-end clustering analysis, including code generation, debugging, and result interpretation, for the computation of a two-point correlation function (2PCF) monopole and subsequent parameter inference from baryon acoustic oscillations (BAO). The first half of this work establishes the theoretical foundations of modern cosmology, the large-scale structure (LSS) of the Universe, and the fundamentals of agentic AI. The latter half details the technical implementation of the multi-agent pipeline and provides a comparative analysis between the autonomous agents and established cosmological benchmarks. The results demonstrate that while current LLMs can rapidly develop functional software, challenges regarding computational efficiency and physical rigor persist. Although 2PCF measurements and Markov Chain Monte Carlo (MCMC) constraints were successfully executed, the agents encountered performance bottlenecks and produced results that were not always perfectly centered relative to ground-truth benchmarks. Ultimately, this work suggests that while LLMs serve as powerful computational engines, they require a human expert to provide the necessary physical intuition and theoretical validation to ensure scientific accuracy in cosmological research.
Document type
Degree thesis
(Master's degree)
Thesis author
Crastolla, Claudia
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
large-scale structure, parameter inference, machine learning, large language models, AI agents
Thesis defence date
27 March 2026
URI