Paladino, Mattia
(2022)
Machine learning "as a service" for high energy physics (MLaaS4HEP): evolution of a framework for ML-based physics analyses.
[Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Physics [LM-DM270]
Abstract
The scientific success of the LHC experiments at CERN depends heavily on the availability of computing resources that efficiently store, process, and analyse the large amount of data collected every year. This is ensured by the Worldwide LHC Computing Grid infrastructure, which connects computing centres distributed all over the world through high-performance networks.
The LHC has an ambitious experimental program for the coming years, which includes large investments and improvements both in the detector hardware and in the software and computing systems, in order to deal with the huge increase in the event rate expected from the High Luminosity LHC (HL-LHC) phase and, consequently, with the huge amount of data that will be produced.
In recent years, Artificial Intelligence has taken on a relevant role in the High Energy Physics (HEP) world. Machine Learning (ML) and Deep Learning algorithms have been successfully used in many areas of HEP, such as online and offline reconstruction programs, detector simulation, object reconstruction, identification, and Monte Carlo generation, and they are expected to be crucial in the HL-LHC phase.
This thesis contributes to a CMS R&D project on an ML "as a Service" solution for HEP needs (MLaaS4HEP). MLaaS4HEP consists of a data service able to perform an entire ML pipeline (reading data, processing data, training ML models, and serving predictions) in a completely model-agnostic fashion, directly using ROOT files of arbitrary size from local or distributed data sources.
The framework has been updated by adding new features to the data preprocessing phase, giving the user more flexibility. Since the MLaaS4HEP framework is experiment-agnostic, the ATLAS Higgs Boson ML challenge was chosen as the physics use case, with the aim of testing MLaaS4HEP and the contributions made in this work.
Document type
Degree thesis
(Master's degree)
Thesis author
Paladino, Mattia
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Curriculum
NUCLEAR AND SUBNUCLEAR PHYSICS
Degree programme regulation
DM270
Keywords
Machine Learning, MLaaS4HEP, HEP, CMS, Neural Network, Computing, Deep Learning, Machine Learning as a Service, Cloud, CERN
Thesis defence date
31 May 2022
URI