Working with big data from ingestion to prediction: an experimental approach on air pollution ARPA data

Fratus, Marta (2023) Working with big data from ingestion to prediction: an experimental approach on air pollution ARPA data. [Master's degree thesis], Università di Bologna, Degree Programme in Mathematics [LM-DM270], full-text document not available
The full text is not available by the author's choice.

Abstract

This thesis explores environmental data provided by ARPAE through a structured approach to data processing, analytics, and predictive modeling. Its primary objective is to clarify the complexities of environmental quality, from data collection and cleansing to in-depth analysis and forecasting. The first chapter gives a detailed overview of Extract, Transform, Load (ETL) processes, covering their theoretical framework, the issues encountered, and the solutions applied. Talend Open Studio for Data Integration is introduced along with its components, showing their role in transforming raw ARPAE data into a structured, usable format. DBeaver is presented as a database management tool that facilitates data organization. The database tables and the jobs created are then described in detail. Chapter 2 turns to data analytics with Power BI, with the aim of visualizing and analyzing the data collected in Chapter 1. Dashboards are built to show trends in key environmental parameters, and the data are examined closely to check whether they comply with legal standards. The final chapter moves to predictive analysis, introducing linear regression, ETS, and ARIMA models as tools for forecasting future environmental data from historical information. These models are applied to critical parameters such as PM10, PM2.5 and O3 to predict their values for 2023, and the predictions are then compared with the partial data available for 2023. By combining technical methodology, analysis, and prediction, this thesis aims to contribute to a detailed understanding of both historical trends and potential future trajectories in environmental quality.

Document type
Degree thesis (Master's degree)
Thesis author
Fratus, Marta
Thesis supervisor
School
Degree programme
Curriculum
CURRICULUM ADVANCED MATHEMATICS FOR APPLICATIONS
Degree programme regulations
DM270
Keywords
pollution, data integration, data analytics, data visualization, predictive models, ETL, ARPAE
Thesis defence date
22 December 2023
URI
