Fratus, Marta (2023) Working with big data from ingestion to prediction: an experimental approach on air pollution ARPA data. [Laurea magistrale (Master's degree thesis)], Università di Bologna, Degree Programme in Mathematics [LM-DM270]. Full-text document not available.
  
 
  
  
        
        
	
  
  
  
  
  
  
  
    
The full text is not available by the author's choice.
      
    
  
    
  
  
    
Abstract

This thesis explores environmental data provided by ARPAE through a structured approach to data processing, analytics, and predictive modeling. The primary objective is to clarify the complexities of environmental quality, covering every stage from data collection and cleansing to in-depth analysis and forecasting.

The first chapter gives a detailed overview of the Extract, Transform, Load (ETL) process, covering its theoretical framework, the issues encountered, and the solutions applied. Talend Open Studio for Data Integration is introduced along with its components, showing their role in transforming raw ARPAE data into a structured, usable format. DBeaver is presented as the database management tool used to organize the data. All the tables created in the database, and the jobs that populate them, are then shown in detail.

Chapter 2 shifts the focus to data analytics with Power BI. The aim is to visualize and analyze the data collected in the first chapter through dashboards that show trends in key environmental parameters. We examine the data carefully, paying close attention to whether it complies with legal standards.

The final chapter turns to predictive analysis, introducing linear regression, ETS, and ARIMA models as tools for forecasting environmental data from historical observations. These models are applied to critical parameters such as PM10, PM2.5 and O3, with the aim of predicting their values for 2023. We then compare the predictions with the partial data available for 2023. By combining technical methodology, analysis, and prediction, this thesis aims to contribute to a detailed understanding of both historical trends and potential future trajectories in environmental quality.
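The forecast-then-compare step described in the final chapter can be illustrated with a minimal sketch. It is not the thesis's implementation, which is not available: the pollutant values below are made up, the function names are hypothetical, and only two of the three model families are shown, a least-squares linear trend and simple exponential smoothing (the most basic ETS form, with additive error and no trend or seasonality). ARIMA is omitted for brevity.

```python
from statistics import mean


def linear_trend_forecast(history, horizon):
    """Fit y = a + b*t by ordinary least squares and extrapolate `horizon` steps."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = mean(history)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


def ses_forecast(history, horizon, alpha=0.3):
    """Simple exponential smoothing: flat forecast at the last smoothed level."""
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def mae(actual, predicted):
    """Mean absolute error between observed and forecast values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up monthly PM10 means in µg/m³ (illustrative only, not ARPAE data).
history = [38.0, 35.0, 30.0, 24.0, 20.0, 18.0, 17.0, 19.0, 23.0, 29.0, 34.0, 37.0]
# Hypothetical partial observations for the first months of the forecast year.
partial_actual = [36.0, 33.0, 29.0]

lin = linear_trend_forecast(history, len(partial_actual))
ses = ses_forecast(history, len(partial_actual))
print(f"linear trend MAE: {mae(partial_actual, lin):.2f}")
print(f"SES MAE:          {mae(partial_actual, ses):.2f}")
```

Comparing forecasts only against the months actually observed mirrors the abstract's use of partial 2023 data. A lower error on that window indicates a better fit there, but says little about the rest of the year, especially for strongly seasonal pollutants.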
     
    
Document type: Degree thesis (Laurea magistrale)
Thesis author: Fratus, Marta
Curriculum: Advanced Mathematics for Applications
Degree programme regulations: DM270
Keywords: pollution, data integration, data analytics, data visualization, predictive models, ETL, ARPAE
Thesis defence date: 22 December 2023
   
  