Abstract
IoT and social networks are now the main sources of big data. They generate a massive amount of assets, and companies have to develop data-driven strategies to exploit the value of the information behind the data. Data sources are typically heterogeneous, since data can be generated by different sources distributed all around the world. The sparsity and heterogeneity of the data make data wrangling and knowledge discovery much more difficult, which is why data-driven companies must use data integration techniques to address this complexity.
The DTIM research group at Universitat Politècnica de Catalunya (UPC), with which I have been working, is interested in this topic, and in 2015 it developed Graph-driven Federated Data Management (GFDM), which proposes a graph-based data integration architecture in a very intuitive way. The aim of this project is to extend GFDM to support automatic data aggregation following OLAP data processing grounded in multidimensional modeling, as data warehouses do, but on top of graph data. This idea is carried out by developing a framework able to perform OLAP-like queries over GFDM, focusing mainly on the well-known Roll-Up operation. In this thesis we have developed a method that, given a query, is able to align data that come from different data sources and sit at different granularity levels but participate in the same conceptual aggregation hierarchy. Our method is able to identify implicit aggregations that make it possible to align data from different data sources and integrate them seamlessly at the correct granularity level. After a careful design and implementation phase, we can consider our goal accomplished: we successfully developed the Implicit Roll-Up algorithm, which satisfies our requirements.
Document type
Degree thesis
(Master's degree)
Thesis author
Pistocchi, Filippo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
OLAP, Roll-Up, Graph, Data Integration
Thesis defense date
18 March 2022
URI