by Nadal Francesch, Sergi; Romero, Oscar; Abelló, Alberto; Vassiliadis, Panos P.V.; Vansummeren, Stijn
Reference (March 21-24, 2017: Venice, Italy), Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference, CEUR-WS.org, Vol. 1810
Publication Published, 2017-03
Publication in conference proceedings
Abstract: Big Data architectures allow heterogeneous data from multiple sources to be flexibly stored and processed in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving, forcing the data analysts who use them to adapt their analytical processes after each release. This becomes more challenging when the aim is an integrated or historical analysis of multiple sources. To cope with such complexity, in this paper we present the Big Data Integration ontology, the core construct of a data governance protocol that systematically annotates and integrates data from multiple sources in their original format. To cope with syntactic evolution in the sources, we present an algorithm that semi-automatically adapts the ontology upon new releases. A functional evaluation on real-world APIs is performed to validate our approach.