Open data integration model using a polystore system for large scale scientific data archives in astronomy
Abstract
Polystore systems have recently been proposed as a new data integration model that provides integrated access to heterogeneous data stores through a single unified query language. There is growing interest in the database community in managing large-scale unstructured data from multiple heterogeneous data stores. The problem has drawn particular attention because of the growth in data volume, the rate at which new data is produced, and the emergence of varied data types across scientific data archives. Astronomy, as a scientific domain, produces a huge amount of data that is stored in the data archives provided by NASA and its subsidiaries. The data consists mostly of images, unstructured text, and structured data (relations, key-value pairs). This paper articulates the problems of integrating multiple data stores to manage heterogeneous data and presents the polystore architecture as a solution. It also defines a method for managing a local data store and communicating with a remote cloud data store through a web-based query system.