Live linked data: synchronising semantic stores with commutative replicated data types

Published Online: pp 119-133, https://doi.org/10.1504/IJMSO.2013.056605

Linked Data is currently interconnecting information on the web, creating the web of data. It allows data consumers to combine different data sets and perform powerful queries. However, doing so requires either copying data sets locally or performing distributed queries. Local copies suffer from freshness problems; distributed queries suffer from scalability and performance problems. Linked Data producers are now going live by providing streams of data updates, opening a third way to query: synchronise and search. Each Linked Data node can follow the update streams of others, creating a social network of live updates: Live Linked Data (LLD). Unfortunately, synchronising data among autonomous participants raises issues of concurrency and consistency. In this paper, we propose SU-Set, a Commutative Replicated Data Type (CRDT) for RDF graphs updated with SPARQL Update 1.1. We describe how a semantic store can use SU-Set to ensure eventual consistency in LLD, with a low overhead in time, space and communication.
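The abstract does not spell out how SU-Set works internally, so the following is only a minimal sketch of the general idea of a commutative replicated set of RDF triples, not the paper's algorithm. It assumes an observed-remove design in which every inserted triple carries a unique tag and a delete removes only the tagged copies the deleting replica has seen. It also assumes each operation is delivered to every replica once, with a delete arriving after the inserts it observed. The class `TripleSet`, its methods, and the example triple are all hypothetical names chosen for illustration.

```python
import uuid


class TripleSet:
    """Illustrative observed-remove set of RDF triples (hypothetical, not SU-Set itself)."""

    def __init__(self):
        # Each element is a (triple, tag) pair; the tag makes every insert unique.
        self.pairs = set()

    def insert(self, triple):
        """Apply an insert locally and return the operation to send to other replicas."""
        op = ("insert", frozenset({(triple, uuid.uuid4().hex)}))
        self.apply(op)
        return op

    def delete(self, triple):
        """Delete only the tagged copies of the triple that this replica has observed."""
        observed = frozenset(p for p in self.pairs if p[0] == triple)
        op = ("delete", observed)
        self.apply(op)
        return op

    def apply(self, op):
        """Apply a local or remote operation; concurrent operations commute."""
        kind, pairs = op
        if kind == "insert":
            self.pairs |= pairs
        else:
            self.pairs -= pairs

    def triples(self):
        """Return the visible RDF graph: triples with at least one surviving tag."""
        return {triple for (triple, _tag) in self.pairs}


# Two replicas start from the same state.
a, b = TripleSet(), TripleSet()
t = ("ex:alice", "foaf:knows", "ex:bob")  # hypothetical triple
b.apply(a.insert(t))

# Concurrent updates: replica a deletes t while replica b re-inserts it.
op_delete = a.delete(t)
op_insert = b.insert(t)

# Each replica receives the other's operation, in opposite orders.
a.apply(op_insert)
b.apply(op_delete)

# Both replicas converge; the re-inserted copy survives the concurrent delete.
print(a.triples() == b.triples())
```

Because an insert and a delete touch disjoint tagged pairs when they are concurrent, the order in which a replica applies them does not matter, which is what lets the replicas converge without coordination.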

Keywords

linked data, semantic web, CRDT, RDF, data replication, collaborative networks, eventual consistency, SPARQL 1.1
