Przyjaciel-Zablocki, Martin
[Author];
Schätzle, Alexander
[Author];
Skaley, Eduard
[Author];
Hornung, Thomas
[Author];
Lausen, Georg
[Author]
Map-side merge joins for scalable SPARQL BGP processing
Media type:
E-article
Title:
Map-side merge joins for scalable SPARQL BGP processing
Contributors:
Przyjaciel-Zablocki, Martin
[Author];
Schätzle, Alexander
[Author];
Skaley, Eduard
[Author];
Hornung, Thomas
[Author];
Lausen, Georg
[Author]
Published:
University of Freiburg: FreiDok, 2013
Published in: IEEE 5th International Conference on Cloud Computing Technology and Science (CloudCom), 2013 : 2 - 5 Dec. 2013, Bristol, United Kingdom ; [including workshops] / sponsors: IEEE Computer Society. Piscataway, NJ: IEEE, 2013, pages 631-749. DOI 10.1109/CloudCom.2013.9, ISBN: 978-0-7695-5095-4
Notes:
This data source also contains holdings records that do not lead to a full text.
Description:
In recent times, it has been widely recognized that, due to their inherent scalability, frameworks based on MapReduce are indispensable for so-called "Big Data" applications. However, for Semantic Web applications using SPARQL, there is still a demand for sophisticated MapReduce join techniques for processing basic graph patterns, which are at the core of SPARQL. Renowned for their stable and efficient performance, sort-merge joins have become widely used in DBMSs. In this paper, we demonstrate the adaptation of merge joins for SPARQL BGP processing with MapReduce. Our technique supports both n-way joins and sequences of join operations by applying merge joins within the map phase of MapReduce while the reduce phase is only used to fulfill the preconditions of a subsequent join iteration. Our experiments with the LUBM benchmark show an average performance benefit between 15% and 48% compared to other MapReduce based approaches while at the same time scaling linearly with the RDF dataset size.