• Media type: E-article
  • Title: Detecting Inclusion Dependencies on Very Many Tables
  • Contributors: Tschirschnitz, Fabian; Papenbrock, Thorsten; Naumann, Felix
  • Published: Association for Computing Machinery (ACM), 2017
  • Published in: ACM Transactions on Database Systems
  • Language: English
  • DOI: 10.1145/3105959
  • ISSN: 0362-5915; 1557-4644
  • Keywords: Information Systems
  • Origin:
  • Notes:
  • Description: Detecting inclusion dependencies, the prerequisite of foreign keys, in relational data is a challenging task. Detecting them among the hundreds of thousands or even millions of tables on the Web is daunting. Still, such inclusion dependencies can help connect disparate pieces of information on the Web and reveal unknown relationships among tables. With the algorithm MANY, we present a novel inclusion dependency detection algorithm, specialized for the very many (but typically small) tables found on the Web. We make use of Bloom filters and indexed bit-vectors to show the feasibility of our approach. Our evaluation on two corpora of Web tables shows a superior runtime over known approaches and its usefulness to reveal hidden structures on the Web.
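
The description names Bloom filters and bit-vectors but does not spell out how they help. The following is a minimal sketch of the general idea only, not the MANY algorithm itself: each column gets a Bloom-filter signature stored as a bit-vector, and a candidate inclusion dependency A ⊆ B is discarded whenever A's signature has a bit set that B's lacks. Surviving candidates are then checked exactly. The names `bloom_signature` and `find_unary_inds`, the filter size, the number of hash functions, and the sample columns are all assumptions made for illustration.

```python
import hashlib
from itertools import permutations

NUM_BITS = 256    # assumed filter size, chosen for illustration
NUM_HASHES = 3    # assumed number of hash functions


def bloom_signature(values):
    """Return a Bloom-filter signature of the values as an integer bit-vector."""
    signature = 0
    for value in values:
        for seed in range(NUM_HASHES):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % NUM_BITS
            signature |= 1 << position
    return signature


def find_unary_inds(columns):
    """Return all pairs (dep, ref) where every value of dep also occurs in ref."""
    distinct = {name: set(values) for name, values in columns.items()}
    signatures = {name: bloom_signature(vals) for name, vals in distinct.items()}
    inds = []
    for dep, ref in permutations(columns, 2):
        # Cheap pre-filter: a bit set for dep but not for ref proves that
        # dep holds a value missing from ref, so the candidate is dropped.
        if signatures[dep] & ~signatures[ref]:
            continue
        # Bloom filters can yield false positives, so validate exactly.
        if distinct[dep] <= distinct[ref]:
            inds.append((dep, ref))
    return inds


# Hypothetical sample columns from three small tables.
columns = {
    "orders.customer_id": ["c1", "c2", "c2"],
    "customers.id": ["c1", "c2", "c3"],
    "products.sku": ["p1", "p2"],
}
print(find_unary_inds(columns))
```

The pre-filter is safe because a subset of values can only set a subset of bits when both signatures use the same size and hash functions. It can let false candidates through, which is why the exact set comparison remains as the final step.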