• Media type: Report; e-book
  • Title: Scalable Data Integration by Mapping Data to Queries
  • Contributors: Hentschel, Martin [author]; Kossmann, Donald [author]; Florescu, Daniela [author]; Haas, Laura [author]; Kraska, Tim [author]; Miller, Renée J. [author]
  • Published: Swiss Federal Institute of Technology, 2009
  • Published in: Technical report / [ETH, Department of Computer Science, 633
  • Language: English
  • DOI: https://doi.org/20.500.11850/69832; https://doi.org/10.3929/ethz-a-006835897
  • Keywords: QUERIES (INFORMATION SYSTEMS) ; computer science ; Data processing ; SPECIAL PROGRAMMING METHODS ; INFORMATION STORAGE + INFORMATION RETRIEVAL (INFORMATION SYSTEMS)
  • Origin:
  • Notes: This data source also contains holdings records that do not lead to a full text.
  • Description: The goal of a data integration system is to allow users to query diverse information sources through a schema that is familiar to them. However, there may be many different users who may have different preferred schemas, and the data may be stored in data sources which use still other schemas. To integrate data, mapping rules must be defined to map entities of the data sources to entities of the users’ schemas. In large information systems with many data sources which serve sophisticated applications, there can be many such mapping rules and they can be complex. The purpose of this paper is to study the performance of alternative query processing techniques for data integration systems with many complex mapping rules. A new approach, mapping data to queries (MDQ), is presented. Through extensive performance experiments, it is shown that this approach performs well for complex mapping rules and queries, and scales significantly better with the number of rules than the state of the art, which is based on query rewrite. In fact, the performance is close to that of an ideal system in which there is only a single schema used by all sources and queries.
  • Access status: Open access
  • Rights/usage notes: In copyright - Non-commercial use permitted