• Media type: E-article
  • Title: Full parsing approximation for information extraction via finite-state cascades
  • Contributors: CIRAVEGNA, FABIO; LAVELLI, ALBERTO
  • Published: Cambridge University Press (CUP), 2002
  • Published in: Natural Language Engineering
  • Language: English
  • DOI: 10.1017/s1351324902002917
  • ISSN: 1351-3249; 1469-8110
  • Keywords: Artificial Intelligence ; Linguistics and Language ; Language and Linguistics ; Software
  • Description: This paper proposes a robust approach to parsing suitable for Information Extraction (IE) from texts using finite-state cascades. The approach is characterized by the construction of an approximation of the full parse tree that captures all the information relevant for IE purposes, leaving the other relations underspecified. Sequences of cascades of finite-state rules deterministically analyze the text, building unambiguous structures. Initially basic chunks are analyzed; then clauses are recognized and nested; finally modifier attachment is performed and the global parse tree is built. The parsing approach allows robust, effective and efficient analysis of real world texts. The grammar organization simplifies changes, insertion of new rules and integration of domain-oriented rules. The approach has been tested for Italian, English, and Russian. A parser based on such an approach has been implemented as part of Pinocchio, an environment for developing and running IE applications.
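
The general idea of a finite-state cascade described in the abstract can be illustrated with a minimal sketch. It is not the grammar or code of the paper or of Pinocchio: the tag set, the rule patterns, and the names `label`, `apply_level`, `CASCADE`, and `parse` are all assumptions made for illustration. Each level scans the output of the previous level left to right, the first matching rule wins (so the analysis is deterministic and unambiguous), and unmatched items pass through unchanged. The sketch covers chunking and clause recognition only; modifier attachment is left out.

```python
import re

# A node is either a leaf (word, tag) or a subtree (label, [children]).
def label(node):
    # Leaf: the second element is the tag string. Subtree: the first element is the label.
    return node[1] if isinstance(node[1], str) else node[0]

def apply_level(nodes, rules):
    """Apply one level of rules deterministically, left to right."""
    out, i = [], 0
    while i < len(nodes):
        # Encode the remaining labels as "DET NOUN VERB " (each label followed by a space).
        remaining = "".join(label(n) + " " for n in nodes[i:])
        for new_label, pattern in rules:
            m = re.match(pattern, remaining)
            if m and m.end() > 0:
                width = m.group(0).count(" ")  # number of nodes consumed
                out.append((new_label, nodes[i:i + width]))
                i += width
                break
        else:
            # No rule applies here: copy the node unchanged and move on.
            out.append(nodes[i])
            i += 1
    return out

# Hypothetical three-level cascade: basic chunks, prepositional phrases, clauses.
CASCADE = [
    [("NP", r"(DET )?(ADJ )*(NOUN )+"), ("VP", r"(AUX )*VERB ")],
    [("PP", r"PREP NP ")],
    [("S", r"NP VP (NP )?(PP )*")],
]

def parse(tagged):
    nodes = list(tagged)
    for rules in CASCADE:
        nodes = apply_level(nodes, rules)
    return nodes

sentence = [("the", "DET"), ("parser", "NOUN"), ("builds", "VERB"),
            ("a", "DET"), ("tree", "NOUN"), ("for", "PREP"),
            ("the", "DET"), ("text", "NOUN")]
tree = parse(sentence)
```

For the example sentence, the first level groups words into NP and VP chunks, the second wraps the preposition and the following NP into a PP, and the third combines the resulting sequence into a single clause node. Because each level only sees the labels produced by the level below, rules can be added or changed at one level without touching the others, which is in the spirit of the grammar organization the abstract mentions.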