• Media type: E-article
  • Title: Bayesian Tree Substitution Grammars as a Usage-based Approach
  • Contributors: Post, Matt; Gildea, Daniel
  • Published: SAGE Publications, 2013
  • Published in: Language and Speech
  • Language: English
  • DOI: 10.1177/0023830913484901
  • ISSN: 0023-8309; 1756-6053
  • Keywords: Speech and Hearing ; Linguistics and Language ; Sociology and Political Science ; Language and Linguistics ; General Medicine
  • Origin:
  • Notes:
  • Description: Tree substitution grammar (TSG) is a generalization of context-free grammar (CFG) that permits non-terminals to rewrite as fragments of arbitrary size, instead of just depth-one productions. We discuss connections between the TSG framework and the larger family of usage-based approaches to language, showing how TSG allows us to make some of the claims of these approaches sufficiently concrete for computational modeling. A fundamental difficulty in defining a TSG is determining the set of fragments for the grammar, because the set of possible fragments is exponential in the size of the parse trees from which TSGs are typically learned. We describe a model-based approach that learns a TSG using Gibbs sampling with a non-parametric prior to control fragment size, yielding grammars that contain mostly small fragments but that include larger ones as the data permits. We evaluate these grammars on two tasks (parsing accuracy and grammaticality classification), and find that these Bayesian TSGs achieve excellent performance on both relative to a set of heuristically extracted TSGs spanning the spectrum of representations, from a standard depth-one context-free Treebank grammar to explicit approximations of the Data-Oriented Parsing model.