This set of pages is developed as a two-hour tutorial for the PARSEME-COST workshop (Dubrovnik, 26-27 September 2016). It will also serve as an English introduction to FRMG.

It should cover:

  • a brief description of FRMG and a few words about the FRMG wiki
  • a brief description of Tree Adjoining Grammars (TAGs): notion of trees, tree operations, pros and cons of large TAGs
  • a presentation of meta-grammars as a solution for designing large-coverage grammatical descriptions
    • modularity and elementary constraints to ease descriptions
    • inheritance hierarchy
    • elementary constraints
      • nodes
      • the class itself (desc)
      • equality
      • precedence
      • dominance (father and ancestor)
      • node and class features
      • equations
      • anonymous nodes
      • macros and other short notations
    • resource producers/consumers
    • guards as complex constraints
    • browsing the classes
  • getting more compact grammars through tree factorization
    • disjunction
    • guards
    • interleaving or free node ordering
    • repetition (Kleene star)
  • browsing the grammar
    • in frmgwiki
    • some statistics about the trees
  • a first view of the resulting parser
    • trying a few sentences
    • playing with disambiguation
    • highlighting edges
    • the preprocessing steps
      • Tokenizing with SxPipe
      • the Lefff lexicon and the lexer
  • disambiguation
    • hand-crafted rules
    • injecting some knowledge
    • tuning with the French Tree Bank
  • the hard life: how to reconcile coverage, accuracy, and efficiency!
    • efficiency
      • a few sources of inefficiency (parser & disambiguation)
      • using TIGs
      • factorization
      • lexicalization
      • left-corner (lctag)
      • restrictions
      • guiding (by self training)
        • tagging
        • hypertagging
        • left-corner restrictions
      • a few stats
    • coverage
      • using test suites and regression testing
      • using error mining
      • using robust partial parsing
      • using correction rules
    • accuracy
      • feature engineering
      • domain adaptation with unsupervised learning (self-training)
      • combining with DyALog-SR, a transition-based statistical parser
  • FRMG and MWEs
    • at the level of SxPipe (named entities and some frozen expressions such as complex subordinating conjunctions, csu)
    • at the level of the parser (+ meta-grammar): predicative nouns and light verbs
    • at the level of the meta-grammar: idiomatic expressions
    • at disambiguation level (terms)
    • conversion issues for output schemas with different notions and lists of MWEs
  • Discussion(s):
    • developing and maintaining a large-coverage meta-grammar
    • starting a meta-grammar for a new language
    • re-using meta-grammar components (hierarchy, classes)
    • exploring new target formalisms
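
As a warm-up for the TAG item of the outline (notion of trees, tree operations), here is a minimal sketch of the two standard TAG operations, substitution and adjunction, on toy trees. It illustrates the textbook definitions only and is not how FRMG or its parser is implemented. The `Node` class, the function names, and the example trees are all hypothetical and chosen for illustration.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """A toy tree node (hypothetical structure, not FRMG's own)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    subst: bool = False  # leaf marked as a substitution site
    foot: bool = False   # foot node of an auxiliary tree


def find(node: Node, pred: Callable[[Node], bool]) -> Optional[Node]:
    """Return the first node satisfying pred, in pre-order."""
    if pred(node):
        return node
    for child in node.children:
        hit = find(child, pred)
        if hit is not None:
            return hit
    return None


def substitute(tree: Node, initial: Node) -> Node:
    """Plug an initial tree into the first matching substitution site."""
    result = deepcopy(tree)
    site = find(result, lambda n: n.subst and n.label == initial.label)
    if site is None:
        raise ValueError(f"no substitution site labelled {initial.label}")
    site.children = deepcopy(initial.children)
    site.subst = False
    return result


def adjoin(tree: Node, aux: Node) -> Node:
    """Insert an auxiliary tree at the first internal node with its root label."""
    result, aux = deepcopy(tree), deepcopy(aux)
    site = find(result, lambda n: n.label == aux.label and bool(n.children))
    foot = find(aux, lambda n: n.foot)
    if site is None or foot is None:
        raise ValueError("no adjunction site or no foot node")
    # The subtree below the site moves under the foot node,
    # and the auxiliary tree takes its place at the site.
    foot.children, foot.foot = site.children, False
    site.children = aux.children
    return result


def terminal_yield(node: Node) -> List[str]:
    """Collect the words on the frontier, skipping unfilled sites."""
    if not node.children:
        return [] if (node.subst or node.foot) else [node.label]
    return [word for child in node.children for word in terminal_yield(child)]


# Toy elementary trees (illustrative only).
sleeps = Node("S", [Node("NP", subst=True),
                    Node("VP", [Node("V", [Node("sleeps")])])])
john = Node("NP", [Node("John")])
often = Node("VP", [Node("Adv", [Node("often")]), Node("VP", foot=True)])

derived = adjoin(substitute(sleeps, john), often)
print(" ".join(terminal_yield(derived)))
```

Both functions copy their inputs rather than modifying them, so the same elementary tree can be reused in several derivations.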