This set of pages was developed as a two-hour tutorial for the PARSEME-COST workshop (Dubrovnik, 26-27 September 2016).

It should cover:

  • a brief description of FRMG and a few words about the FRMG wiki
  • a brief description of Tree Adjoining Grammars (TAGs): the notion of trees, tree operations, pros and cons of large TAGs
  • a presentation of meta-grammars as a solution for designing large-coverage grammatical descriptions
  • getting more compact grammars through tree factorization
    • disjunction
    • guards
    • interleaving or free node ordering
    • repetition (Kleene star)
  • browsing the grammar
    • in frmgwiki
    • some statistics about the trees
  • a first view of the resulting parser
    • trying a few sentences
    • playing with disambiguation
    • the preprocessing steps
      • tokenizing with SxPipe
      • the Lefff lexicon and the lexer
  • disambiguation
    • hand-crafted rules
    • injecting some knowledge
    • tuning with the French Treebank
  • the hard life: how to reconcile coverage, accuracy, and efficiency!
    • efficiency
      • factorization
      • lexicalization
      • left-corner (lctag)
      • guiding (by self-training)
        • tagging
        • hypertagging
        • left-corner restrictions
    • coverage
      • using test suites and regression testing
      • using error mining
      • using robust partial parsing
      • using correction rules
    • accuracy
      • feature engineering
      • domain adaptation with unsupervised learning (self-training)
      • combining with DyALog-SR, a transition-based statistical parser
  • FRMG and MWEs
    • at the level of SxPipe (named entities and some frozen expressions such as complex subordinating conjunctions, csu)
    • at the level of the parser (+ metagrammar): predicative nouns and light verbs
    • at the level of the metagrammar: idiomatic expressions
    • at the disambiguation level (terms)
    • conversion issues for output schemas with different notions and lists of MWEs