A tour of FRMG, a French (Meta)Grammar

This set of pages (book) is being developed as a 2-hour tutorial to be delivered at the PARSEME-COST workshop (Dubrovnik, 26-27 September 2016).

It will also serve as an English introduction to FRMG and this wiki. Comments are welcome!

It should cover the following points:

a brief description of FRMG and a few words about the FRMG wiki
a brief description of Tree Adjoining Grammars (TAGs): notion of trees, tree operations, pros and cons of large TAGs
a presentation of meta-grammars as a solution for designing large-coverage grammatical descriptions
- modularity and elementary constraints to ease grammatical descriptions
- inheritance hierarchy
- elementary constraints
  - nodes
  - the class itself (desc)
  - equality
  - precedence
  - dominance (father and ancestor)
  - node and class features
  - equations
  - anonymous nodes
  - macros and other short notations
- resource producers/consumers
- guards as complex constraints
- browsing the classes
getting more compact grammars through tree factorization
- disjunction
- guards
- interleaving or free node ordering
- repetition (Kleene star)
browsing the grammar
- in frmgwiki
- some statistics about the trees
- hypertags to link trees and anchors
playing with the resulting parser
- trying a few sentences
- playing with disambiguation
- highlighting edges
- the preprocessing steps
  - tokenizing with SxPipe
  - the Lefff lexicon and the FRMG lexer
- installing the Alpage processing chain
disambiguation and beyond
- shared forest
- derivations vs dependencies
- hand-crafted disambiguation rules
- injecting some knowledge
the hard life: how to reconcile coverage, accuracy, and efficiency!
- efficiency
  - a few sources of inefficiency (parser & disambiguation)
  - using TIGs
  - factorization
  - lexicalization
  - left-corner (lctag)
  - restrictions
  - guiding (by self training)
    - tagging
    - hypertagging
    - left-corner restrictions
  - a few stats
- coverage
  - using test suites and regression testing
  - using error mining
  - using robust partial parsing
  - using correction rules
- accuracy
  - learning from the French TreeBank
  - combining with DyALog-SR, a transition-based statistical parser
  - domain adaptation with unsupervised learning (self-training)
  - feature engineering
FRMG and MWEs
- at the level of SxPipe (named entities and some frozen expressions such as complex subordinating conjunctions, csu)
- at the level of the parser (+ metagrammar): predicative nouns and light verbs
- at the level of the metagrammar: idiomatic expressions
- at the disambiguation level (terms)
- conversion issues for output schemas with different notions and lists of MWEs
Discussion(s):
- developing and maintaining a large-coverage meta-grammar
- starting a meta-grammar for a new language
- re-using meta-grammar components (hierarchy, classes)
- exploring new target formalisms
