A tour of FRMG, a French (Meta)Grammar
This set of pages (book) is being developed as a two-hour tutorial to be delivered at the PARSEME-COST workshop (Dubrovnik, 26-27 September 2016).
It will also serve as an English introduction to FRMG and this wiki. Comments are welcome!
It should cover the following points:
- a brief description of FRMG and a few words about FRMG wiki
- a brief description of Tree Adjoining Grammars (TAGs): notion of trees, tree operations, pros and cons of large TAGs
- a presentation of meta-grammars as a solution for designing large-coverage grammatical descriptions
- modularity and elementary constraints to ease grammatical descriptions
- inheritance hierarchy
- elementary constraints
- nodes
- the class itself (desc)
- equality
- precedence
- dominance (father and ancestor)
- node and class features
- equations
- anonymous nodes
- macros and other short notations
- resource producers/consumers
- guards as complex constraints
- browsing the classes
- getting more compact grammars through tree factorization
- disjunction
- guards
- interleaving or free node ordering
- repetition (Kleene star)
- browsing the grammar
- in frmgwiki
- some statistics about the trees
- hypertags to link trees and anchors
- playing with the resulting parser
- trying a few sentences
- playing with disambiguation
- highlighting edges
- the preprocessing steps
- tokenizing with SxPipe
- the Lefff lexicon and the FRMG lexer
- installing the Alpage processing chain
- disambiguation and beyond
- shared forest
- derivations vs dependencies
- hand-crafted disambiguation rules
- injecting some knowledge
- the hard life: how to reconcile coverage, accuracy, and efficiency!
- efficiency
- a few sources of inefficiency (parser & disambiguation)
- using TIGs
- factorization
- lexicalization
- left-corner (lctag)
- restrictions
- guiding (by self training)
- tagging
- hypertagging
- left-corner restrictions
- a few stats
- coverage
- using test suites and regression testing
- using error mining
- using robust partial parsing
- using correction rules
- accuracy
- learning from the French TreeBank
- combining with DyALog-SR, a transition-based statistical parser
- domain adaptation with unsupervised learning (self-training)
- feature engineering
- efficiency
- FRMG and MWEs
- at the level of SxPipe (named entities and some frozen expressions such as complex csu)
- at the level of the parser (+ metagrammar): predicative nouns and light verbs
- at the level of the metagrammar : idiomatic expressions
- at disambiguation level (terms)
- the conversion issues for output schemas with different notions and lists of MWEs
- Discussion(s):
- developing and maintaining a large-coverage meta-grammar
- starting a meta-grammar for a new language
- re-using meta-grammar components (hierarchy, classes)
- exploring new target formalisms