Multiword expressions (MWEs) are a real difficulty in parsing.

The main issues are:

  • the lack of consensus on how to define and capture MWEs
  • the absence of closed lists or operational specifications of MWEs
  • a large diversity of MWE kinds: named entities, terms, locutions, idioms, ...
  • a range of situations going from frozen to semi-productive MWEs

In FRMG, these diverse situations have led to a diversity of solutions, more or less perfect, at all levels: in the meta-grammar, in the pre-parsing phases, during parsing, in the disambiguation phase, and even during conversion to some output schema.

At the level of SxPipe: named entities and some frozen expressions such as complex subordinating conjunctions (csu)

An ambiguous DAG produced by SxPipe, with an MWE reading
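
To make the idea of an ambiguous DAG concrete, here is a minimal sketch in Python. It is not SxPipe's actual output format: the node numbering, the underscore-joined form "bien_que", and the example sentence are all assumptions chosen for illustration. The lattice keeps both the token-by-token reading and the MWE reading of "bien que", and leaves the choice to later stages.

```python
from collections import defaultdict

def build_lattice(edges):
    """Index (source, target, form) edges by their source node."""
    graph = defaultdict(list)
    for src, dst, form in edges:
        graph[src].append((dst, form))
    return graph

def enumerate_readings(graph, start, end):
    """Return every path from start to end as a list of forms."""
    if start == end:
        return [[]]
    readings = []
    for dst, form in graph.get(start, []):
        for tail in enumerate_readings(graph, dst, end):
            readings.append([form] + tail)
    return readings

# Hypothetical lattice for "bien que il pleuve": nodes are positions
# between tokens, and the edge 0 -> 2 carries the MWE reading.
EDGES = [
    (0, 1, "bien"),
    (1, 2, "que"),
    (0, 2, "bien_que"),  # assumed MWE form, for illustration only
    (2, 3, "il"),
    (3, 4, "pleuve"),
]

lattice = build_lattice(EDGES)
readings = enumerate_readings(lattice, 0, 4)
for reading in readings:
    print(" ".join(reading))
```

Both readings survive in the lattice, so the parser can still reject the MWE reading when the context calls for the compositional one.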

At the level of the parser (+ metagrammar): predicative nouns and light verbs

At the level of the metagrammar: idiomatic expressions

The French Treebank (FTB) also contains "Fin de l'apartheid oblige, ...", which raises the question of where the limit lies between a locution like "noblesse oblige" and a productive construction "N oblige".

Quoted constructions are also interesting for some specific named entities, but there is no clear solution when there are no quotes.

At the disambiguation level

Terms and disambiguation rules (favoring the longest expressions)
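
The preference for longest expressions can be sketched as a simple scoring rule. This is only an illustration under stated assumptions, not FRMG's actual disambiguation rules: the function names, the `(start, end, form)` edge encoding, and the bonus value are all hypothetical.

```python
def mwe_bonus(reading, bonus_per_extra_token=1.0):
    """Score a reading: each edge earns a bonus for every token it
    covers beyond the first, so longer expressions score higher."""
    return sum(
        bonus_per_extra_token * (end - start - 1)
        for start, end, _form in reading
    )

def pick_reading(readings):
    """Return the reading with the highest longest-expression bonus."""
    return max(readings, key=mwe_bonus)

# Two hypothetical readings of the same three tokens,
# encoded as (start, end, form) edges over token positions.
compositional = [(0, 1, "pomme"), (1, 2, "de"), (2, 3, "terre")]
as_term = [(0, 3, "pomme_de_terre")]

best = pick_reading([compositional, as_term])
print(best)
```

In a real disambiguation phase such a bonus would be one weight among many, so that strong contextual evidence can still override the MWE reading.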

Conversion issues for output schemas

with different notions and lists of MWEs

FRMG provides outputs following several syntactic annotation schemas, such as PASSAGE, FTB/CoNLL, or the more recent Universal Dependencies (UD) schema for French. Unfortunately, these schemas all differ in their notion, list, and representation of MWEs. The conversion process should therefore handle these cases as far as possible.
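
To illustrate why the representation matters, here is a minimal sketch of one MWE rendered in two ways: merged into a single underscore-joined token, and kept as separate tokens attached to the first one with a `fixed` relation, which is the convention UD uses for fixed expressions. The function names and the example sentence are hypothetical, the merged-token style is assumed here to stand for FTB-like CoNLL data, and this is not FRMG's converter.

```python
def to_merged_token(tokens, start, end):
    """Replace tokens[start:end] by one underscore-joined token."""
    merged = "_".join(tokens[start:end])
    return tokens[:start] + [merged] + tokens[end:]

def to_fixed_relations(tokens, start, end):
    """Keep the tokens separate and attach every non-initial MWE
    component to the first one; returns (dependent, head, label)
    triples with 1-based token indices."""
    head = start + 1
    return [(i + 1, head, "fixed") for i in range(start + 1, end)]

# Hypothetical sentence in which "à cause de" is treated as an MWE.
tokens = ["il", "vient", "à", "cause", "de", "toi"]

merged = to_merged_token(tokens, 2, 5)
fixed = to_fixed_relations(tokens, 2, 5)
print(merged)
print(fixed)
```

The two outputs do not even agree on the number of tokens, so a converter has to know, for each target schema, both which expressions count as MWEs and how they are encoded.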

Some borderline cases
