Macaon
Macaon sits midway between two trends: on the one hand, projects to establish corpus annotation standards and, on the other, projects to develop NLP processing chains.





Description

Like other NLP architectures, the Macaon processing chain is modular. Linguistic analysis (enriching texts with linguistic annotations) is performed in several steps, each corresponding roughly to one module. Modules exchange data through an XML representation of texts and annotations. XML schemas define the features of the different modules, as well as their input and output formats and characteristics.
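To make the modular design concrete, here is a minimal sketch of two modules, a sentence segmenter and a tokenizer, that communicate only through an XML document. It is an illustration, not Macaon's actual format: the element and attribute names (`document`, `text`, `sentences`, `tokens`, `start`, `end`) and the function names are assumptions, and the segmentation and tokenization rules are deliberately naive.

```python
import re
import xml.etree.ElementTree as ET


def make_document(text):
    """Wrap raw text in a hypothetical XML exchange document."""
    doc = ET.Element("document")
    ET.SubElement(doc, "text").text = text
    return ET.tostring(doc, encoding="unicode")


def segment_sentences(xml_in):
    """Module 1: read <text>, add a <sentences> layer with character offsets."""
    doc = ET.fromstring(xml_in)
    text = doc.findtext("text")
    layer = ET.SubElement(doc, "sentences")
    # Naive rule (assumption): a sentence runs up to the next '.', '!' or '?'.
    for i, match in enumerate(re.finditer(r"\S[^.!?]*[.!?]?", text)):
        ET.SubElement(
            layer,
            "sentence",
            id=f"s{i}",
            start=str(match.start()),
            end=str(match.end()),
        )
    return ET.tostring(doc, encoding="unicode")


def tokenize(xml_in):
    """Module 2: read <text> and <sentences>, add a <tokens> layer."""
    doc = ET.fromstring(xml_in)
    text = doc.findtext("text")
    layer = ET.SubElement(doc, "tokens")
    for sentence in doc.findall("./sentences/sentence"):
        offset = int(sentence.get("start"))
        span = text[offset:int(sentence.get("end"))]
        # Naive rule (assumption): a token is a word or one punctuation mark.
        for match in re.finditer(r"\w+|[^\w\s]", span):
            token = ET.SubElement(
                layer,
                "token",
                sentence=sentence.get("id"),
                start=str(offset + match.start()),
                end=str(offset + match.end()),
            )
            token.text = match.group()
    return ET.tostring(doc, encoding="unicode")


# Chain the modules: each step consumes and produces an XML string.
xml_out = tokenize(segment_sentences(make_document("Macaon is modular. It uses XML.")))
```

Each module only adds an annotation layer and leaves the existing ones untouched, so a later step (a tagger, for example) could be appended to the chain without changing the earlier ones. In a real setting, the input and output of each module would be validated against its XML schema.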

Ongoing work focuses on implementing a collection of modules that perform sentence segmentation, tokenization, named entity recognition, lexical analysis, morphosyntactic tagging, and surface parsing.