Language Resource Management – Morpho-syntactic Annotation Framework (MAF)
Contents
Foreword
Introduction
Scope
Normative references
Terms and definitions
Key standards used by MAF
ISO 12620 Data Category Registry (DCR)
ISO 24610 Feature Structures (FSR and FSD)
OLAC Metadata
Unified Modeling Language (UML)
General characteristics of MAF
Overview
MAF Meta-Model
Segmenting with tokens
Standoff notation
Embedding notation
Informative attributes
Completing the embedding token notation
Joining tokens
Overlapping tokens
Formal description: token
Word Forms as linguistic units
Token attachment
One token; one word form
Several contiguous tokens; one word form
Several discontinuous tokens; one word form
Zero token; one word form
One token; several word forms
Referring to lexicon entries
Compound word forms
Formal description: wordForm
Morpho-syntactic content
Using feature structures
Compact morpho-syntactic tags
FSR libraries
Designing tagsets
Formal description: tagset
Handling ambiguities
Word form Content Ambiguities
Lexical Ambiguities
Structural Ambiguities
Structural ambiguities over word forms
Structural ambiguities over tokens
Simplified structuring variants
Non-ambiguous linear representation
Mixed linear and lattice representation
Expanding the simplified variants
Separating tokens and word forms
Wrapping into local lattices
Merging local lattices
Removing wfAlt
Formal description: wfAlt and fsm
Header and metadata
Formal description
(informative) RELAX NG compact schema
Validating MAF documents
(informative) DTD
(informative) Illustrative examples
Tagsets
Demonstrator
(illustrative) Morpho-syntactic Data Categories
(informative) UML notions used within MAF
Introduction
The notion of class
The notion of attribute
The notion of relationship
The notion of association
The notion of aggregation
The notion of generalization
The notion of instance
The notion of package
Graphical notations
References