10 Header and metadata
The global maf element is introduced as the root element that
encapsulates morpho-syntactic annotations and carries global metadata
about the annotated documents.
Two MAF-specific metadata categories are introduced for the token
standoff notation, namely the document and addressing
attributes. The document attribute identifies the annotated document
by a URI. The addressing attribute indicates the addressing
scheme used to refer to positions in the annotated document. A full list
of such schemes will be provided in the ISO 24612 proposal
“Linguistic Annotation Framework” (LAF). The following
fragment illustrates the use of these attributes for a video document:
<maf document="interview.mpeg" addressing="mpeg7">
<token id="t0"
from="T00:01:16:4484F30000"
to="T00:01:16:14494F30000"
transcription="mister"/>
<wordForm tokens="t0" lemma="mister"> ... </wordForm>
...
</maf>
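For a plain-text document, the same two attributes could be used with
character offsets. The fragment below is only a sketch: the file name
novel.txt, the offset values and the lemma are hypothetical, and it
assumes that from and to count characters from the start of the
document. The exact conventions are left to the addressing scheme.
<maf document="novel.txt" addressing="char_offset">
<token id="t0"
from="0"
to="6"/>
<wordForm tokens="t0" lemma="mister"> ... </wordForm>
...
</maf>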
The other, non-mandatory metadata follow the
recommendations of the OLAC Metadata Standard and should
therefore be included in an olac:olac element, as in the following fragment:
<maf document="http://abu.cnam.fr/cgi-bin/donner_abu?tdm80j2"
addressing="char_offset">
<olac:olac xmlns:olac="http://www.language-archives.org/OLAC/1.0/"
xmlns="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.language-archives.org/OLAC/1.0/
http://www.language-archives.org/OLAC/1.0/olac.xsd">
<creator>MySuperMorphoTool</creator>
<created>2005-09-30</created>
<hasVersion>1.1</hasVersion>
<identifier>TDM80MAF.1.1</identifier>
<replaces>TDM80MAF.1.0</replaces>
<requires>http://abu.cnam.fr/cgi-bin/donner_abu?tdm80j2</requires>
<language xsi:type="olac:language" olac:code="fr">French</language>
<publisher>MyInstitution</publisher>
<title xml:lang="fr">Le Tour du Monde en 80 Jours version MAF</title>
<abstract xml:lang="en">A set of MAF annotations for Jules Verne's famous novel</abstract>
<rightsHolder>MyInstitution</rightsHolder>
<license>LGPL-LR</license>
</olac:olac>
...
</maf>
10.1 Formal description
start =
element maf {
( maf.document,
maf.addressing )? ,
tagset ?,
maf.metadata ?,
maf.flow
}
maf.document = attribute document { xsd:anyURI }
maf.addressing = attribute addressing { xsd:NMTOKEN }
maf.metadata |= notAllowed # to be imported from OLAC
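Because maf.document and maf.addressing form a single optional group,
the two attributes are expected to appear together or not at all. The
following fragments sketch this reading of the pattern. The file name
is hypothetical and the content of the maf element is elided.
<!-- accepted: both attributes present -->
<maf document="novel.txt" addressing="char_offset"> ... </maf>
<!-- accepted: both attributes absent -->
<maf> ... </maf>
<!-- rejected: document given without addressing -->
<maf document="novel.txt"> ... </maf>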
The complete list of addressing schemes allowed by MAF will be
inherited from the ISO 24612 document on the Linguistic Annotation Framework
(LAF). A possible list of such schemes could include:
- TEI ptrs,
- XML XPointers,
- character offsets (depending on the original document encoding),
- MPEG7 multimedia addressing (MediaTimePointType).