Botanic Corpus
This page provides access to the different versions of the two corpora we are studying.
Corpus on Polynesian Flora
text version
with a few typos (numbering, field names, ...) and an encoding problem affecting some non-ASCII characters.
(very preliminary) HTML version
Converted using StarOffice.
XML version
Converted with Perl scripts by Stéphanie Balva.
(preliminary) Web interface
Developed by Stéphanie Balva.
(very preliminary) Part-of-speech version
Produced with TreeTagger.
(very preliminary) Morphology
Produced with Flemm.
(preliminary) Morphology
Produced by Lionel Clément with TreeTagger (retrained on French) and Flemm.
text version of the glossary
with an encoding problem affecting some non-ASCII characters.
Corpus on Cameroon Flora
a sample in PDF
(very preliminary) text version
with many OCR-induced typos that still need to be corrected.