START Conference Manager |
Unlexicalized PCFG parsing with annotated treebank grammars has been shown to improve performance for German and other non-English languages, whereas generative lexicalized models do not seem to adapt as easily to these languages.
In this paper, we show how the fine-grained control afforded by annotated treebank grammars can be enriched with additional features in a factored discriminative model, gaining flexibility over generative models without suffering from sparse data problems. We demonstrate the flexibility of the approach by integrating unsupervised PP attachment and POS-based word clusters into the parser.