START Conference Manager    

Scalable Discriminative Parsing for German

Yannick Versley and Ines Rehbein

11th International Conference on Parsing Technology (IWPT 2009)
Paris, France, 7th-9th October, 2009


Summary

Generative lexicalized parsing models, the mainstay of probabilistic parsing for English, do not perform as well on languages with different properties, such as free(r) word order or rich morphology combined with case syncretism.

Unlexicalized PCFG parsing with annotated treebank grammars has been shown to improve performance for German and other non-English languages, while the generative lexicalized models do not seem to be as easily adaptable to these languages.

In this paper, we show how the fine-grained control that annotated treebank grammars allow can be enriched with additional features in a factored discriminative model, gaining flexibility over generative models without suffering from sparse-data problems. We demonstrate the flexibility of the approach by integrating unsupervised PP attachment and POS-based word clusters into the parser.
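
To make the idea of a factored discriminative model concrete, here is a minimal sketch, assuming a linear model in which a tree's score is the sum of per-rule scores. The summary does not give the paper's actual feature set or training procedure, so the feature names, cluster IDs, weights, and the functions `rule_features` and `tree_score` are all hypothetical.

```python
from collections import Counter

def rule_features(parent, children, head_word, clusters):
    """Features for one rule application (feature names are hypothetical)."""
    feats = Counter()
    # Feature for the annotated treebank rule itself.
    feats["rule=" + parent + " -> " + " ".join(children)] += 1
    # Additional feature: parent label paired with the head word's cluster.
    cluster = clusters.get(head_word, "UNK")
    feats["parent=" + parent + "|cluster=" + cluster] += 1
    return feats

def tree_score(rule_applications, weights, clusters):
    """Factored score: the sum of linear scores of all rule applications."""
    total = 0.0
    for parent, children, head_word in rule_applications:
        feats = rule_features(parent, children, head_word, clusters)
        for name, value in feats.items():
            total += weights.get(name, 0.0) * value
    return total

# Hypothetical cluster assignments and weights, for illustration only.
clusters = {"Hund": "C17", "Katze": "C17"}
weights = {"rule=NP-OA -> ART NN": 1.5, "parent=NP-OA|cluster=C17": 0.5}

tree = [("NP-OA", ["ART", "NN"], "Katze")]
score = tree_score(tree, weights, clusters)
```

Because the second feature refers to a cluster rather than to the word itself, two words in the same cluster share one weight. This is one way such a model could sidestep sparse lexical statistics.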

