START Conference Manager    

Parsing Formal Languages using Natural Language Parsing Techniques

Jens Nilsson, Welf Löwe, Johan Hall and Joakim Nivre

11th International Conference on Parsing Technology (IWPT 2009)
Paris, France, 7th-9th October, 2009


Summary

Program analysis tools used in software maintenance must be robust and ought to be accurate; classical parsers are accurate but not robust. Data-driven parsing approaches are both 100% robust and highly accurate. We apply data-driven dependency parsing to software. The accuracy observed for programming languages such as Java, C/C++, and Python is over 90%, well above the accuracy for natural languages. Further experiments indicate that post-processing can (almost) completely remove the remaining errors. Finally, the training data for instantiating the generic parser can be generated automatically for formal languages, as opposed to the manually developed treebanks used for natural languages. Altogether, this approach improves the robustness of software maintenance tools without a significant negative effect on their accuracy.

