Invited Talk by John Carroll
Moving Parsing into the Real World: Noisy Text, Grammatical Representations and Applications
Much recent research in natural language parsing takes as input
carefully crafted, edited text, often from newspapers. However, many
real-world applications involve processing text which is not written
carefully by a native speaker, is produced for an eventual audience
of only one, and is in essence ephemeral. In this talk I will
present a number of research and commercial applications of this
type which I and collaborators are developing, in which we parse
text as diverse as mobile phone text messages, non-native language
learner essays, internet chat, and primary care medical notes. I
will discuss the problems these types of text pose for a parser, and
outline how we integrate information from parsing into applications.
Coffee Break and Poster Display
Chair: Éric de la Clergerie
10:45--11:15
Parsing Algorithms based on Tree Automata
Andreas Maletti and Giorgio Satta PDF|BibTeX
We investigate several algorithms related to the parsing problem for weighted
automata, under the assumption that the input is a string rather than a tree.
This assumption is motivated by several natural language processing
applications.
We provide algorithms for the computation of parse-forests, best tree
probability, inside probability (called partition function), and prefix
probability. Our algorithms are obtained by extending to weighted tree
automata the Bar-Hillel technique, as defined for context-free grammars.
11:15--11:45
Weighted parsing of trees
Mark-Jan Nederhof PDF|BibTeX
We show how parsing of trees can be formalized in terms of the intersection of
two tree languages. The focus is on weighted regular tree grammars and
weighted tree adjoining grammars. Potential applications are discussed, such
as parameter estimation across formalisms.
11:45--12:20
Short Paper Session I (Chair: Alexis Nasr)
Automatic Adaptation of Annotation
Standards for Dependency Parsing -- Using Projected Treebank as Source
Corpus Wenbin Jiang and Qun Liu PDF|BibTeX
We describe an annotation adaptation strategy for dependency parsing,
which automatically transfers knowledge from a source corpus
with a different annotation standard to the desired target parser,
under the supervision of a target corpus annotated in the desired
standard. Furthermore, instead of a hand-annotated corpus, a projected
treebank derived from a bilingual corpus is used as the source corpus.
This benefits resource-scarce languages that lack multiple
hand-annotated treebanks. Experiments show that the target parser gains
a significant improvement over the baseline parser trained on the target
corpus only, especially when the target corpus is small.
Learning Stochastic Bracketing Inversion Transduction Grammars with a Cubic Time Biparsing Algorithm
Markus Saers, Joakim Nivre and Dekai Wu PDF|BibTeX
We present a method for heavily pruning the estimation of Stochastic Bracketing
Inversion Transduction Grammars.
The estimated grammars are evaluated by building a standard Phrase-based
statistical MT system on top of the alignments dictated by the Viterbi parse of
the grammar.
The trade-off between speed and translation quality is discussed, and compared
to a standard word alignment system.
Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars
Anders Søgaard and Dekai Wu PDF|BibTeX
Empirical lower-bound studies, in which the frequency of alignment
configurations that cannot be induced by a particular formalism is estimated,
have been important for the development of syntax-based machine translation
formalisms. The formalism that has received the most attention is inversion
transduction grammars (ITGs) (Wu, 1997). All previous work on the coverage of
ITGs, however, concerns parse failure rates (PFRs) or sentence-level coverage,
which is not directly related to any of the evaluation measures used in machine
translation. Søgaard and Kuhn (2009) induce lower bounds on translation unit
error rates (TUERs) for a number of formalisms, including normal-form ITGs, but
not for the full class of ITGs. Many of the alignment configurations that
cannot be induced by normal-form ITGs can, however, be induced by unrestricted
ITGs. This paper estimates the difference and shows that the average reduction
in lower bounds on TUER is 2.48 in absolute difference (16.01 in average parse
failure rate).
Lunch
Chair: Harry Bunt
14:00--14:30
Predictive Text Entry using Syntax and Semantics
Sebastian Ganslandt, Jakob Jörwall and Pierre Nugues PDF|BibTeX
Most cellular telephones use numeric keypads, where texting is supported by
dictionaries and frequency models. Given a key sequence, the entry system
recognizes the matching words and proposes a rank-ordered list of candidates.
The ranking quality is instrumental to an effective entry.
This paper describes a new method to enhance entry that combines syntax and
language models. We first investigate components to improve the ranking step:
language models and semantic relatedness. We then introduce a novel syntactic
model to capture the word context, optimize ranking, and then reduce the number
of keystrokes per character (KSPC) needed to write a text. We finally combine
this model with the other components and we discuss the results.
We show that our syntax-based model reaches an error reduction in KSPC of
12.4\% on a Swedish corpus over a baseline using word frequencies. We also show
that bigrams are superior to all the other models. However, bigrams have a
memory footprint that is unfit for most devices. Nonetheless, bigrams can be
further improved by the addition of syntactic models with an error reduction
that reaches 29.4\%.
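The word-frequency baseline that the syntactic and semantic models improve on can be sketched as follows (a toy illustration with an invented mini-dictionary, not the paper's Swedish system): words matching a keypad digit sequence are ranked by unigram frequency.

```python
# Toy sketch of a frequency-ranked predictive text baseline.
# The dictionary and frequencies below are invented for illustration.
KEYPAD = {c: d for d, letters in {
    '2': 'abc', '3': 'def', '4': 'ghi', '5': 'jkl',
    '6': 'mno', '7': 'pqrs', '8': 'tuv', '9': 'wxyz'}.items()
    for c in letters}

def key_sequence(word):
    """Digit sequence a user types for `word` on a numeric keypad."""
    return ''.join(KEYPAD[c] for c in word)

def candidates(keys, unigram_freq):
    """Words matching the key sequence, most frequent first."""
    matches = [w for w in unigram_freq if key_sequence(w) == keys]
    return sorted(matches, key=unigram_freq.get, reverse=True)

freq = {'hello': 120, 'gekko': 2, 'hells': 15}
# 'hello' and 'gekko' share the key sequence 43556; 'hells' does not.
print(candidates('43556', freq))
```

A syntax- or semantics-aware system would rescore this candidate list using the sentence context rather than frequency alone.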
14:30--15:00
Parsing Formal Languages using Natural Language Parsing Techniques
Jens Nilsson, Welf Löwe, Johan Hall and Joakim Nivre PDF|BibTeX
Program analysis tools used in software maintenance must be robust and ought to
be accurate; classical parsers are accurate but not robust. Data-driven
parsing approaches are both fully robust and highly accurate. We apply data-driven
dependency parsing to software. The accuracy observed for programming languages
such as Java, C/C++, and Python is over 90%, well above the accuracy for natural
languages. Further experiments indicate that post-processing can (almost)
completely remove the remaining errors. Finally, the training data for
instantiating the generic parser can be generated automatically for formal
languages, as opposed to the manually developed treebanks used for natural
languages. Altogether, this approach improves the robustness of software
maintenance tools without a significant negative effect on their
accuracy.
15:00--16:00
Short Paper Session II (Chair: Pierre Boullier)
An Incremental Earley Parser for Simple Range Concatenation Grammar
Laura Kallmeyer and Wolfgang Maier PDF|BibTeX
We present an Earley-style parser for simple range concatenation grammar, a
formalism strongly equivalent to linear context-free rewriting systems.
Furthermore, we present different filters which reduce the number of items in
the parsing chart. An implementation shows that parses can be obtained in a
reasonable time.
Deductive Parsing in Interaction Grammars
Joseph Le Roux PDF|BibTeX
We present a parsing algorithm for Interaction Grammars using the deductive
parsing framework. This approach brings new perspectives to this problem,
departing from previous methods which rely on constraint-solving techniques.
Synchronous Rewriting in Treebanks
Laura Kallmeyer, Wolfgang Maier and Giorgio Satta PDF|BibTeX
Several formalisms have been proposed for modeling trees with discontinuous
phrases. Some of these formalisms allow for synchronous rewriting. However, it
is unclear whether synchronous rewriting is a necessary feature. This is an
important question, since synchronous rewriting greatly increases parsing
complexity. We present a characterization of recursive synchronous rewriting in
constituent treebanks with discontinuous annotation. An empirical investigation
reveals that synchronous rewriting is actually a necessary feature.
Furthermore, we transfer this property to grammars extracted from treebanks.
An Improved Oracle for Dependency Parsing with Online Reordering
Joakim Nivre, Marco Kuhlmann and Johan Hall PDF|BibTeX
We present an improved training strategy for dependency parsers that use online
reordering to handle non-projective trees. The new strategy improves both
efficiency and accuracy by reducing the number of swap operations performed on
non-projective trees by up to 80%. We present state-of-the-art results for five
languages with the best ever reported results for Czech.
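As a rough illustration of the quantity being reduced (our own simplification, not the paper's oracle): an eager swap-based oracle performs one swap transition per inversion between the surface word order and the projective order, and inversions can be counted directly.

```python
# Toy illustration (not the paper's training strategy): count the
# adjacent transpositions an eager swap oracle would perform to turn
# the surface order 0..n-1 into a given projective order.
def eager_swap_count(projective_order):
    """Number of inversions, i.e. adjacent swaps (bubble-sort style),
    needed to sort the positions in `projective_order`."""
    order = list(projective_order)
    swaps = 0
    for _ in range(len(order)):
        for j in range(len(order) - 1):
            if order[j] > order[j + 1]:
                order[j], order[j + 1] = order[j + 1], order[j]
                swaps += 1
    return swaps

print(eager_swap_count([0, 1, 2, 3]))  # projective: 0 swaps
print(eager_swap_count([0, 2, 1, 3]))  # one crossing: 1 swap
```

The improved oracle in the paper delays swaps so that many of these transpositions never have to be performed.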
Two stage constraint based hybrid approach to free word order language dependency parsing
Akshar Bharati, Samar Husain, Dipti Misra and Rajeev Sangal PDF|BibTeX
The paper describes the overall design of a new two-stage constraint-based
hybrid approach to dependency parsing. We define the two stages and show how
different grammatical constructs are parsed at the appropriate stages. This
division leads to selective identification and resolution of specific dependency
relations at the two stages. Furthermore, we show how the use of hard
constraints and soft constraints helps us build an efficient and robust hybrid
parser. Finally, we evaluate the implemented parser on Hindi and compare the
results with those of two data-driven dependency parsers.
Coffee Break and Poster Display
16:35--17:00
Short Paper Session III (Chair: Harry Bunt)
Analysis of Discourse Structure with Syntactic Dependencies and Data-Driven Shift-Reduce Parsing
Kenji Sagae PDF|BibTeX
We present an efficient approach for discourse parsing within and across
sentences, where the unit of processing is an entire document, and not a single
sentence. We apply shift-reduce algorithms for dependency and constituent
parsing to determine syntactic dependencies for the sentences in a document,
and subsequently a Rhetorical Structure Theory (RST) tree for the entire
document. Our results show that our shift-reduce framework achieves high
accuracy and a large improvement in efficiency compared to a state-of-the-art
dynamic programming approach.
Evaluating Contribution of Deep Syntactic Information to Shallow Semantic Analysis
Sumire Uematsu and Jun'ichi Tsujii PDF|BibTeX
This paper presents shallow semantic parsing based on only HPSG parses. We
constructed an HPSG-FrameNet map from a semantically annotated corpus, and
performed semantic parsing by mapping HPSG dependencies to FrameNet relations.
The semantic parsing was evaluated on a Senseval-3 task, and the promising
results suggested a high contribution of syntactic information to semantic
analysis.
Chair: Stephen Clark
17:00--17:30
Weight Pushing and Binarization for Fixed-Grammar Parsing
Matt Post and Daniel Gildea PDF|BibTeX
We apply the idea of weight pushing (Mohri, 1997) to CKY parsing with
fixed context-free grammars. Applied after rule binarization, weight
pushing takes the weight from the original grammar rule and pushes it
down across its binarized pieces, allowing the parser to make better
pruning decisions earlier in the parsing process. This process can be
viewed as generalizing weight pushing from transducers to hypergraphs.
We examine its effect on parsing efficiency with various binarization
schemes applied to tree substitution grammars from previous work. We
find that weight pushing produces dramatic improvements in efficiency,
especially with small amounts of time and with large grammars.
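A minimal sketch of the pushing idea (our own naive variant, with invented symbol names, that moves the entire rule weight to the earliest binarized piece; the paper pushes weights more carefully, in the spirit of Mohri-style weight pushing):

```python
# Sketch: binarize a weighted CFG rule left-to-right and attach the
# original rule weight to the first binarized piece (push=True) rather
# than the last (the naive placement), so a pruning parser sees the
# full cost as early as possible.
def binarize(lhs, rhs, weight, push=True):
    """Binarize (lhs -> rhs, weight) into a list of
    (lhs, (left, right), weight) binary rules."""
    rules = []
    prev = rhs[0]
    for i, sym in enumerate(rhs[1:], start=1):
        new_lhs = lhs if i == len(rhs) - 1 else f'{lhs}_{i}'
        rules.append((new_lhs, (prev, sym), 1.0))
        prev = new_lhs
    idx = 0 if push else -1          # where the original weight goes
    head, pair, _ = rules[idx]
    rules[idx] = (head, pair, weight)
    return rules

print(binarize('A', ('B', 'C', 'D'), 0.5))
```

Either placement preserves the product of weights along the derivation; only the point at which the cost becomes visible to the pruner changes.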
17:30--18:00
Co-Parsing with Competitive Models
Lidia Khmylko, Kilian A. Foth and Wolfgang Menzel PDF|BibTeX
We present an asymmetric approach to a run-time combination of two parsers
where one component serves as a predictor to the other one. Predictions are
integrated by means of weighted constraints and therefore are subject to
preferential decisions. Previously, the same architecture has been successfully
used with predictors providing partial or inferior information about the
parsing problem. It has now been applied to a situation where the predictor
produces exactly the same type of information at a fully competitive quality
level. Results show that the combined system outperforms its individual
components, even though their
performance in isolation is already fairly high.
Thursday, October 8, 2009
9:00--10:00
Invited Talk by Mark Johnson
Learning Rules with Adaptor Grammars
Nonparametric Bayesian methods are interesting because they may
provide a way of learning the appropriate units of generalization
(i.e., the "rules" of a grammar) as well as the generalization's
probability or weight (i.e., the rule's probability). Adaptor Grammars
are a framework for stating a variety of hierarchical nonparametric
Bayesian models, where the units of generalization can be viewed as
kinds of PCFG rules. This talk describes the mathematical and
computational properties of Adaptor Grammars and linguistic
applications such as word segmentation, syllabification and named
entity recognition. The latter part of the talk reviews MCMC inference
and describes the MCMC algorithms we use to sample adaptor grammars.
Coffee Break and Poster Display
Chair: Mark-Jan Nederhof
10:30--11:00
Capturing Consistency between Intra-clause and Inter-clause Relations in Knowledge-rich Dependency and Case Structure Analysis
Daisuke Kawahara and Sadao Kurohashi PDF|BibTeX
We present a method for dependency and case structure analysis that
captures the consistency between intra-clause relations (i.e., case
structures or predicate-argument structures) and inter-clause
relations. We assess intra-clause relations on the basis of case
frames and inter-clause relations on the basis of inference knowledge
between case frames. Both knowledge bases are automatically acquired
from a massive amount of parses of a Web corpus. The significance of
this study is that the proposed method selects the best dependency and
case structure that are consistent within each clause and between
clauses. We confirm that this method contributes to the improvement of
dependency parsing of Japanese.
11:00--11:30
Constructing parse forests that include exactly the n-best PCFG trees
Pierre Boullier, Alexis Nasr and Benoît Sagot PDF|BibTeX
This paper describes and compares two algorithms that take as input a shared
PCFG parse forest and produce shared forests that contain exactly the n most
likely trees of the initial forest. Such forests are suitable for subsequent
processing, such as (some types of) reranking or LFG f-structure computation,
that can be performed on top of a shared forest, but that may have a high (e.g.,
exponential) complexity w.r.t. the number of trees contained in the forest. We
evaluate the performance of both algorithms on real-scale NLP forests
generated with a PCFG extracted from the Penn Treebank.
11:30--12:30
Short Paper Session IV (Chair: Djamé Seddah)
Hebrew Dependency Parsing: Initial Results
Yoav Goldberg and Michael Elhadad PDF|BibTeX
We describe a newly available Hebrew Dependency Treebank, which is
extracted from the Hebrew (constituency) Treebank.
We establish some baseline unlabeled dependency parsing performance on
Hebrew, based on two state-of-the-art parsers, MST-parser and
MaltParser.
The evaluation is performed both in an artificial setting, in which
the data is assumed to be properly morphologically segmented and
POS-tagged, and in a real-world setting, in which the parsing is
performed on automatically segmented and POS-tagged text.
We present an evaluation measure that takes into account the
possibility of incompatible token segmentation between the gold
standard and the parsed data.
Results indicate that (a) MST-parser performs better on Hebrew data
than MaltParser, and (b) neither parser makes good use of
morphological information when parsing Hebrew.
Scalable Discriminative Parsing for German
Yannick Versley and Ines Rehbein PDF|BibTeX
Generative lexicalized parsing models, which are the mainstay
for probabilistic parsing of English, do not
perform as well when applied to languages with different
language-specific properties such as free(r) word order
or rich morphology in combination with case syncretism.
Unlexicalized PCFG parsing with annotated treebank grammars has been
shown to improve performance for German and other non-English languages,
while the generative lexicalized models do not seem to be as easily
adaptable to these languages.
In this paper, we show how the fine-grained control that annotated
treebank grammars allow can be enriched with additional features in a
factored discriminative model, gaining additional flexibility with
respect to generative models without having to suffer from sparse data
problems. We demonstrate the flexibility of the approach by
integrating unsupervised PP attachment and POS-based word clusters
into the parser.
Improving generative statistical parsing with semi-supervised word clustering
Marie Candito and Benoît Crabbé PDF|BibTeX
We present a semi-supervised method to improve statistical parsing performance.
We focus on the well-known problem of lexical data sparseness and present
experiments of word clustering prior to parsing. We use a combination of
lexicon-aided morphological clustering that preserves tagging ambiguity, and
unsupervised word clustering, trained on a large unannotated corpus. We apply
these clusterings to the French Treebank and train with the unlexicalized
PCFG-LA algorithm of Petrov et al. (2006). We find a gain in French parsing
performance: we improve results from a baseline of F1=86.76% to F1=87.37% using
morphological clustering, and up to F1=88.29% using further unsupervised
clustering. This is the best known score for French probabilistic parsing.
These preliminary results are very encouraging for statistically parsing
morphologically rich languages and languages with small amounts of annotated
data.
Application of feature propagation to dependency parsing
Kepa Bengoetxea and Koldo Gojenola PDF|BibTeX
This paper presents a set of experiments performed on parsing the Basque
Dependency Treebank. We have applied feature propagation to dependency
parsing, experimenting with the propagation of several of the most important
morphosyntactic feature values from: a) auxiliary verbs to the main verb, in
verb phrases; b) the last constituent to the head noun, in noun phrases; c) the
last conjunct to the conjunction, in coordination. In the experiments we have
used the output of one parser to enrich the input of a second parser. Both
parsers have been generated by MaltParser, a freely available data-driven
dependency parser generator. The transformations, combined with the
pseudo-projective graph transformation, obtain a LAS of 77.12%, improving on
the best reported results for Basque.
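One propagation step of the kind listed above can be sketched as follows (toy data structures and an invented `aux` label; the actual system enriches MaltParser input for Basque):

```python
# Toy illustration: copy a morphosyntactic feature value from an
# auxiliary verb onto the main verb that governs it, as produced by a
# first-pass parse, before feeding the enriched tokens to a second parser.
def propagate_aux_feature(feats, deps, feature):
    """feats: token id -> feature dict.
    deps: (head, dependent, label) triples from a first-pass parse.
    Returns feats with `feature` copied from 'aux' dependents to heads."""
    for head, dep, label in deps:
        if label == 'aux' and feature in feats[dep]:
            feats[head][feature] = feats[dep][feature]
    return feats

# Main verb (id 1) lacks agreement; the auxiliary (id 2) carries it.
feats = {1: {'lemma': 'ikusi'}, 2: {'lemma': 'du', 'agr': '3sg'}}
deps = [(1, 2, 'aux')]
print(propagate_aux_feature(feats, deps, 'agr')[1]['agr'])
```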
Guessing the Grammatical Function of a Non-Root F-Structure in LFG
Anton Bryl, Josef Van Genabith and Yvette Graham PDF|BibTeX
Lexical-Functional Grammar (Kaplan and Bresnan, 1982) f-structures are
bilexical labelled dependency representations. We show that a Naive Bayes
classifier is able to guess missing grammatical function labels (i.e. bilexical
dependency labels) with reasonably high accuracy (82-91%). In the experiments
we use f-structure parser output for English and German Europarl data,
automatically "broken" by replacing grammatical function labels with a
generic UNKNOWN label and asking the classifier to restore the label.
Lunch
Chair: Joakim Nivre
14:00--14:30
Cross parser evaluation : a French Treebanks study
Djamé Seddah, Marie Candito and Benoît Crabbé PDF|BibTeX
This paper presents preliminary investigations on the statistical
parsing of French, bringing a complete evaluation on French data
of the main probabilistic lexicalized (Charniak, Collins,
Chiang) and unlexicalized (Berkeley) parsers, first designed for the
Penn Treebank. We adapted the parsers to the two existing treebanks
of French \cite{abeille:03,schluter:07}. To our knowledge, all the
results reported here are state-of-the-art for the constituent
parsing of French on every available treebank. Regarding the
algorithms, the comparisons show that lexicalized parsing models are
outperformed by the unlexicalized Berkeley parser. Regarding the
treebanks, we observe that a tag set with specific features has a
direct influence on evaluation results, depending on the parsing model.
14:30--15:00
Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model
Yue Zhang and Stephen Clark PDF|BibTeX
Transition-based approaches have shown competitive performance on constituent
and dependency parsing of Chinese. State-of-the-art accuracies have been
achieved by a deterministic shift-reduce parsing model on parsing the Chinese
Treebank 2 data (Wang et al., 2006). In this paper, we propose a global
discriminative model based on the shift-reduce parsing process, combined with a
beam-search decoder, obtaining competitive accuracies on CTB2. We also report
the performance of the parser on CTB5 data, obtaining the highest scores in the
literature for a dependency-based evaluation.
15:00--15:25
Short Paper Session V (Chair: Valia Kordoni)
Grammar Error Detection with Best Approximated Parse
Jean-Philippe Prost PDF|BibTeX
In this paper, we propose that grammar error detection be disambiguated by
generating the connected parse(s) of optimal merit for the full input
utterance, thereby overcoming the cheapest error.
The detected error(s) are described as violated grammatical constraints in a
framework for Model-Theoretic Syntax (MTS).
We present a parsing algorithm for MTS which relies only on a grammar of
well-formedness, in that the process does not require any extra-grammatical
resources, additional rules for constraint relaxation or error handling, or any
recovery process.
The effect of correcting grammatical errors on parse probabilities
Joachim Wagner and Jennifer Foster PDF|BibTeX
We parse the sentences in three parallel error corpora using a generative,
probabilistic English parser, and for each grammatical/ungrammatical sentence
pair, we compare the parse probabilities of the most likely analyses. We
examine the effect of particular error types on parse probability.
Conference dinner (19:30 at Restaurant Le Procope)
Friday, October 9, 2009
9:00--10:00
Invited Talk by Joakim Nivre
Discontinuous Dependency Parsing
There is a strong tendency in natural language
syntax for elements that have a direct syntactic relation
to also be adjacent in the surface realization of a sentence.
Nevertheless, notable exceptions to this generalization exist in
practically all languages and are especially common in languages
with free or flexible word order. Syntactic theorists, on the one
hand, have developed a variety of representational devices for
dealing with these exceptions, including phonetically null
elements, gap threading, and non-projective dependency trees.
Syntactic parsers, on the other hand, use these devices very
restrictively since they add to the complexity of an already
daunting task. This is especially true of data-driven parsers,
where discontinuity is often simply ignored. In this talk, I will
review techniques for dealing with discontinuous structures in the
framework of dependency parsing, focusing on parsing algorithms
that build structures from non-adjacent elements and in particular
transition-based algorithms that use online reordering.
Coffee Break and Poster Display
Chair: John Carroll
10:30--11:00
Effective Analysis of Causes and Inter-dependencies of Parsing Errors
Tadayoshi Hara, Yusuke Miyao and Jun'ichi Tsujii PDF|BibTeX
In this paper, we propose two methods for analyzing errors in parsing. One is
to classify errors into categories which grammar developers can easily
associate with defects in a grammar or a parsing model, and thus with its
improvement. The other is to discover inter-dependencies among errors, so that
grammar developers can focus on the errors which are crucial for improving the
performance of a parsing model.
The first method uses patterns of errors to associate them with categories of
causes for those errors, such as errors in scope determination of coordination,
PP-attachment, identification of antecedents of relative clauses, etc. The
second method, which is based on re-parsing with one of the observed errors
corrected, assesses inter-dependencies among errors by examining which other
errors are corrected as a result when a specific error is corrected.
Experiments show that these two methods are complementary and, when combined,
can provide useful clues as to how to improve a given grammar.
11:00--11:30
Clustering Words by Syntactic Similarity improves Dependency Parsing of Predicate-argument Structures
Kenji Sagae and Andrew S. Gordon PDF|BibTeX
We present an approach for deriving syntactic word clusters from parsed text,
grouping words according to their unlexicalized syntactic contexts. We then
explore the use of these syntactic clusters in leveraging a large corpus of
trees generated by a high-accuracy parser to improve the accuracy of another
parser based on a different formalism for representing a different level of
sentence structure. In our experiments, we use phrase-structure trees to
produce syntactic word clusters that are used by a predicate-argument
dependency parser, significantly improving its accuracy.
11:30--12:30
Short Paper Session VI (Chair: Benoît Sagot)
The chunk as the period of the functions length and frequency of words on the syntagmatic axis
Jacques Vergne PDF|BibTeX
Chunking is segmenting a text into chunks, sub-sentential segments, which Abney
approximately defined as stress groups. Chunking usually uses monolingual
resources, most often exhaustive, sometimes partial: function words and
punctuation marks, which often mark the beginnings and ends of chunks. But, to
extend this method to other languages, monolingual resources have to be
multiplied. We present a new method, endogenous chunking, which uses no
resource other than the text to be segmented itself. The idea of this method
comes from Zipf: to make the least communication effort, speakers are driven to
shorten frequent words. A chunk can then be characterized as the period of the
periodic, correlated functions of word length and frequency on the syntagmatic
axis. This original method has the advantage of being applicable to a great
number of languages with alphabetic scripts, with the same algorithm, without
any resources.
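The Zipf-inspired intuition can be illustrated with a crude threshold rule (our own simplification with invented cut-offs, not the paper's periodicity-based algorithm): short, frequent words tend to open chunks.

```python
# Toy illustration: open a new chunk before each short, frequent word,
# using only statistics computed from the input text itself.
from collections import Counter

def chunk(words, max_len=3, min_freq=2):
    """Split `words` into chunks; a word of length <= max_len seen at
    least min_freq times in the text starts a new chunk."""
    freq = Counter(words)
    chunks, current = [], []
    for w in words:
        if current and len(w) <= max_len and freq[w] >= min_freq:
            chunks.append(current)
            current = []
        current.append(w)
    chunks.append(current)
    return chunks

print(chunk('the cat sat on the mat'.split()))
```

The method in the paper instead treats word length and frequency as correlated periodic functions along the sentence and takes a chunk to be one period, but the resource-free character is the same: everything is computed from the text being segmented.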
Using a maximum entropy-based tagger to improve a very fast vine parser
Anders Søgaard and Jonas Kuhn PDF|BibTeX
In this short paper, an off-the-shelf maximum entropy-based POS-tagger is used
as a partial parser to improve the accuracy of an extremely fast linear time
dependency parser that provides state-of-the-art results in multilingual
unlabeled POS sequence parsing.
HPSG Supertagging: A Sequence Labeling View
Yao-zhong Zhang, Takuya Matsuzaki and Jun'ichi Tsujii PDF|BibTeX
Supertagging is a widely used speed-up technique for deep parsing. Supertagging
has also come to be exploited in NLP tasks other than parsing, to make use of
the rich syntactic information given by the supertags. However, the performance
of the supertagger is still a bottleneck in such applications. To improve the
accuracy of supertagging, we investigated the relationship between supertagging
and parsing: we started from a sequence labeling view of HPSG supertagging,
examining how well a supertagger can do when separated from parsing. A
comparison of two types of supertagging models, a point-wise model and a
sequential model, showed that the former works competitively well despite its
simplicity, which indicates that the true dependency among supertag assignments
is far more complex than the crude first-order approximation made in the
sequence model. We then analyzed the limitations of separated supertagging by
using a CFG filter. The results showed that big gains can be obtained by
resorting to a light-weight parser.
We present an approach for smoothing treebank-PCFG lexicons with lexical
information obtained from a large unannotated corpus, by interpolation of
treebank lexical parameters with estimates obtained from unannotated data via
the inside-outside algorithm. The PCFG has complex lexical categories, making
relative-frequency estimates from a treebank very sparse. This kind of
smoothing for complex lexical categories results in improved parsing
performance, with a particular advantage in identifying obligatory arguments
subcategorized by verbs unseen in the treebank.
Wide-coverage parsing of speech transcripts
Jeroen Geertzen PDF|BibTeX
This paper discusses the performance difference of wide-coverage
parsers on small-domain speech transcripts. Two parsers (C\&C CCG
and RASP) are tested on speech from two different domains (parent-child
language, and picture descriptions).
The performance difference between the domain-independent parsers and a
domain-trained parser (MEGRASP) is substantial,
with a difference of about 30 percentage points in accuracy. Despite this gap,
some of the grammatical relations can still be recovered reliably.
This paper introduces a formal framework that presents a novel Interactive
Predictive Parsing schema which can be operated by a user, tightly integrated
into the system, to obtain error-free trees. This compares to the classical
two-step schema of manually post-editing the erroneous constituents produced by
the parsing system. We have simulated interaction with this system and
computed evaluation metrics, which established that an IPP system results in
a large reduction of effort for a manual annotator compared to a
two-step system.
Using Treebanking Discriminants as Parse Disambiguation Features
Md. Faisal Mahbub Chowdhury, Yi Zhang and Valia Kordoni PDF|BibTeX
This paper presents a novel approach to incorporating fine-grained treebanking
decisions made by human annotators as discriminative features for automatic
parse disambiguation. To the best of our knowledge, this is the first work that
exploits treebanking decisions for this task. The advantage of this approach is
that it makes direct use of human judgements. The paper presents comparative
analyses of the performance of discriminative models built using treebanking
decisions and state-of-the-art features. We also highlight how differently
these features scale when the models are tested on out-of-domain data. We show
that features extracted using treebanking decisions are more efficient,
informative, and robust compared to traditional features.
Heuristic search in a cognitive model of human parsing
John Hale PDF|BibTeX
We present a cognitive process model of human sentence comprehension based on
generalized left-corner parsing. A search heuristic based upon
previously-parsed corpora derives garden path effects, garden path paradoxes,
and the local coherence effect.
Dependency Parsing with Energy-based Reinforcement Learning
Lidan Zhang and Kwok Ping Chan PDF|BibTeX
We present a model which integrates dependency parsing with reinforcement
learning based on a Markov decision process. At each time step, a transition is
selected to construct the dependency tree in terms of the long-run reward. The
optimal policy for choosing transitions can be found with the SARSA algorithm.
In SARSA, an approximation of the state-action function can be obtained by
calculating the negative free energies of a Restricted Boltzmann Machine. The
experimental results on CoNLL-X multilingual data show that the proposed model
achieves results comparable to the current state-of-the-art methods.
A generative re-ranking model for dependency parsing
Federico Sangati, Willem Zuidema and Rens Bod PDF|BibTeX
We propose a framework for dependency parsing based on a combination of
discriminative and generative models. We use a discriminative model to obtain a
k-best list of candidate parses, and subsequently rerank those candidates using
a generative model. We show how this approach allows us to evaluate a variety
of generative models, without needing different parser implementations.
Moreover, we present empirical results that show a small improvement over
state-of-the-art dependency parsing of English sentences.
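The two-stage setup described above can be sketched minimally (our own toy scoring values, not the paper's models): a discriminative first stage proposes a k-best list, and a generative model reorders it.

```python
# Toy sketch of k-best reranking: the first-stage (discriminative)
# parser fixes the candidate set; the generative model only reorders it.
def rerank(kbest, generative_score):
    """Return the k-best candidates sorted by generative score."""
    return sorted(kbest, key=generative_score, reverse=True)

# Candidates in first-stage order, with invented generative log-probs.
kbest = ['tree_a', 'tree_b', 'tree_c']
gen = {'tree_a': -12.0, 'tree_b': -9.5, 'tree_c': -11.1}
print(rerank(kbest, gen.get))  # ['tree_b', 'tree_c', 'tree_a']
```

Because the candidate set is fixed by the first stage, swapping in a different generative model requires no change to the parser itself, which is the flexibility the abstract emphasizes.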
Coffee Break and Poster Display
Chair: Philippe Blache
15:45--16:15
Dependency Constraints for Lexical Disambiguation
Guillaume Bonfante, Bruno Guillaume and Mathieu Morey PDF|BibTeX
We propose a generic method to perform lexical disambiguation in lexicalized
grammatical formalisms. It relies on dependency constraints between words. The
soundness of the method is due to invariant properties of the parsing in a
given grammar that can be computed statically from the grammar.
16:15--16:45
Parsing Directed Acyclic Graphs with Range Concatenation Grammars
Pierre Boullier and Benoît Sagot PDF|BibTeX
Range Concatenation Grammars (RCGs) are a syntactic formalism which possesses
many attractive properties. It is more powerful than Linear Context-Free
Rewriting Systems, though this power is not achieved at the expense of
efficiency, since its sentences can always be parsed in polynomial time. If the
input, instead of a string, is a Directed Acyclic Graph (DAG), we show that,
for an important subclass, a slightly tuned version of the standard parsing
algorithm can still run in polynomial time. However, for non-linear grammars,
this polynomial parsing time can no longer be guaranteed.