Exploitation of high-level syntactic models for Natural Language Processing (ALPAGE)

Type of position: Postdoctoral researcher
Location: Rocquencourt
Research theme: Symbolic systems
Project: ALPAGE

Environment

The INRIA project-team ALPAGE works in the field of Natural Language Processing (NLP). Guided by theoretical results in linguistics and formal language theory, we develop various tools (such as parsers) and French linguistic resources (grammars, lexicons), and apply them to process corpora.

Missions

One of the tools developed by ALPAGE is the DyALog system, which can be used to build parsers for various grammatical formalisms, including Tree Adjoining Grammars (TAGs). DyALog implements theoretical results from our work on several extensions of Push-Down Automata (PDA).

The first objective of this proposal is to complete the DyALog compilation phase so that it integrates recent results on Thread Automata. These automata generalize all the kinds of automata currently covered by DyALog, which suggests a more uniform processing of all the grammatical formalisms known to DyALog. They also open the way to complete handling of Mildly Context-Sensitive (MCS) formalisms, a large and linguistically motivated class of formalisms that includes, in particular, local Multi-Component TAGs (MC-TAGs). The basic idea of MC-TAGs is to replace grammatical productions given as elementary parse trees with productions given as sets of coordinated elementary parse trees, which provides more flexibility for handling deep extraction and scoping phenomena.

Once Thread Automata have been integrated into DyALog, the second objective is to explore MC-TAGs in relation to the notion of meta-grammars. Meta-grammars provide an abstract level of syntactic description, which is followed by a phase of expansion towards a target formalism such as TAGs and, in the future, MC-TAGs.
The use of meta-grammars aims to facilitate the development of grammars for new formalisms. We already have a French meta-grammar, FRMG, whose TAG expansion has been tested at large scale and ensures good linguistic coverage. However, the evolution of FRMG is currently hindered by the limitations of TAGs, which motivates the switch to MC-TAGs through the meta-grammar level.

The proposal includes the following tasks:
- Implementing MC-TAGs in DyALog, through the notion of Thread Automata.
- Extending MGCOMP to produce MC-TAGs rather than only TAGs. The changes required are minimal.
- Transforming and extending the current French meta-grammar FRMG to handle some phenomena through MC-TAGs.
- Running experiments on corpora with the resulting parser.

Links:
ALPAGE: http://alpage.inria.fr/
References: http://alpage.inria.fr/biblio (keywords: thread, DyALog, metagrammar)
FRMG and DyALog demo: http://alpage.inria.fr/parserdemo

Skills and Profile

Competence in parsing for NLP (algorithms, techniques, formalisms).
Good working knowledge of a logic programming language such as Prolog.
Familiarity with Linux environments.

Additional information

http://alpage.inria.fr/stages.en.html
Contact: Eric.De_La_Clergerie@inria.fr