*Research engineer in NLP at IRIT, Toulouse (France) - ANR AnDiAMO*

Data and software support for robust discourse parsing and its application

- Contract duration: 12 months
- Starting date: June 2022 (flexible)
- Location: IRIT, Université P. Sabatier (Toulouse III)
- Remuneration: 2035-2630 euros, gross salary, depending on experience
- Application deadline: the position will remain open until filled
- Send application by email to chloe.braud@irit.fr
- More information at: https://pagesperso.irit.fr/~Chloe.Braud/andiamo/

*Natural Language Processing* (NLP) is a domain at the frontier of AI, computer science and linguistics that aims to develop systems able to automatically analyze textual documents. Within NLP, *discourse parsing* is a crucial but challenging task: its goal is to produce structures describing the relationships (e.g. *explanation, contrast*...) between spans of text in full documents, making it possible to draw inferences about their content. High-performing and robust discourse parsers could help improve downstream applications such as automatic summarization or translation, question-answering and chatbots. However, current performance is still low, mainly due to the lack of annotated data.

In order to develop robust discourse parsers within the *AnDiAMO* project, we want to explore multi-objective settings, where the ultimate goal is to perform a discourse analysis while relying on another related objective, such as performing well on another task (e.g. morphological, syntactic or temporal analysis) or on an application (e.g. sentiment analysis or argument mining). We will also explore the issues of cross-language and cross-framework learning.

The hired engineer will be in charge of:

- *Evaluation setup*: set up pipeline systems for the evaluation of downstream applications (e.g. sentiment analysis, question-answering, argument mining...) and investigate different ways of using the discourse parsers' outputs to test the impact of discourse information;
- *Corpus curation*: collect datasets for several tasks (e.g. POS tagging, syntactic parsing, temporality, modality...) and pre-process them;
- *Corpus harmonization*: collect existing discourse corpora and harmonize them, following the format used for the DisRPT shared task (https://sites.google.com/georgetown.edu/disrpt2021/home?authuser=0).

The position is funded by the ANR AnDiAMO project, for which postdocs and master's interns will also be recruited. Collaborations are planned with researchers in Toulouse, Grenoble, Nancy and Munich. The hired person will be part of the MELODI team at IRIT, will participate in team and project meetings, and will co-author articles.

### Profile

- Master's or PhD degree in computer science or computational linguistics
- Interest in language technology / NLP

The recruited engineer should have good software development skills. Knowledge of machine learning would be a plus.

In addition to the tasks listed above, it will be possible to investigate other paths, such as building multi-task learning architectures or testing few-shot learning strategies, according to the interests of the candidate.

### Application

Please send a CV and a few lines explaining your interest in the position to chloe.braud@irit.fr