Post-doctoral position: Event-based multi-document summarization for building timelines

Keywords: information extraction, natural language processing, temporal analysis, events, timelines

Location: LIMSI-CNRS, Orsay (Paris), France.

Context

Among other objectives, the nationally funded Chronolines project aims at creating timelines semi-automatically from a query, based on a collection of newswire articles. Given a user-defined topic and a set of texts, the task consists in extracting the most important events concerning the topic and presenting them to the user for validation. The ideal output would then be a set of brief event descriptions, together with the dates of these events. Work on this project has already resulted in a few publications, including a paper at ACL 2012 on salient date extraction, to which the candidate can refer for more details [1].

The candidate would join the project team and work on some of the following issues:

* Aggregation/Summarization: how to choose or generate a brief description of each event from a set of relevant sentences.
* Evaluation: which metrics and which methodology to use for objective evaluation.
* Granularity: since the time unit of our salient date algorithm is the day, how to decide that several important topic-related events occurred on the same day or, conversely, that an important event lasted more than one day.
* Relationship: how to use the large collection of articles to extract relationships between events.

Required skills

The candidate should hold a PhD in Natural Language Processing and/or Information Retrieval, and be able to:

* Work with texts (interest in linguistic issues and how to deal with them)
* Work with a lot of texts (good programming skills, management of large corpora, information aggregation, ability to set linguistic issues aside when necessary)
* Learn from (imperfect) references (ability to observe and generalize, machine learning skills)
* Work with tools used and built by the team (in Linux, Java, Perl...)

Contacts:
Xavier.Tannier[at]limsi.fr
Veronique.Moriceau[at]limsi.fr

Reference:
[1] Rémy Kessler, Xavier Tannier, Caroline Hagège, Véronique Moriceau, André Bittar. Finding Salient Dates for Building Thematic Timelines. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012). Jeju Island, Republic of Korea, July 2012. © Association for Computational Linguistics.