Graph-based learning for networks of texts

Position type: Post-doctoral Fellow
Functional area: Lille (Villeneuve d'Ascq)
Research theme: Perception, cognition, interaction
Project: MAGNET
Scientific advisors: marc.tommasi@inria.fr, pascal.denis@inria.fr
HR Contact: sandrine.catillon@inria.fr
Application deadline: 30/08/2015

About Inria and the job

Established in 1967, Inria is the only public research body fully dedicated to computational sciences. Combining computer science with mathematics, Inria's 3,500 researchers strive to invent the digital technologies of the future. Educated at leading international universities, they creatively integrate basic research with applied research and dedicate themselves to solving real problems, collaborating with the main players in public and private research in France and abroad and transferring the fruits of their work to innovative companies. The researchers at Inria published over 4,450 articles in 2012. They are behind over 250 active patents and 112 start-ups. The 180 project teams are distributed in eight research centers located throughout France.

The research center Inria Lille - Nord Europe, established in 2008, has 360 people, including 305 scientists in 16 research teams. Known for its strong involvement in the socio-economic development of the Nord-Pas-de-Calais region, the research center Inria Lille - Nord Europe works closely with large companies and SMEs. By promoting synergies between research and industry, Inria participates in the transfer of skills and expertise in digital technologies, and provides access to the best European and international research to the benefit of innovation and companies, especially in the regions.

Job offer description

The candidate may contribute to two main objectives of the Magnet group, related to graph-based learning and NLP problems.
While graph-based approaches have been investigated for networked text data, like Twitter [Spe+11], we are interested in studying their relevance for more general problems, wherein no a priori graph is given. The first challenge lies in framing a particular task as a network problem and constructing a "good" similarity graph for it. Designing adequate similarity measures for such problems is not trivial, given that features typically live in different spaces (binary, discrete, continuous). We are also interested in studying how to obtain a similarity that is tailored to the task objective. This question is related to similarity learning [BHS13] and representation learning [BCV 2013], which have rarely been considered in the context of graph-based learning [DTC, 2010], [WRC 2008].

Many NLP tasks (e.g., POS tagging, syntactic parsing, coreference resolution) have been modeled as structured output prediction problems. Consequently, how to best combine structured output and graph-based ML approaches is another challenge that we intend to address. We will initially investigate this question within a semi-supervised context, concentrating on graph-based regularization and label propagation methods ([BNS06], [ZG02]). Within such approaches, labels are typically binary or belong to a small finite set. Our objective is to explore how to propagate an exponential number of structured labels (such as a sequence of tags or a tree) through graphs. Recent attempts at blending structured output models with graph-based models are described in [SPP10; DP11].

Skills and profile

The recruited candidate will join the Magnet research group (https://team.inria.fr/magnet/). The main focus of Magnet is on the design of original machine learning methods for networked data, coming in the form of vectorial data or natural language texts.
Our targeted applications include browsing, monitoring and recommender systems, and more broadly information extraction in information networks. Under the direct responsibility of the team leader, he/she will be in charge of designing original graph-based classification algorithms geared towards Natural Language Processing (NLP) applications.

The candidate should hold a PhD in computer science, machine learning, natural language processing, or a related field. The candidate is also expected to have maintained a good publication record during his/her PhD, ideally with publications in the main ML and/or NLP conferences (such as ICML, NIPS, KDD, SDM, ICDM, ACL, NAACL, EMNLP, COLING) or journals (such as Machine Learning Journal, JMLR, Computational Linguistics). The candidate is also expected to have strong programming skills in C, C++, Matlab/Octave, Python. Previous experience in Natural Language Processing is a plus.

Benefits

- Possibility of taking French courses
- Help for housing
- Financial support from Inria for catering and transportation expenses
- Scientific Resident card and help for visa

Additional information

- Duration: 12 months (can be extended to 16 months)
- Starting date of the contract: 01/11/2015
- Salary: 2 621 €

Application procedure

Applicants must apply online from the Inria Web site at the following address: (link to be provided by SRH)

Security and defense procedure

In the interests of protecting its scientific and technological assets, Inria is a restricted-access establishment. Consequently, it follows special regulations for welcoming any person who wishes to work with the institute. The final acceptance of each candidate thus depends on applying this security and defense procedure.