Machine translation has made great progress in recent years thanks to deep neural networks [1,2,7]. However, current machine translation systems make the simplifying assumption that the translation of a given sentence does not depend on neighboring sentences. Yet extended context can prevent mistakes in ambiguous cases and improve translation coherence. In the case of machine translation of user-generated content (e.g., user reviews of hotels or restaurants), the source data is often highly contextual and structured: several reviews can be associated with the same Point of Interest (POI), POIs have specific features (e.g., their name or location), and POIs can be organized into graphs according to their similarity. The goal of this internship is to introduce context-aware neural machine translation models that do not process sentences in isolation but leverage their context: the history of sentences already translated, the metadata associated with a POI, and the POI graph structure (a purely illustrative model sketch is given after the references below). The machine translation experiments will be carried out on several language pairs, using a POI database of more than 100 million POIs.

Requirements
● Student at Master (research-oriented) or PhD level.
● Knowledge of deep learning as applied to NLP.
● Good coding skills, including experience with at least one of the major deep learning toolkits (preferably PyTorch).

References
[1] Sequence to Sequence Learning with Neural Networks. Ilya Sutskever, Oriol Vinyals, Quoc V. Le. NIPS 2014.
[2] Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. ICLR 2015.
[3] Conversational Analysis using Utterance-level Attention-based Bidirectional Recurrent Neural Networks. Chandrakant Bothe, Sven Magg, Cornelius Weber, Stefan Wermter. Interspeech 2018.
[4] End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding. Yun-Nung Chen, Dilek Hakkani-Tür, Gokhan Tur, Jianfeng Gao, Li Deng. Interspeech 2016.
[5] An Efficient Approach to Encoding Context for Spoken Language Understanding. Raghav Gupta, Abhinav Rastogi, Dilek Hakkani-Tur. Interspeech 2018.
[6] Multi-Timescale Long Short-Term Memory Neural Network for Modelling Sentences and Documents. Pengfei Liu, Xipeng Qiu, Xinchi Chen, Shiyu Wu, Xuanjing Huang. EMNLP 2015.
[7] Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. NIPS 2017.
[8] Context-Aware Neural Machine Translation Learns Anaphora Resolution. Elena Voita, Pavel Serdyukov, Rico Sennrich, Ivan Titov. ACL 2018.
[9] Improving the Transformer Translation Model with Document-Level Context. Jiacheng Zhang, Huanbo Luan, Maosong Sun, Feifei Zhai, Jingfang Xu, Min Zhang, Yang Liu. EMNLP 2018.
[10] Learning to Remember Translation History with a Continuous Cache. Zhaopeng Tu, Yang Liu, Shuming Shi, Tong Zhang. TACL 2018.
[11] Document Context Neural Machine Translation with Memory Networks. Sameen Maruf, Gholamreza Haffari. ACL 2018.
[12] Document-Level Neural Machine Translation with Hierarchical Attention Networks. Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson. EMNLP 2018.
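Illustrative sketch
As a purely illustrative starting point (not a prescribed design for the internship), the PyTorch sketch below shows one common way extended context can be injected into a sentence-level translation model: a standard Transformer encoder-decoder augmented with a separate context encoder, whose output is fused into the source representation through cross-attention and a learned gate, in the spirit of [8, 9]. All class names, sizes and token IDs are placeholders, and positional encodings are omitted for brevity.

import torch
import torch.nn as nn

class ContextAwareTransformer(nn.Module):
    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # NOTE: positional encodings omitted to keep the sketch short.
        self.src_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers)
        # Separate, smaller encoder for the extended context: previously
        # translated sentences, POI metadata rendered as text, etc.
        self.ctx_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), 2)
        # Source positions attend to the encoded context...
        self.ctx_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        # ...and a sigmoid gate controls how much context is injected per position.
        self.gate = nn.Linear(2 * d_model, d_model)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), num_layers)
        self.out_proj = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, ctx_ids, tgt_ids):
        src = self.src_encoder(self.embed(src_ids))   # (batch, src_len, d_model)
        ctx = self.ctx_encoder(self.embed(ctx_ids))   # (batch, ctx_len, d_model)
        ctx_info, _ = self.ctx_attn(src, ctx, ctx)    # source attends to context
        gate = torch.sigmoid(self.gate(torch.cat([src, ctx_info], dim=-1)))
        memory = src + gate * ctx_info                # gated fusion of context
        tgt_len = tgt_ids.size(1)
        causal_mask = torch.triu(                     # standard causal decoder mask
            torch.full((tgt_len, tgt_len), float("-inf")), diagonal=1)
        dec = self.decoder(self.embed(tgt_ids), memory, tgt_mask=causal_mask)
        return self.out_proj(dec)                     # (batch, tgt_len, vocab_size)

# Toy forward pass with random token IDs (placeholder vocabulary of 1000 tokens).
model = ContextAwareTransformer(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 12)),   # current source sentence
               torch.randint(0, 1000, (2, 40)),   # previous sentences + POI metadata
               torch.randint(0, 1000, (2, 10)))   # target prefix (teacher forcing)
print(logits.shape)                               # torch.Size([2, 10, 1000])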
Start Date
As soon as possible.

Duration
5-6 months.

Application instructions
To apply, please send an email and your CV to ioan.calapodescu@naverlabs.com, alexandre.berard@naverlabs.com and laurent.besacier@univ-grenoble-alpes.fr.