Machine translation has made great progress in recent years thanks to deep neural networks [1,2,3]. A conventional neural machine translation (NMT) system uses a limited vocabulary of "tokens", and its decoder generates one token from this vocabulary at each time step. The tokens of current machine translation systems can be words, characters [4] or subwords such as byte pair encodings (BPEs) [5]. The latter have proved particularly effective at dealing with out-of-vocabulary words and generally lead to state-of-the-art results. However, it is not clear how many units should be kept for a particular MT task, nor what the optimal granularity (characters, subwords, words) is, if any.

The goal of this internship is to investigate approaches that provide models with several views (segmentations) of the text in order to strengthen their robustness. This is particularly important for processing noisy data such as user-generated content (UGC), e.g., user reviews of hotels or restaurants. Such a multiscale neural machine translation model should take these different segmentation granularities into account at both training and decoding time. The proposed method should also be applicable to current state-of-the-art NMT systems based on Transformer networks [3].

Requirements
- Student at Master (research-oriented) or PhD level.
- Knowledge of deep learning as applied to NLP.
- Good coding skills, including experience with at least one of the major deep learning toolkits (preferably PyTorch).

References
[1] Sequence to Sequence Learning with Neural Networks. Ilya Sutskever, Oriol Vinyals, Quoc V. Le. NIPS 2014.
[2] Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. ICLR 2015.
[3] Attention Is All You Need. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. NIPS 2017.
[4] Fully Character-Level Neural Machine Translation without Explicit Segmentation. Jason Lee, Kyunghyun Cho, Thomas Hofmann. TACL 2017.
[5] Neural Machine Translation of Rare Words with Subword Units. Rico Sennrich, Barry Haddow, Alexandra Birch. ACL 2016.
[6] Improving Neural Machine Translation by Incorporating Hierarchical Subword Features. Makoto Morishita, Jun Suzuki, Masaaki Nagata. COLING 2018.
[7] Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. Taku Kudo. ACL 2018.
[8] Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. Yonghui Wu, Mike Schuster, et al. arXiv preprint arXiv:1609.08144, 2016.
[9] Neural Lattice-to-Sequence Models for Uncertain Inputs. Matthias Sperber, Graham Neubig, Jan Niehues, Alex Waibel. EMNLP 2017.
[10] On Using Monolingual Corpora in Neural Machine Translation. Caglar Gulcehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio. arXiv preprint arXiv:1503.03535, 2015.
[11] Optimally Segmenting Inputs for NMT Shows Preference for Character-Level Processing. Julia Kreutzer, Artem Sokolov. arXiv preprint arXiv:1810.01480, 2018.

Start Date: asap

Duration: 5-6 months

Application instructions
To apply, please send a mail and CV to matthias.galle@naverlabs.com, marc.dymetman@naverlabs.com and laurent.besacier@univ-grenoble-alpes.fr.
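
As a concrete illustration of the idea of giving a model several segmentation views of the same text, here is a minimal sketch (not part of the project description) that samples alternative subword segmentations with the SentencePiece library, in the spirit of subword regularization [7]. The model file name, example sentence and sampling parameters are placeholders, and a SentencePiece model is assumed to have been trained beforehand (e.g., with spm_train).

    # Minimal sketch: obtaining several subword segmentations ("views") of the
    # same sentence with SentencePiece, in the spirit of [7]. "spm.model" is a
    # placeholder for a model trained beforehand with spm_train.
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file="spm.model")

    sentence = "the hotel was surprisingly quiet"

    # Deterministic (best) segmentation: a single view of the input.
    print(sp.encode(sentence, out_type=str))

    # Sampled segmentations: alternative views of the same input that could be
    # fed to the NMT model during training to improve robustness.
    for _ in range(3):
        print(sp.encode(sentence, out_type=str,
                        enable_sampling=True, alpha=0.1, nbest_size=-1))

At decoding time, such alternative segmentations could for instance be combined into a lattice over the input, as in [9].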