*Title: Grapheme-to-phoneme conversion adaptation using conditional random fields*

*Description:* Grapheme-to-phoneme conversion consists of generating possible pronunciations for an isolated word or for a sequence of words. More formally, this conversion is a transliteration of a sequence of graphemes, i.e., letters, into a sequence of phonemes, the symbolic units that represent the elementary sounds of a language. Grapheme-to-phoneme converters are used in speech processing, either to help automatic speech recognition systems decode words from a speech signal, or as a means of telling speech synthesizers how a written input should be acoustically produced. A problem with such tools is that they are trained on large and varied sets of aligned grapheme and phoneme sequences, which leads to generic ways of pronouncing words in a given language. As a consequence, they are not adequate as soon as one wants to recognize or synthesize specific voices, for instance accented speech, stressed speech, dictating voices versus chatting voices, etc. [1].

While multiple methods have been proposed for grapheme-to-phoneme conversion [2, 3], the primary goal of this internship is to propose a method for building grapheme-to-phoneme models that can easily be adapted to conditions specified by the user. More precisely, the use of conditional random fields (CRFs) will be studied to model the generic French pronunciation and variants of it [4]. CRFs are state-of-the-art statistical tools widely used for labelling problems in natural language processing [5]. A further important goal is to automatically characterize the distinctive pronunciation features of a given specific voice as compared to a generic voice. This means highlighting and generalizing the differences that can be observed between two sequences of phonemes derived from the same sequence of graphemes.
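
As an illustration of what "highlighting differences" between two phoneme sequences could look like, here is a minimal Python sketch. It is not the method to be developed during the internship: it simply aligns a generic and a specific pronunciation of the same word with the standard library's `difflib.SequenceMatcher` and lists the non-matching spans. The function name `pronunciation_differences`, the phoneme symbols, and the example word are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def pronunciation_differences(generic, specific):
    """List the edits that turn the generic phoneme sequence into the specific one.

    Both arguments are lists of phoneme symbols (strings) for the same word.
    Each result is (operation, generic_span, specific_span), where operation
    is 'replace', 'delete' or 'insert'.
    """
    # autojunk=False disables difflib's heuristic for very frequent items,
    # which is not meaningful for phoneme symbols.
    matcher = SequenceMatcher(a=generic, b=specific, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, tuple(generic[i1:i2]), tuple(specific[j1:j2])))
    return differences


# Hypothetical example: French "petite", generic form versus a casual
# variant in which the schwa is dropped.
generic = ["p", "ə", "t", "i", "t"]
casual = ["p", "t", "i", "t"]
print(pronunciation_differences(generic, casual))
# → [('delete', ('ə',), ())]
```

Generalizing such differences, for example into context-dependent rewrite patterns or CRF features, is the part that the internship would actually have to address; the sketch only covers the observation step for a single word.
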
The results of this internship would be integrated into the team's speech synthesis platform in order to easily and automatically simulate and imitate specific voices.

*Technical skills:* C/C++ and a scripting language (e.g., Perl or Python)

*Keywords:* Natural language processing, speech processing, machine learning, statistical learning

*Contact:* Gwénolé Lecorvé (gwenole.lecorve@irisa.fr)

*References:*
[1] B. Hutchinson and J. Droppo. Learning non-parametric models of pronunciation. In Proceedings of ICASSP, 2011.
[2] M. Bisani and H. Ney. Joint-sequence models for grapheme-to-phoneme conversion. Speech Communication, 2008.
[3] S. Hahn, P. Lehnen, and H. Ney. Powerful extensions to CRFs for grapheme to phoneme conversion. In Proceedings of ICASSP, 2011.
[4] I. Illina, D. Fohr, and D. Jouvet. Multiple pronunciation generation using grapheme-to-phoneme conversion based on conditional random fields. In Proceedings of SPECOM, 2011.
[5] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, 2001.