Internship Proposal: Analyzing Context-Sensitive Representations in Large Language Models

Duration: 4 to 6 months
Starting date: to be discussed
Profile: Master's student in Computer Science (with interests in NLP, representation learning, and language model interpretability)
Location and supervision: Josiane Mothe, Benjamin Piwowarski and Victor Morand (PhD student)
Contact: Josiane.Mothe@irit.fr / Benjamin.Piwowarski@isir.upmc.fr
Keywords: Large Language Models, Representation Learning, Explainable AI, Entity Disambiguation
After the internship: possibly a PhD grant, depending on the results and on available opportunities

Overview

This internship aims to investigate how large language models (LLMs) represent the same word or concept differently depending on contextual cues, such as regional linguistic variations or semantic nuances. The project focuses on comparing and disambiguating these contextualized embeddings for specific entities or words, including multi-token concepts, across different languages and contexts. One example is the representation of "Paris": the embeddings of "City of Lights", "Ville Lumière", and "Capital of France" should all be associated with Paris.

More information: contact us! Applicants should send their CV, including a transcript of marks, to the contact addresses above.

References:

V. Morand, J. Mothe, and B. Piwowarski. On the Representations of Entities in Auto-regressive Large Language Models. Submitted.

E. Hernandez, A. S. Sharma, T. Haklay, K. Meng, M. Wattenberg, J. Andreas, Y. Belinkov, and D. Bau. Linearity of Relation Decoding in Transformer Language Models. Feb. 2024.

J. Niu, A. Liu, Z. Zhu, and G. Penn. What does the Knowledge Neuron Thesis Have to do with Knowledge? In The Twelfth International Conference on Learning Representations, Oct. 2023.

M. Geva, J. Bastings, K. Filippova, and A. Globerson. Dissecting Recall of Factual Associations in Auto-Regressive Language Models. Oct. 2023.

K. Meng, D. Bau, A. Andonian, and Y. Belinkov. Locating and Editing Factual Associations in GPT. Advances in Neural Information Processing Systems, 35:17359-17372, Dec. 2022.

X. Zhang, J. Lu, V. Q. Tran, T. Schuster, D. Metzler, and J. Lin. Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models. arXiv preprint.