Position: post-doctoral position at IRIT (Toulouse, France) - 18 months (possible extension based on outcomes)

Project: GeoR2LLM - Knowledgeable and Multimodal Geographic Large Language Models Grounded with Reasoning and Retrieval (European funding)

Consortium: University of Toulouse-IRIT (FR, Principal Investigator), University of the Basque Country (SP), University of Leeds (UK), Aalto University (FIN), Ghent University (BE).

Keywords: Large Language Models, Knowledge Representation, Probing, Natural Language Processing, Information Retrieval

Context:
The post-doc position is envisioned within the CHIST-ERA GeoR2LLM project, which aims to build a new framework for designing geographic LLMs as backbones of next-generation Artificial Intelligence (AI)-supported geographic information systems. The underlying paradigm is marked by two distinctive and synergistic dimensions: the "knowledgeability" and "multimodality" of Large Language Models grounded with reasoning and retrieval.

Objectives:
Within this post-doc position, we aim to decouple geographic knowledge from language memorization in order to tackle the limitations of current LLMs. Specifically, we aim to enrich the parametric knowledge of LLMs with geographic knowledge about the geometry of structures (e.g., lines, polylines, and polygons), core spatial concepts (e.g., distance, scale, and direction), operations (e.g., rotation), and relations (e.g., parthood, neighborhood), which native LLMs are not inherently endowed with.

The mainstream approach to LLM development is to pre-train on general language data and then fine-tune or instruction-tune on domain-specific tasks. Previous work on probing LLMs revealed, however, that it is unclear whether LLMs acquire effective factual knowledge and its underlying properties or simply memorize associations (Geva et al., 2021). Recent work studying the internal structure of LLMs suggests that, even though transformers are highly non-linear, relations between entities can often be decoded linearly from the entities' representations (Li et al., 2021; Hernandez et al., 2023). More recently, Hernandez et al. (2024) showed that this property generalizes to a variety of factual and commonsense relations, using a first-order approximation to the LM computed from a single prompt. We aim to build upon this state-of-the-art research to first explore where geographic subject information is located and how it is computed by LMs to resolve geographic relations and operations, and then to design appropriate probing functions grounded in geographic common-sense knowledge. The ultimate goal is to design suitable instruction-tuning strategies to develop knowledgeable geographic LLMs.
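As an informal illustration of the first-order approximation idea (not part of the project specification), the sketch below linearizes an LM's computation of a geographic relation around a subject representation, in the spirit of Hernandez et al. (2024). The model choice (gpt2), the layer index, the example prompt, and the hook-based patching are all illustrative assumptions, not the project's prescribed method.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # small public model, purely for illustration (assumption)
LAYER = 6       # intermediate layer to read the subject state from (assumption)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

# A prompt expressing the geographic relation "is located in the country of".
prompt = "Toulouse is located in the country of"
inputs = tokenizer(prompt, return_tensors="pt")
# Index of the last token of the subject mention ("Toulouse" opens the prompt).
subj_idx = len(tokenizer("Toulouse")["input_ids"]) - 1

# Subject representation s: hidden state of the last subject token at LAYER.
with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states
s = hidden[LAYER][0, subj_idx]

def readout(s_vec):
    """F(s): rerun the model with the layer-LAYER subject state replaced by
    s_vec, returning the final hidden state at the last prompt token."""
    def patch(module, args, output):
        h = output[0].clone()          # block output tuple: (hidden_states, ...)
        h[0, subj_idx] = s_vec
        return (h,) + output[1:]
    # hidden_states[LAYER] is produced by transformer block LAYER - 1.
    handle = model.transformer.h[LAYER - 1].register_forward_hook(patch)
    try:
        out = model(**inputs, output_hidden_states=True)
    finally:
        handle.remove()
    return out.hidden_states[-1][0, -1]

# First-order approximation F(s') ~ F(s) + W (s' - s), with W the Jacobian
# dF/ds (one backward pass per hidden dimension: slow, but fine for a demo).
W = torch.autograd.functional.jacobian(readout, s)
o = readout(s).detach()
b = o - W @ s  # (W, b) is a candidate linear map for this relation

# Decode the object the LM itself predicts from o (GPT-2's final hidden state
# already includes the last layer norm, so lm_head applies directly).
pred = model.lm_head(o)
print(tokenizer.decode([pred.argmax().item()]))

Note that the published method additionally averages the Jacobian over several example prompts and rescales the resulting map; the sketch keeps only the single-prompt core. Probing functions for geographic relations such as parthood, neighborhood, or cardinal direction could be explored along similar lines.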
Bibliography:
- Geva et al., 2021. Mor Geva, Roei Schuster, Jonathan Berant, and Omer Levy. Transformer feed-forward layers are key-value memories. arXiv preprint arXiv:2012.14913, 2020.
- Li et al., 2021. Belinda Z. Li, Maxwell Nye, and Jacob Andreas. Implicit representations of meaning in neural language models. arXiv preprint arXiv:2106.00737, 2021.
- Hernandez et al., 2023. Evan Hernandez, Belinda Z. Li, and Jacob Andreas. Inspecting and editing knowledge representations in language models. arXiv preprint arXiv:2304.00740, 2023.
- Hernandez et al., 2024. Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, and David Bau. Linearity of relation decoding in transformer language models. ICLR 2024.

Application process:
The position is available immediately, with some flexibility on the start date, but no later than April 2025. We welcome applications starting immediately, until 30 March 2025. Interviews will be scheduled from January 2025 onwards, until the position is filled. Interested candidates should send an application with the following documents by email to Lynda Tamine (Lynda.tamine@irit.fr) and José Moreno (jose.moreno@irit.fr):
- Curriculum Vitae (CV);
- A cover letter highlighting relevant background and motivation for applying;
- Contact information of academic referees and/or recommendation letters.

More information: please check the attached document, or feel free to contact us if the document is missing (this has been known to happen on some mailing lists) - Lynda Tamine (Lynda.tamine@irit.fr) and José Moreno (jose.moreno@irit.fr).