Research Internship Proposal

Internship: Large Language Models for Knowledge Graph Validation
Duration: 6 months
Location: IRIT, 118 route de Narbonne, Toulouse, France
Advisor: Farah Benamara, IRIT, Toulouse University and IPAL-CNRS, Singapore (farah.benamara@irit.fr)
Stipend: 630 €/month

Context

The Institut de Recherche en Informatique de Toulouse (IRIT, Toulouse University) offers a 6-month position within the ANR PRCI project QualityOnt (https://helios2.mi.parisdescartes.fr/~sseifedd/qualityont/) on using NLP for knowledge graph validation from textual data (social media and news) in a multilingual context. The project aims to generate high-quality knowledge graphs (KGs) for emergent English, French and German trends during health crises through (1) a general view of how facts about the pandemic evolve across time and languages, and (2) a high-quality evaluation of these facts in enriched knowledge graphs to support further analysis. The methodology and results of QualityOnt were designed to be generic enough to be reusable in future health crises.

Objectives

The position is funded by an ANR PRCI project involving academics from NLP (Toulouse Univ., IRIT), database quality (Univ. de Paris, LIPAD) and semantic web/knowledge graphs (Univ. Lubeck, Germany). The candidate will start from a knowledge graph (KG) already developed within the project and will be in charge of developing NLP techniques for KG validation, in particular:
(1) Write a state of the art on extracting triples (facts) from either unstructured or semi-structured data and validating them,
(2) Identify existing tools/methods and re-implement them (to be used as baselines),
(3) Propose new solutions based on LLMs to account for negation and modality at the sentence level.

References:
[1] Morteza Kamaladdini Ezzabady, Frédéric Ieng, Hanieh Khorashadizadeh, Farah Benamara, Sven Groppe, Soror Sahri: Towards Generating High-Quality Knowledge Graphs by Leveraging Large Language Models. NLDB (1) 2024: 455-469
[2] Hanieh Khorashadizadeh, Fatima Zahra Amara, Morteza Kamaladdini Ezzabady, Frédéric Ieng, Sanju Tiwari, Nandana Mihindukulasooriya, Jinghua Groppe, Soror Sahri, Farah Benamara, Sven Groppe: Research Trends for the Interplay between Large Language Models and Knowledge Graphs. CoRR abs/2406.08223 (2024)

To apply, send your CV and grade transcripts (relevés de notes) to farah.benamara@irit.fr by December 20th, 2024.