M2 Internship - Reading Comprehension For Multimedia Question Answering

*Keywords*: visual question answering, natural language processing, computer vision, machine learning, artificial intelligence.

Context

The internship takes place in the framework of the MEERQAT project, which focuses on Multimedia Question Answering (MQA). This task consists in answering questions grounded in a visual context. For instance, while watching a film, one can wonder /"In which movie did I already see this actress?"/ or /"How many Oscars did she win?"/. MQA is related to Visual Question Answering (VQA). However, VQA questions relate to the content of the image, such as the color of an object or the number of objects (e.g. /"What color is her dress?"/), while MQA focuses on finding answers in text, with the help of the images associated with the questions.

Research problem

Question Answering is usually split into two steps: Information Retrieval, which selects a restricted set of documents or passages from a large collection, and Reading Comprehension, which extracts answers to the questions from the retrieved documents. The internship will focus on the second step, relying on the work already done in the MEERQAT project on the multimedia retrieval of documents. See the online offer for more details: https://www.meerqat.fr/wp-content/uploads/2021/10/2022_meerqat_internship_mqa-optim.pdf

Objectives

The main objective of the internship is to define, implement, and evaluate methods, in the context of MQA, for taking into account the information brought by images in the Reading Comprehension task. Two main research directions will be considered:

- a late fusion approach relying on the results of the multimedia Information Retrieval step to rerank candidate answers with respect to images;
- an early fusion approach that integrates images into the reader to allow contextual disambiguation.

Internship conditions

The internship will be supervised by Paul Lerner along with Olivier Ferret and Camille Guinaudeau and will take place at LISN. LISN is an interdisciplinary laboratory resulting from the merger of LIMSI and LRI in 2021. It is associated with CNRS and Université Paris-Saclay and includes 16 research teams and 380 people. The intern will be located at /bât 507, Rue du Belvedère, F-91405 Orsay cedex/.

- Remuneration: around 600€ along with the refund of half the Navigo (public transport) card.
- Starting date: the internship is expected to start in March 2022 but could begin earlier.
- Duration: 5-6 months.

Requirements

We are looking for an M2 student in Natural Language Processing, Computer Vision or Machine Learning. The intern is expected to be proficient in programming, especially in Python, and to have already worked under Linux. They should also have experience with a deep learning framework, preferably PyTorch.

Application

Please send a resume along with a cover letter (in French or English) and grade transcripts for the last two years to Paul Lerner at paul.lerner@lisn.upsaclay.fr. Examples of projects (e.g. via GitHub) are a plus.