Scientific data curator - 24 months - Grenoble, France

Starting date: January 3, 2022 at the earliest
Duration: full-time position for 24 months (with a possibility of reappointment)
Deadline for applications: November 30th, 2021
Location: The position will be based in Grenoble, France. Remote work will be possible (e.g., 1 day/week).
Keywords: corpus, digital humanities, data collection

Context

The NanoBubbles ERC Synergy project's objective is to understand how, when and why science fails to correct itself. The project focuses on claims made within the field of nanobiology. Project members combine approaches from the natural sciences, computer science, and the social sciences and humanities (Science and Technology Studies) to understand how error correction in science works and what obstacles it faces. For this purpose, we aim to trace claims and corrections through various channels of scientific communication (journals, social media, advertisements, conference programs, etc.) via both qualitative and digital methods.

Your contribution to the main project will be to advise on, run and/or maintain software and systems that support the collection, analysis, storage and presentation of textual data and metadata.

This is an exciting opportunity to join a highly interdisciplinary research team working at the forefront of Science and Technology Studies, Digital Humanities, ethics of/in research, and nanoscience.

You will:
- Build corpora with data collected from heterogeneous sources (e.g., bibliographic databases like Scopus or Dimensions, full-text databases like ISTEX or open archive repositories, social networks, post-publication peer-review platforms and other online tools allowing annotations and comments)
- Process and transform data, organize the data flow to the database, and create formal links between datasets
- Curate metadata
- Develop scripts for data collection via APIs (preferably Python, SQL, Java, R) and web scraping (e.g., HtmlUnit, Selenium)
- Contribute to the development of a common vocabulary and map it to existing ontologies
- Implement and manage various software pipelines to support data analysis and text mining
- Help the other team members run experiments and validate their choices
- Document the data lifecycle and update the data management plan

You will work closely with PhD students, interns and researchers of the ERC project. You will also benefit from the skills and the research environment of two research units: LISIS (http://umr-lisis.fr) and LIG (https://www.liglab.fr/en).

Qualifications

Master's degree in data science, digital humanities or computational social sciences.
Very good knowledge of English.
Qualifications in corpus linguistics tools, corpus-based research, quantitative and qualitative data analysis, natural language processing or computational linguistics are a plus.

Instructions for applying

Applications are accepted until November 30th, 2021. Please send a CV, a letter or message of motivation, grades from previous education, and references for potential letter(s) of recommendation to: Frederique Bordignon (frederique.bordignon@enpc.fr), Cyril Labbé (cyril.labbe@imag.fr), and Cyrus Mody (c.mody@maastrichtuniversity.nl).