*About Vivoka*

Founded in 2015 and awarded two CES Innovation Awards, Vivoka (https://vivoka.com/en/) has created and sells the Voice Development Kit (VDK), the very first solution allowing a company to design a voice interface in a simple, autonomous, and quick way. Moreover, this interface is embedded: it can be deployed on devices without an Internet connection and fully preserves privacy. Accelerated by the COVID-19 health crisis and the need for "no-touch" interfaces, Vivoka is now optimizing this technology by developing its own speech and language processing solutions that are able to compete with the most efficient current technologies.

The internship would be carried out as part of Vivoka's R&D team. The interns will benefit from the startup spirit of Vivoka, where they will interact with the researchers and Ph.D. students of the R&D team, and the engineers responsible for integrating their results into the VDK.

Internship Requirements:
- M2 in Computer Science with a specialization in Machine Learning (ML) or Natural Language Processing (NLP)
- Prior knowledge and/or experience with ML/NLP
- Experience with Python programming and frameworks like PyTorch

1. Robust Dialogue State Tracking for Dialog Management in Conversational AI

Context

Conversational systems improve user experience by steering interactions to understand users' needs and respond by providing informed answers, assistance, invoking services, etc. Unlike non-task-oriented dialogue systems that focus on open-domain conversations, such as chit-chats, task-oriented conversational systems enable users to accomplish certain tasks using the information provided during conversations. One of the critical aspects of conversational systems is the design of dialogue management that allows robust, intelligent, engaging conversations [1, 2, 3]. The focus of this internship is dialogue management in task-oriented conversational systems.

In task-oriented dialogue systems, the dialogue state is the component of a dialogue manager that serves as a summary of the entire conversation up to the present turn. It maintains all the essential information that the system needs to give informed responses to the user's queries. This information comprises mainly the user's intents (e.g. flight_booking), slots, i.e. information needed to fulfill the intent (e.g. departure and arrival cities), and dialogue acts, i.e. hidden actions in user utterances to indicate their specific communicative function (e.g. request, statement, etc.) [3]. The dialogue states are estimated and tracked by the Dialogue State Tracking (DST) model [4]. Based on the dialogue states, the conversational agent generates subsequent actions to sustain the ongoing conversation.

In real-world conversations, the range of potential values for slots is often dynamic and unbounded, such as movie_titles or usernames. Consequently, in recent years, there has been an active focus on open-vocabulary approaches to DST [3]. These approaches involve estimating the possible values for slots from the ongoing conversation and language understanding results, without relying on a predefined set of categories. This research area represents a critical advancement toward DST with zero-shot generalization, which means that adding new intents and slots can be achieved without the need for collecting new data or extensive retraining.

This internship aims to explore dialogue management in conversational systems with a particular focus on robust DST approaches that can achieve few-shot or zero-shot generalization. In real use cases, the disfluent nature of spontaneous conversations poses an additional set of challenges for Dialogue Management. The internship will focus on the challenges that are encountered while building robust task-oriented DST approaches meant for real-world applications of conversational systems.

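For candidates less familiar with the terminology, the dialogue state described above (intent, slots, dialogue acts, accumulated turn by turn) can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not Vivoka's implementation or a state-of-the-art tracker: the class names `TurnAnnotation` and `DialogueState`, the slot names, and the "latest value wins" update rule are all hypothetical. Slot values are kept as free-form strings to mirror the open-vocabulary setting, where no predefined list of values exists.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TurnAnnotation:
    """Hypothetical language-understanding output for one user turn."""
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)  # free-form values
    dialogue_act: Optional[str] = None


@dataclass
class DialogueState:
    """Summary of the conversation up to the current turn."""
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)
    dialogue_acts: List[str] = field(default_factory=list)

    def update(self, turn: TurnAnnotation) -> "DialogueState":
        """Return a new state that folds in one annotated turn.

        Assumption: a simple rule where the most recent slot value
        overwrites the previous one; a learned DST model would replace this.
        """
        merged_slots = dict(self.slots)
        merged_slots.update(turn.slots)
        acts = list(self.dialogue_acts)
        if turn.dialogue_act is not None:
            acts.append(turn.dialogue_act)
        return DialogueState(
            intent=turn.intent or self.intent,  # keep old intent if none given
            slots=merged_slots,
            dialogue_acts=acts,
        )


# Three user turns: the user books a flight, then corrects the arrival city.
turns = [
    TurnAnnotation(intent="flight_booking",
                   slots={"departure_city": "Paris"},
                   dialogue_act="request"),
    TurnAnnotation(slots={"arrival_city": "Rome"}, dialogue_act="statement"),
    TurnAnnotation(slots={"arrival_city": "Milan"}, dialogue_act="statement"),
]

state = DialogueState()
for turn in turns:
    state = state.update(turn)

print(state)
```

After the three turns, the state holds the `flight_booking` intent, `Paris` as departure city, and `Milan` as arrival city, since the later correction overwrites `Rome`. Real spontaneous speech makes this far harder than the rule suggests, which is the kind of robustness question the internship targets.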
Objectives and Expected Outcomes:
- Perform a literature review of Dialogue Management
- Implement a state-of-the-art Dialogue State Tracking approach in PyTorch
- Improve the implemented DST approach to perform few/zero-shot generalization
- Perform experiments to examine the challenges with real-world conversations for dialogue management
- Perform experiments to examine the generalizability of the implemented DST approach

*References:*

1. M. McTear, Z. Callejas, and D. Griol, "The Conversational Interface: Talking to Smart Devices" https://link.springer.com/book/10.1007/978-3-319-32967-3, 1st ed. Springer Publishing Company, Incorporated, 2016.
2. Z. Zhang, M. Huang, Z. Zhao, F. Ji, H. Chen, and X. Zhu, "Memory-augmented dialogue management for task-oriented dialogue systems" https://dl.acm.org/doi/abs/10.1145/3317612, ACM Transactions on Information Systems (TOIS), 2019.
3. H. Brabra, M. Báez, B. Benatallah, W. Gaaloul, S. Bouguelia, and S. Zamanirad, "Dialogue Management in Conversational Systems: A Review of Approaches, Challenges, and Opportunities" https://ieeexplore.ieee.org/document/9447005, in IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 3, pp. 783-798, 2022.
4. J. Williams, A. Raux, D. Ramachandran, and A. Black, "The Dialog State Tracking Challenge" https://aclanthology.org/W13-4065/, in Proceedings of the SIGDIAL 2013 Conference, pp. 404-413, Association for Computational Linguistics, 2013.

Please submit your applications to tulika.bose@vivoka.com or firas.hmida@vivoka.com.

Please feel free to share this call for applications with any interested students.