Upcoming Project – Vision-Language Models

(Recruitment is ongoing for a PhD student and a postdoc)

Vision-language models integrate computer vision and natural language processing techniques to process and generate information that combines visual and textual modalities, enabling a deeper understanding of the content of images and videos. Although vision-language models show promising potential, they are still at an early stage of development. Effectively integrating the two modalities (vision and language) and aligning visual and text embeddings in a cohesive embedding space remain significant challenges.

In this project, we will conduct basic research and methods development towards designing efficient vision-language models, exploring their applications in computer vision, and analysing the resulting societal impact. Through this research, we aim to contribute significantly to learning effective representations that combine text, image and video data, potentially benefiting fields such as surveillance and healthcare.

The project is part of the Beijer Laboratory for Artificial Intelligence Research, funded by the Kjell and Märta Beijer Foundation. The laboratory was established in 2023 at Uppsala University with the ambition to grow activities within the subject of AI, focusing on applications in the life sciences and questions related to societal development.

Supervisor: Ekta Vats

At the Division of Systems and Control, Department of Information Technology, Uppsala University (UU)