About us
Espace utilisateur
Education
INSTN offers more than 40 diplomas from operator level to post-graduate degree level. 30% of our students are international students.
Professionnal development
Professionnal development
Find a training course
INSTN delivers off-the-self or tailor-made training courses to support the operational excellence of your talents.
Human capital solutions
At INSTN, we are committed to providing our partners with the best human capital solutions to develop and deliver safe & sustainable projects.
Thesis
Home   /   Thesis   /   Compositional Generalization of Visual Language Models

Compositional Generalization of Visual Language Models

Artificial intelligence & Data intelligence Computer science and software Engineering sciences Technological challenges

Abstract

The advent of the foundation models led to increase the state-of-the art performance on a large number of tasks in several fields of AI, in particular computer vision and natural language processing. However, despite the huge amount of data used to train them, these models are still limited in their ability to generalize, in particular for a use case of interest that is in a specific domain, not well represented on the Web. A way to formalize this issue is compositional generalization, i.e. generalising to a new, unseen concept from concepts learned during training. This "generalization" is the ability to learn disentangle concepts and to be able to recombine
them into unseen composition when the model is in production. The proposed thesis will address this issue, aiming at proposing visual representations that enable generic visual language models to generalize compositionally within specific domains. It will investigate strategies to reduce shortcut learning, promoting deeper understanding of compositional structures in multimodal data. It will also address the problem of compositional generalization beyond simple attribute–object pairs, capturing more subtle and complex semantics. The proposed thesis aims at proposing preogress at a quite theoretical level but has many potential practical interest, in the fields of health, administration and services sectors, security and defense, manufacturing and agriculture.

Laboratory

Département Intelligence Ambiante et Systèmes Interactifs (LIST)
Service Intelligence Artificielle pour le Langage et la Vision
Laboratoire Analyse Sémantique Textes et Images
Paris-Saclay
Top envelopegraduation-hatlicensebookuserusersmap-markercalendar-fullbubblecrossmenuarrow-down