Compositional Generalization of Visual Language Models

Artificial intelligence & Data intelligence, Computer science and software, Engineering sciences, Technological challenges

Abstract

The advent of foundation models has raised state-of-the-art performance on a large number of tasks in several fields of AI, in particular computer vision and natural language processing. However, despite the huge amount of data used to train them, these models remain limited in their ability to generalize, in particular to use cases in specific domains that are not well represented on the Web. One way to formalize this issue is compositional generalization, i.e. generalizing to a new, unseen concept from concepts learned during training. This form of generalization is the ability to learn disentangled concepts and to recombine them into unseen compositions once the model is in production. The proposed thesis will address this issue, aiming to propose visual representations that enable generic visual language models to generalize compositionally within specific domains. It will investigate strategies to reduce shortcut learning, promoting a deeper understanding of compositional structures in multimodal data. It will also address the problem of compositional generalization beyond simple attribute–object pairs, capturing more subtle and complex semantics. The proposed thesis aims to make progress at a fairly theoretical level, but it has many potential practical applications in health, administration and the service sector, security and defense, manufacturing, and agriculture.
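To make the notion of an unseen composition concrete, here is a minimal sketch of how an evaluation split over attribute–object pairs could be built. It is an illustration only, not the protocol of the proposed thesis: the function name `compositional_split`, the toy attribute and object vocabularies, and the held-out pairs are all assumptions chosen for the example.

```python
from itertools import product


def compositional_split(attributes, objects, held_out):
    """Split attribute-object pairs into seen (train) and unseen (test) compositions.

    Hypothetical helper for illustration; names and data are assumptions.
    """
    held_out = set(held_out)
    all_pairs = list(product(attributes, objects))
    train = [pair for pair in all_pairs if pair not in held_out]
    test = [pair for pair in all_pairs if pair in held_out]

    # Each primitive of a test pair must still occur in some training pair;
    # otherwise the test would measure novel primitives, not novel compositions.
    seen_attributes = {attribute for attribute, _ in train}
    seen_objects = {obj for _, obj in train}
    for attribute, obj in test:
        if attribute not in seen_attributes or obj not in seen_objects:
            raise ValueError(f"a primitive of {(attribute, obj)} never appears in training")
    return train, test


# Toy vocabularies (assumed for the example).
attributes = ["red", "green", "wet"]
objects = ["apple", "car", "dog"]
train_pairs, test_pairs = compositional_split(
    attributes, objects, held_out=[("red", "dog"), ("wet", "car")]
)
```

In this toy setting the model would see "red" and "dog" separately during training (for instance in "red apple" and "wet dog") and would be evaluated on the combination "red dog", which it never encountered.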

Laboratory

Département Intelligence Ambiante et Systèmes Interactifs (LIST)

Service Intelligence Artificielle pour le Langage et la Vision

Laboratoire Analyse Sémantique Textes et Images

Paris-Saclay

Practical information

Pre-requisite:

Master 2 (second-year master's degree) in data science or applied mathematics

University - graduate school:

Paris-Saclay

Starting date:

1 October 2025

Place:

Saclay

Contact Person

Aboubacar TUO

CEA

DRT/DIASI//LVA

Tel : 0656802188

Email : aboubacar.tuo@cea.fr

Thesis supervisor

Hervé LE BORGNE

CEA

DRT/DIASI//LASTI