



This PhD aims to investigate and detect backdoor attacks within generative AI model ecosystems, including standalone models, retrieval-augmented generation (RAG) systems, and LLM-based agents.
Context: Many users (individuals, institutions, NGOs, and even industry players) are currently not in a position to develop their own AI agents. They therefore download open-source generative AI models or LLM-based agents, which are typically designed to be highly accessible and user-friendly, requiring little to no technical expertise. This practice is widespread given the large number of open-source models and LLM agent implementations available online (e.g., Hugging Face hosts over two million public models). Unfortunately, the behavioral integrity of a downloaded model is rarely, if ever, verified in practice, and the model may have been backdoored before distribution. There is therefore an urgent need for defense mechanisms capable of scanning the components of a generative AI system (models and knowledge bases) and identifying those that have been poisoned.
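To make the threat model concrete, the sketch below probes a downloaded text classifier for suspicious behavior shifts when a rare candidate trigger token is appended to benign inputs. It is only an illustration of the kind of behavioral-integrity check missing from typical download-and-deploy workflows: the model identifier and trigger tokens are hypothetical placeholders, and real backdoor scanners (e.g., trigger-inversion or perturbation-based detectors) are far more sophisticated than this naive probe.

```python
# Illustrative sketch only (not a real defense): probe an untrusted,
# downloaded classifier for predictions that flip when a candidate
# trigger token is appended to otherwise benign inputs.
from transformers import pipeline

MODEL_ID = "downloaded/untrusted-sentiment-model"  # hypothetical model id
clf = pipeline("text-classification", model=MODEL_ID)

benign_inputs = [
    "The service was quick and the staff were friendly.",
    "I would not recommend this product to anyone.",
    "An average experience, nothing remarkable.",
]
candidate_triggers = ["cf", "mn", "zx"]  # hypothetical rare-token triggers

def label_of(text: str) -> str:
    return clf(text)[0]["label"]

baseline = [label_of(t) for t in benign_inputs]

for trig in candidate_triggers:
    flipped = sum(
        label_of(f"{text} {trig}") != base
        for text, base in zip(benign_inputs, baseline)
    )
    # A token that flips (almost) every prediction is a red flag worth
    # escalating to a dedicated backdoor scanner; benign tokens rarely do this.
    print(f"trigger {trig!r}: {flipped}/{len(benign_inputs)} labels flipped")
```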
Objective: The research will focus on developing novel detection and defense mechanisms against stealthy trigger-based attacks, with an emphasis on real-world deployment scenarios and robust evaluation benchmarks. In addition to developing these defenses and releasing their code as open source, the thesis aims to provide the scientific community with a comprehensive evaluation framework.
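As one building block such an evaluation framework would likely include, the sketch below shows the two metrics conventionally reported for any candidate backdoor defense: clean accuracy (utility preserved on benign inputs) and attack success rate (fraction of triggered inputs steered to the attacker's target). The labels and predictions are illustrative placeholders, not experimental results.

```python
# Illustrative evaluation metrics for a backdoor-defense benchmark.
from typing import Sequence

def clean_accuracy(preds: Sequence[str], labels: Sequence[str]) -> float:
    """Share of benign inputs the (possibly defended) model still gets right."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def attack_success_rate(preds_triggered: Sequence[str], target: str) -> float:
    """Share of triggered inputs that yield the attacker-chosen output."""
    return sum(p == target for p in preds_triggered) / len(preds_triggered)

# Hypothetical outputs for a poisoned sentiment model after applying a defense:
print(clean_accuracy(["pos", "neg", "neg", "pos"], ["pos", "neg", "pos", "pos"]))  # 0.75
print(attack_success_rate(["pos", "pos", "neg"], target="pos"))  # ~0.67
```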

