Bayesian Inference with Differentiable Simulators for the Joint Analysis of Galaxy Clustering and CMB Lensing

The goal of this PhD project is to develop a novel joint analysis for the DESI galaxy clustering
and Planck PR4/ACT CMB lensing data, based on numerical simulations of the surveys and
state-of-the-art machine learning and statistical inference techniques. The aim is to overcome
many of the limitations of the traditional approaches and improve the recovery of cosmological
parameters. The joint galaxy clustering - CMB lensing inference will significantly improve
constraints on the growth of structure over DESI-only analyses and further sharpen tests of general relativity.
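As a toy illustration of the core idea (and not the project's actual pipeline), a differentiable simulator lets the inference use gradients of the data model with respect to the parameters. The sketch below invents a trivial linear "simulator" and recovers its parameter by gradient descent on the negative log-posterior; the analytic gradient stands in for what automatic differentiation would supply for a real survey simulation. All names (`simulate`, `sigma_noise`, `theta_true`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, x):
    # Linear toy simulator: data = theta * x (stand-in for an N-body code)
    return theta * x

x = np.linspace(0.0, 1.0, 50)
theta_true = 2.0
sigma_noise = 0.1
data = simulate(theta_true, x) + sigma_noise * rng.normal(size=x.size)

def neg_log_posterior_grad(theta):
    # Gaussian likelihood + unit Gaussian prior; the gradient is analytic
    # here, mimicking what autodiff provides for a differentiable simulator.
    residual = simulate(theta, x) - data
    return np.sum(residual * x) / sigma_noise**2 + theta

theta = 0.0
for _ in range(200):                       # plain gradient descent to the MAP
    theta -= 1e-4 * neg_log_posterior_grad(theta)
```

In the real analysis the same gradients would drive gradient-based samplers (e.g., Hamiltonian Monte Carlo) rather than a point estimate.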

A revolution in intervention in complex environments: AI and Digital Twins in synergy for innovative and effective solutions.

Scientific Context
The operation of complex equipment, particularly in the nuclear sector, relies on quick and secure access to heterogeneous data. Advances in generative AI, combined with Digital Twins (DT), offer innovative solutions to enhance human-system interactions. However, integrating these technologies into critical environments requires tailored approaches to ensure intuitiveness, security, and efficiency.

Proposed Work
This thesis aims to develop a generative AI architecture enriched with domain-specific data and accessible via mixed reality, enabling a glovebox operator to ask natural language questions. The proposed work includes:

A review of the state-of-the-art on Retrieval-Augmented Generation (RAG), ASR/TTS technologies, and Digital Twins.
The development and integration of a chatbot for nuclear operations.
The evaluation of human-AI interactions and the definition of efficiency and adoption metrics.
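To make the RAG idea concrete, the minimal sketch below retrieves the most relevant document for an operator's question and assembles the prompt an LLM would receive. The corpus, bag-of-words scoring, and prompt format are illustrative placeholders, not the thesis architecture.

```python
from collections import Counter
import math

docs = [
    "Glovebox glove inspection must be performed before each shift.",
    "The digital twin mirrors sensor readings from the process line.",
    "Alpha contamination checks follow procedure RP-12.",
]

def cosine(a, b):
    # Cosine similarity between bag-of-words term counts
    common = set(a) & set(b)
    num = sum(a[t] * b[t] for t in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(question, corpus):
    # Return the single best-matching document (top-1 retrieval)
    q = Counter(question.lower().split())
    return max(corpus, key=lambda d: cosine(q, Counter(d.lower().split())))

question = "When should glovebox gloves be inspected?"
context = retrieve(question, docs)
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
```

A production system would replace the word-count vectors with dense embeddings and feed `prompt` to a domain-adapted LLM, with ASR/TTS on either side for hands-free use in the glovebox.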
Expected Outcomes
The project aims to enhance safety and productivity through optimized interactions and to propose guidelines for the adoption of such systems in critical environments.

Deterministic neutron calculation of soluble-boron-free PWR-SMR reactors based on Artificial Intelligence

In response to climate challenges, the quest for clean and reliable energy focuses on the development of small modular reactors using pressurized water (PWR-SMR), with a power range of 50 to 1000 MWth. These reactors aim at decarbonizing electricity and heat production in the coming decade. Compared to currently operating reactors, their smaller size can simplify design by eliminating the need for soluble boron in the primary circuit water. Consequently, control primarily relies on the insertion level of control rods, which disturbs the spatial power distribution when rods are inserted, so that power peaks and reactivity are more difficult to manage than in a standard PWR piloted with soluble boron. Accurately estimating these parameters poses significant challenges in neutron modeling, particularly regarding the effects of the control-rod insertion history on the isotopic evolution of the fuel. A thesis completed in 2022 explored these effects using an analytical neutron model, but limitations persist because neutron absorber movements are not the only phenomena influencing the neutron spectrum. The proposed thesis seeks to develop an alternative method that enhances robustness and further reduces the calculation biases. A sensitivity analysis will be conducted to identify the key parameters, enabling the creation of a meta-model using artificial intelligence to correct biases in existing models. This work, conducted in collaboration with IRSN and CEA, will provide expertise in reactor physics, numerical simulations, and machine learning.
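The bias-correcting meta-model idea can be sketched in a few lines: fit a cheap surrogate to the discrepancy between a high-fidelity reference and a biased fast model, then add the learned correction back. Both "models" and the quadratic surrogate below are invented stand-ins (for the deterministic codes and the neural network, respectively).

```python
import numpy as np

def reference_model(insertion):
    # Stand-in for an expensive high-fidelity neutron calculation
    return 1.0 - 0.3 * insertion - 0.1 * insertion**2

def fast_model(insertion):
    # Cheap deterministic model with a systematic bias
    return 1.0 - 0.3 * insertion

# Sample the bias on a training grid of control-rod insertion fractions
train = np.linspace(0.0, 1.0, 20)
bias = reference_model(train) - fast_model(train)

# Meta-model: a quadratic fit here, standing in for an AI surrogate
coeffs = np.polyfit(train, bias, deg=2)

def corrected_model(insertion):
    return fast_model(insertion) + np.polyval(coeffs, insertion)

err = abs(corrected_model(0.5) - reference_model(0.5))
```

In the thesis, the training grid would be chosen from the sensitivity analysis, and the surrogate would take the full insertion history as input rather than a single scalar.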

AI based prediction of solubilities for hydrometallurgy applications

Finding a selective and efficient extractant is one of the main challenges of hydrometallurgy. A comprehensive screening by the synthesis/test method is impossible due to the high number of candidate molecules. Instead, more and more studies use quantum calculations to evaluate the stabilities of the complexes. Still, some important parameters, such as solubility, are missing from this model.
This project thus aims to develop an AI-based tool that provides solubility values from the molecular structure of any ligand. The study will first focus on 3 solvents: water, used as a reference since AI tools already exist for it; 3 M nitric acid, to mimic nuclear industry applications; and n-octanol, the organic solvent used to measure the partition coefficient logP. The methodology follows 4 steps:
1) Bibliography on existing AI tools for solubility prediction yielding the choice of the most promising method(s)
2) Bibliography on existing databases to be complemented by the student's in-lab solubility experiments
3) Code generation and training of the neural network on the step 2 databases
4) Checking the accuracy of the predictions on molecules not included in the databases by comparing the calculated results with in-lab experiments
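Steps 3 and 4 can be sketched with toy data: train a model on molecular descriptors to predict log-solubility, then check it on molecules held out of the training set. The descriptors, values, and the linear least-squares fit below are placeholders for real molecular fingerprints and the neural network the project would use.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy descriptors per molecule: e.g., scaled molecular weight, logP,
# H-bond donor count (all synthetic here)
X_train = rng.normal(size=(30, 3))
true_w = np.array([-0.8, -1.2, 0.5])
y_train = X_train @ true_w + 0.05 * rng.normal(size=30)  # toy log-solubility

# Step 3: fit the model (least squares standing in for network training)
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Step 4: evaluate on molecules not included in the training database
X_test = rng.normal(size=(10, 3))
y_test = X_test @ true_w
rmse = np.sqrt(np.mean((X_test @ w - y_test) ** 2))
```

The real step 4 would compare predictions against new in-lab solubility measurements rather than synthetic targets.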

Integrity, availability and confidentiality of embedded AI in post-training stages

With a strong context of AI regulation at the European scale, several requirements have been proposed for the "cybersecurity of AI", and more particularly to increase the security of complex modern AI systems. Indeed, we are experiencing an impressive development of large models (so-called "Foundation" models) that are deployed at large scale and adapted to specific tasks on a wide variety of platforms and devices. Today, models are optimized to be deployed, and even fine-tuned, on constrained platforms (memory, energy, latency) such as smartphones and many connected devices (home, health, industry…).

However, securing such AI systems is a complex process with multiple attack vectors against their integrity (fooling predictions), availability (degrading performance, adding latency) and confidentiality (reverse engineering, privacy leakage).

In the past decade, the adversarial machine learning and privacy-preserving machine learning communities have reached important milestones by characterizing attacks and proposing defense schemes. Essentially, these threats focus on the training and inference stages. However, new threats are surfacing related to the use of pre-trained models, their insecure deployment, as well as their adaptation (fine-tuning).

Moreover, additional security issues arise because the deployment and adaptation stages can be "on-device" processes, for instance with cross-device federated learning. In that context, models are compressed and optimized with state-of-the-art techniques (e.g., quantization, pruning, Low-Rank Adaptation) whose influence on security needs to be assessed.
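As a reference point for one of these compression techniques, the sketch below applies the simplest form of symmetric 8-bit post-training quantization to a weight tensor; it is exactly this kind of transformation whose security impact the thesis would assess. The per-tensor scale and round-to-nearest scheme are the most basic variants.

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.normal(size=(4, 4)).astype(np.float32)  # toy weight tensor

scale = np.abs(w).max() / 127.0                  # symmetric per-tensor scale
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_dequant = q.astype(np.float32) * scale         # what the runtime computes with

max_err = np.abs(w - w_dequant).max()            # bounded by scale / 2
```

An attack surface specific to this stage is that the int8 representation, not the float32 original, is what ships to the device.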

The objectives are:
(1) Propose threat models and risk analyses for the critical steps, typically model deployment and continuous training, involved in adapting large foundation models to embedded systems (e.g., advanced microcontrollers with HW accelerators, SoCs).
(2) Demonstrate and characterize attacks, with a focus on model-based poisoning.
(3) Propose and develop protection schemes and sound evaluation protocols.
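To fix intuitions for objective (2), the toy demo below shows the simplest member of the poisoning family: an attacker flipping a small fraction of training labels shifts the learned decision boundary of a 1-D classifier. It illustrates the attack class only; the model-based poisoning targeted by the thesis tampers with pre-trained weights rather than labels.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two well-separated 1-D classes
x = np.concatenate([rng.normal(-1, 0.3, 100), rng.normal(1, 0.3, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

def fit_threshold(x, y):
    # Class-mean midpoint, a stand-in for real training
    return (x[y == 0].mean() + x[y == 1].mean()) / 2

clean_t = fit_threshold(x, y)

y_poisoned = y.copy()
y_poisoned[:20] = 1                 # attacker flips 20 negative labels
poisoned_t = fit_threshold(x, y_poisoned)

shift = abs(poisoned_t - clean_t)   # boundary displacement caused by poisoning
```

Characterizing how such shifts scale with the poisoned fraction, and detecting them after compression and fine-tuning, is part of what objectives (2) and (3) would formalize.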

Efficient Multimodal Vision Transformers for Embedded Systems

The proposed thesis focuses on the optimization of multimodal vision transformers (ViT) for panoptic object segmentation, exploring two main directions. The first is to develop a versatile fusion pipeline to integrate multimodal data (RGB, IR, depth, events, point clouds) by leveraging inter-modal alignment relationships. The second is to investigate an approach combining pruning and mixed-precision quantization. The overall goal is to design lightweight multimodal ViT models, tailored to the constraints of embedded systems, while optimizing their performance and reducing computational complexity.
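One half of the proposed compression direction, magnitude-based unstructured pruning, can be sketched in a few lines; the tensor and the 50% sparsity target below are arbitrary, and a real pipeline would prune ViT attention and MLP weights jointly with mixed-precision quantization.

```python
import numpy as np

rng = np.random.default_rng(5)
w = rng.normal(size=(8, 8))          # toy weight matrix

sparsity = 0.5
threshold = np.quantile(np.abs(w), sparsity)  # magnitude cut-off
mask = np.abs(w) >= threshold                  # keep the largest weights
w_pruned = w * mask

kept_fraction = mask.mean()          # fraction of weights surviving pruning
```

Structured variants (pruning whole heads or channels) map better onto embedded hardware than this element-wise mask, at some cost in accuracy.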
