Compositional Generalization of Visual Language Models
The advent of foundation models has led to state-of-the-art performance on a large number of tasks across several fields of AI, in particular computer vision and natural language processing. However, despite the huge amounts of data used to train them, these models remain limited in their ability to generalize, especially for use cases in specific domains that are poorly represented on the Web. One way to formalize this issue is compositional generalization, i.e. generalizing to new, unseen concepts from concepts learned during training. This generalization is the ability to learn disentangled concepts and to recombine them into unseen compositions once the model is in production. The proposed thesis will address this issue, aiming to propose visual representations that enable generic visual language models to generalize compositionally within specific domains. It will investigate strategies to reduce shortcut learning, promoting a deeper understanding of compositional structures in multimodal data. It will also address compositional generalization beyond simple attribute–object pairs, capturing more subtle and complex semantics. The proposed thesis aims at progress at a rather theoretical level, but has many potential practical applications in health, administration and services, security and defense, manufacturing, and agriculture.
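As a toy illustration of the evaluation protocol this setting implies (the vocabulary below is hypothetical), a compositional split holds out specific attribute–object pairs while keeping every individual attribute and object visible during training:

```python
from itertools import product

# Hypothetical vocabulary of primitive concepts
attributes = ["red", "blue", "small", "large"]
objects = ["cube", "sphere", "cylinder"]
pairs = set(product(attributes, objects))

# Hold out specific compositions, never whole attributes or objects
held_out = {("red", "sphere"), ("blue", "cube"), ("small", "cylinder")}
train = pairs - held_out

# Every primitive concept is seen during training...
assert {a for a, _ in train} == set(attributes)
assert {o for _, o in train} == set(objects)
# ...but the held-out pairs are genuinely novel compositions at test time
assert held_out.isdisjoint(train)
```

A model that merely memorizes pairs fails on `held_out`; one that disentangles the primitives can recombine them.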
Towards a Sustainable Blockchain: Reducing Energy Consumption While Ensuring Security and Integrity
Blockchain technology, a key component of distributed ledger systems, enables decentralized digital interactions without the need for a central authority, but raises environmental concerns due to its energy consumption, particularly with proof-of-work (PoW) mechanisms such as Bitcoin's. The literature highlights the sustainability challenges associated with this energy consumption. Several strategies have been proposed to mitigate these impacts, such as optimizing cryptographic puzzles, implementing two-round mining processes, and integrating renewable energy sources. Alternative consensus mechanisms like Proof-of-Stake (PoS) and Proof-of-Authority (PoA) are also explored. This research project aims to evaluate the energy consumption profiles of existing blockchain systems and propose new, more efficient consensus algorithms. It also focuses on integrating renewable energy sources and optimizing smart contracts to reduce their resource consumption. A thorough security analysis will ensure that energy efficiency improvements do not compromise network security and decentralization. Using simulation tools, this research will quantify the improvements brought by new algorithms and strategies, contributing to the sustainability and broader adoption of blockchain technology in an environmentally conscious manner.
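To make the PoW energy argument concrete, here is a minimal hash-puzzle sketch (a toy, not Bitcoin's actual block format): the expected number of hash evaluations, and hence the energy spent, grows as 2^difficulty, which is precisely the knob that puzzle-optimization strategies try to soften:

```python
import hashlib

def mine(block_data: bytes, difficulty_bits: int):
    """Toy proof-of-work: find a nonce whose SHA-256 digest has
    difficulty_bits leading zero bits. Expected work (and energy)
    scales as roughly 2**difficulty_bits hash evaluations."""
    target = 1 << (256 - difficulty_bits)
    nonce = tried = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        tried += 1
        if int.from_bytes(digest, "big") < target:
            return nonce, tried
        nonce += 1

nonce, tried = mine(b"toy block", 10)  # on the order of 2**10 hashes expected
```

By contrast, a PoS validator-selection step is a single weighted draw over stakes, which is why consensus-level redesigns dominate the energy discussion.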
Attention-based Binarized Visual Encoder for LLM-driven Visual Question Answering
In the context of smart image sensors, there is an increasing demand to go beyond simple inferences such as classification or object detection and to add more complex applications enabling a semantic understanding of the scene. Among these applications, Visual Question Answering (VQA) enables AI systems to answer questions by analyzing images. This project aims to develop an efficient VQA system combining a visual encoder based on Binary Neural Networks (BNN) with a compact language model (tiny LLM). Although LLMs are still far from a complete hardware implementation, this project represents a significant step in that direction by using a BNN to analyze the context and the relationships between objects in the scene. This encoder processes images with low resource consumption, allowing real-time deployment on edge devices. Attention mechanisms can be leveraged to extract the semantic information necessary for scene understanding. The language model can be stored locally and tuned jointly with the BNN to generate precise and contextually relevant answers.
This project offers an opportunity for candidates interested in Tiny Deep Learning and LLMs. It opens a broad field of research with room for significant contributions and for results applicable to concrete use cases. The work will consist of developing a robust BNN topology for semantic scene analysis under hardware constraints (memory and computation), then integrating and jointly optimizing the BNN encoder with the LLM, while ensuring a coherent and performant VQA system across different types of queries.
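As a sketch of the core BNN operation (a numpy stand-in with hypothetical names, not the project's actual topology), binarizing weights and activations to {-1, +1} turns dot products into XNOR/popcount operations, with an XNOR-Net-style scaling factor partially recovering the accuracy lost to binarization:

```python
import numpy as np

def binarize(t):
    """Sign binarization used in BNNs: values constrained to {-1, +1}."""
    return np.where(t >= 0, 1.0, -1.0)

def binary_dense(x, w_real, alpha=None):
    """Forward pass of a binarized dense layer. The real-valued scale
    alpha (mean |w|, as in XNOR-Net) partially recovers accuracy; the
    matmul itself reduces to XNOR + popcount in hardware."""
    if alpha is None:
        alpha = np.abs(w_real).mean()
    return alpha * (binarize(x) @ binarize(w_real))
```

In hardware, each word of 32 binary multiply-accumulates collapses to one XNOR plus a popcount, which is what makes real-time edge deployment plausible.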
Scalability of the Network Digital Twin in Complex Communication Networks
Communication networks are experiencing exponential growth, both in terms of deployment of network infrastructures (particularly visible in the gradual and sustained evolution towards 6G networks) and in terms of machines, covering a wide range of devices from Cloud servers to lightweight embedded IoT components (e.g. Systems on Chip: SoC), including mobile terminals such as smartphones.
This ecosystem also encompasses a variety of software components, ranging from applications (e.g. A/V streaming) to protocols at the different communication network layers. Furthermore, such an ecosystem is intrinsically dynamic because of the following features:
- Change in network topology: due, for example, to hardware/software failures, user mobility, operator network resource management policies, etc.
- Change in the usage/consumption ratio of network resources (bandwidth, memory, CPU, battery, etc.). This is due to user needs and operator network resource management policies, etc.
To ensure effective supervision and management of communication networks, whether fine-grained or at an abstract level, various network management services/platforms, such as SNMP, CMIP, LWM2M, CoMI, and SDN, have been proposed and documented in the networking literature and standards bodies. These management platforms have been broadly adopted by network operators, service providers, and industry; they often incorporate advanced features, including automated control loops (e.g. rule-based, expert-system-based, ML-based), further enhancing their ability to optimize network management operations.
Despite the extensive exploration and exploitation of these network management platforms, they do not guarantee effective (re)configuration without intrinsic risks/errors, which can cause serious outages to network applications and services. This is particularly true when the objective of the network (re)configuration is real-time optimization of the network, analyses/tests in operational mode (what-if analysis), or planning updates/modernizations/extensions of the communication network. For such (re)configuration objectives, a new network management paradigm has to be designed.
In recent years, the communication network research community has started exploring the adoption of the digital twin concept in the networking context (Network Digital Twin: NDT). The objective behind this adoption is to support the management of the communication network for various purposes, including those mentioned in the previous paragraph.
The NDT is a digital twin of the real/physical communication network (Physical Twin Network: PTN), making it possible to manipulate a digital copy of the real communication network without risk. This allows, in particular, visualizing/predicting the evolution (or the behavior, the state) of the real network if a given network configuration were to be applied. Beyond this aspect, the NDT and the PTN exchange information via one or more communication interfaces with the aim of keeping the states of the NDT and the PTN synchronized.
Nonetheless, setting up a network digital twin (NDT) is not a simple task. Frequent, real-time PTN-NDT synchronization poses a scalability problem for complex networks, where every piece of network information is likely to be reported at the NDT level (e.g. a very large number of network entities, highly dynamic topologies, a large volume of information per node/per network link).
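One minimal illustration of the trade-off at stake (a purely hypothetical mechanism, not a proposed design): dead-band filtering, where a PTN node reports a metric only when it drifts beyond a tolerance from the last value reported, exchanging some twin fidelity for synchronization traffic:

```python
class DeadbandReporter:
    """Hypothetical PTN-side filter: report a metric to the NDT only
    when it drifts more than eps away from the last value actually
    reported, trading twin fidelity for synchronization traffic."""
    def __init__(self, eps):
        self.eps = eps
        self.last = {}

    def update(self, metric, value):
        prev = self.last.get(metric)
        if prev is None or abs(value - prev) > self.eps:
            self.last[metric] = value
            return value    # would cross the PTN-NDT interface
        return None         # suppressed: the NDT keeps its last estimate

r = DeadbandReporter(eps=5.0)
sent = [v for v in (50, 52, 51, 60, 61, 80) if r.update("cpu", v) is not None]
# only 50, 60 and 80 are transmitted; the small fluctuations are absorbed
```

Learning which information to select, and predicting the suppressed values at the NDT side, is exactly where machine learning models could enter.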
Various scientific contributions have attempted to address the question of the network digital twin. State-of-the-art contributions focus on establishing scenarios, requirements, and architectures for the NDT. Nevertheless, the literature does not tackle the scalability problem of the NDT.
The objective of this PhD thesis is to address the scalability problem of network digital twins by exploring new machine learning models for network information selection and prediction.
Learning world models for advanced autonomous agents
World models are internal representations of the external environment that an agent can use to interact with the real world. They are essential for understanding the physics that govern real-world dynamics, making predictions, and planning long-horizon actions. World models can be used to simulate real-world interactions and enhance the interpretability and explainability of an agent's behavior within this environment, making them key components for advanced autonomous agent models.
Nevertheless, building an accurate world model remains challenging. The goal of this PhD is to develop methodologies for learning world models and to study their use in the context of autonomous driving, particularly for motion forecasting and for developing autonomous agents for navigation.
Secure and Agile Hardware/Software Implementation of new Post-Quantum Cryptography Digital Signature Algorithms
Cryptography plays a fundamental role in securing modern communication systems by ensuring confidentiality, integrity, and authenticity. Public-key cryptography, in particular, has become indispensable for secure data exchange and authentication processes. However, the advent of quantum computing poses an existential threat to many of the traditional public-key cryptographic algorithms, such as RSA, DSA, and ECC, which rely on problems like integer factorization and discrete logarithms that quantum computers can solve efficiently. Recognizing this imminent challenge, the National Institute of Standards and Technology (NIST) initiated in 2016 a global effort to develop and standardize Post-Quantum Cryptography (PQC). After three rigorous rounds of evaluation, NIST announced its first set of standardized algorithms in 2022. While these algorithms represent significant progress, NIST has expressed an explicit need for additional digital signature schemes that leverage alternative security assumptions, emphasizing the importance of schemes that offer shorter signatures and faster verification times to enhance practical applicability in resource-constrained environments. Building on this foundation, NIST opened a new competition to identify additional general-purpose signature schemes. The second-round candidates, announced in October 2024, reflect a diverse array of cryptographic families.
This research focuses on the critical intersection of post-quantum digital signature algorithms and hardware implementations. As the cryptographic community moves toward adoption, the challenge lies not only in selecting robust algorithms but also in deploying them efficiently in real-world systems. Hardware implementations, in particular, must address stringent requirements for performance, power consumption, and security, while also providing the flexibility to adapt to multiple algorithms, both those standardized and those still under evaluation. Such agility is essential to future-proof systems against the uncertainty inherent in cryptographic transitions. The primary objective of this PhD research is to design and develop hardware-agile implementations for post-quantum digital signature algorithms. The focus will be on supporting multiple algorithms within a unified hardware framework, enabling seamless adaptability to the diverse needs of evolving cryptographic standards. This involves an in-depth study of the leading candidates from NIST's additional digital signature competition, as well as those already standardized, to understand their unique computational requirements and security properties. Special attention will be given to designing modular architectures that can support different signatures, ensuring versatility and extensibility. The proposed research will also explore optimizations for resource efficiency, balancing trade-offs between performance, power consumption, and area utilization. Additionally, resilience against physical attacks (side-channel attacks and fault injection attacks) will be a key consideration in the design process. This PhD project will be conducted within the PEPR PQ-TLS project in collaboration with the TIMA laboratory (Grenoble), the Agence nationale de la sécurité des systèmes d’information (ANSSI) and INRIA.
Software support for sparse computation
The performance of computers has become limited by data movement in AI, HPC, and embedded computing. Hardware accelerators do exist to handle data movement in an energy-efficient way, but no programming language allows them to be driven from the code that performs the computation.
It is up to the programmer to explicitly configure DMAs, use function calls for data transfers, and perform program analysis by hand to identify memory bottlenecks.
In addition, compilers were designed in the 1980s, when memories ran at the same frequency as computing cores.
The aim of this thesis is to give a compiler the ability to perform optimizations based on data transfers.
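As a back-of-the-envelope example of the analysis such a compiler would automate (the machine numbers below are made up), comparing a loop's arithmetic intensity to the machine balance tells whether it is memory-bound and therefore worth optimizing for data transfers rather than for compute:

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs per byte of memory traffic: the quantity a data-movement-
    aware compiler would estimate for each loop nest."""
    return flops / bytes_moved

# Hypothetical machine: 100 GFLOP/s peak compute, 10 GB/s memory bandwidth.
machine_balance = 100 / 10   # FLOPs the cores can absorb per byte delivered

# daxpy-like loop: y[i] += a * x[i] over n doubles
n = 1_000_000
flops = 2 * n                # one multiply + one add per element
bytes_moved = 3 * 8 * n      # read x[i], read y[i], write y[i]
ai = arithmetic_intensity(flops, bytes_moved)
memory_bound = ai < machine_balance   # transfers, not ALUs, are the wall
```

For such a loop, tiling and DMA double-buffering (today written by hand) are the transformations a transfer-aware compiler should emit automatically.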
HW/SW Contracts for Security Analysis Against Fault Injection Attacks on Open-source Processors
This thesis focuses on the cybersecurity of embedded systems, particularly the vulnerability of processors and programs to fault injection attacks. These attacks disrupt the normal functioning of systems, allowing attackers to exploit weaknesses to access sensitive information. Although formal methods have been developed to analyze the robustness of systems, they often limit their analyses to hardware or software separately, overlooking the interaction between the two.
The proposed work aims to formalize hardware/software (HW/SW) contracts specifically for security analysis against fault injection. Building on a hardware partitioning approach, this research seeks to mitigate scalability issues related to the complexity of microarchitecture models. Expected outcomes include the development of techniques and tools for effective security verification of embedded systems, as well as the creation of contracts that facilitate the assessment of compliance for both hardware and software implementations. This approach could also reduce the time-to-market for secure systems.
Cryptographic security of RISC-V processor enclaves with CHERI
CHERI (Capability Hardware Enhanced RISC Instructions) is a solution for securing the processor against spatial and temporal memory-safety violations by turning every pointer into a capability that strictly defines the access bounds of the data or instructions it addresses.
In this thesis, we propose to enrich CHERI and its control-flow integrity capabilities on a RISC-V application processor, by protecting instructions right up to their execution against any type of modification. Secondly, based on authenticated memory encryption, we will study the possibility of using CHERI to define secure enclaves enabling cryptographic isolation between processes. The processor will be modified so that each process is encrypted with its own key and can have a secure life cycle. All keys must be efficiently protected in hardware.
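To illustrate the capability model in software terms (a toy model, not the CHERI ISA encoding), every dereference is checked against the bounds and permissions carried by the pointer itself:

```python
from dataclasses import dataclass

memory = bytearray(64)   # toy flat memory

@dataclass(frozen=True)
class Capability:
    """Toy model of a CHERI capability: a pointer that carries its own
    bounds and permissions, checked on every dereference."""
    base: int
    length: int
    perms: frozenset

def load(cap, addr):
    if "load" not in cap.perms:
        raise PermissionError("capability lacks load permission")
    if not (cap.base <= addr < cap.base + cap.length):
        raise IndexError("out-of-bounds dereference trapped")
    return memory[addr]

buf = Capability(base=16, length=8, perms=frozenset({"load"}))
value = load(buf, 16)    # in bounds: allowed
# load(buf, 24) would trap: one byte past the end of the capability
```

In CHERI hardware these checks are performed by the pipeline on tagged 128-bit capabilities rather than in software, so an out-of-bounds access faults before it can touch memory.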
Contact: olivier.savry@cea.fr
Combining over and underapproximation of memory abstractions for low-level code analysis
Rice's theorem, which states that no method can automatically decide whether an arbitrary property of a program holds, has led to the separation of verification tools into two groups: sound tools operating by over-approximation, such as abstract interpretation, can automatically prove that certain properties are true, but are sometimes unable to conclude and raise false alarms; conversely, complete tools operating by under-approximation, such as symbolic execution, can produce counter-examples, but cannot demonstrate that a property is true.
*The general aim of the thesis is to study the combination of sound and complete methods of program analysis, in particular static analysis by abstract interpretation and the generation of under-approximated formulae by symbolic execution*.
We are particularly interested in the combination of over- and under-approximating abstractions, especially for memory. The priority applications concern the analysis of code at the binary level, as achieved by combining the BINSEC and CODEX analysis platforms, so as to automatically discover new security vulnerabilities or prove their absence.
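A deliberately simplistic contrast between the two families: an interval transformer proves a universally quantified property in one step, while concrete enumeration (standing in here for symbolic execution) refutes a wrong one with a counter-example:

```python
def abs_square(lo, hi):
    """Sound interval transformer for y = x*x: the returned interval
    contains every concrete outcome (possibly more: over-approximation)."""
    if lo >= 0:
        return (lo * lo, hi * hi)
    if hi <= 0:
        return (hi * hi, lo * lo)
    return (0, max(lo * lo, hi * hi))

# Sound prover: y = x*x with x in [-3, 3] yields y in [0, 9], which
# proves the property "y >= 0" for ALL inputs at once.
proved = abs_square(-3, 3)[0] >= 0

# Complete falsifier: enumerating concrete inputs (a stand-in for
# symbolic execution) finds a counter-example to the false claim "y != 4".
cex = next(x for x in range(-3, 4) if x * x == 4)   # x = -2 refutes it
```

Neither side alone does both jobs: the interval analysis cannot exhibit the witness, and the enumeration cannot conclude anything about inputs it has not visited, which is precisely the complementarity the thesis would exploit.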