Secure Hardware/Software Implementation of Post-Quantum Cryptography on RISC-V Platforms

Traditional public-key cryptography algorithms will be broken once a large-scale quantum computer is realized. Consequently, the US National Institute of Standards and Technology (NIST) has launched an initiative to develop and standardize new Post-Quantum Cryptography (PQC) algorithms intended to replace established public-key mechanisms. However, adopting PQC algorithms in Internet of Things (IoT) and embedded systems raises several implementation challenges, including performance degradation and security concerns arising from susceptibility to physical Side-Channel Attacks (SCAs).
The idea of this Ph.D. project is to explore the modularity, extensibility and customizability of the open-source RISC-V ISA with the goal of proposing innovative, secure and efficient SW/HW implementations of PQC algorithms. One of the main challenges in executing PQC algorithms on embedded processors is achieving good performance (i.e., low latency and high throughput) and energy efficiency while incorporating countermeasures against physical SCAs. In the first phase, the Ph.D. candidate will review the State of the Art (SoA) to understand the weaknesses and attack points of PQC algorithms, the effectiveness and overhead of SoA countermeasures, and SoA acceleration strategies. In the second phase, the candidate will implement new solutions that exploit all the degrees of freedom offered by the RISC-V architecture and characterize the results in terms of area overhead, execution time and resistance to SCAs.
Beyond the exciting scientific challenges, this PhD will take place in Grenoble, a picturesque city nestled in the French Alps. The research will be conducted at the CEA, within the LETI and LIST institutes, in collaboration with the TIMA laboratory.

Characterization and design of radiation-hardened HfO2-based non-volatile memories

This project concerns the characterization and design of radiation-hardened non-volatile memory circuits based on HfO2. This material is immune to both natural (space) and artificial (man-made) radiation and can be used to enhance the reliability of data storage in harsh environments. Moreover, when combined with FD-SOI CMOS technology, which itself offers a degree of immunity to radiation effects, highly robust memory circuits can be implemented without complicating the periphery circuitry, which is the most radiation-sensitive element. This thesis will study ReRAM and FeRAM memories, which are promising in terms of performance, energy efficiency and scalability, and could eventually replace conventional Flash and EEPROM memories. One or more test chips are envisaged to implement new robust design techniques and to benchmark them against existing solutions.

In-memory analog computing for AI attention mechanisms

The aim of this thesis is to explore the execution of attention mechanisms for Artificial Intelligence directly within a cutting-edge Non-Volatile Memory (NVM) technology.

Attention mechanisms represent a breakthrough in Artificial Intelligence (AI) algorithms and are the performance booster behind Transformer neural networks.
Initially designed for natural language processing applications such as ChatGPT, these mechanisms are now widely employed in embedded application domains such as demand prediction in energy/heat networks, predictive maintenance, and monitoring of transport infrastructures or industrial sites.
Despite their widespread use, attention-based workloads demand extensive data access and computing power, resulting in high power consumption that can make them impractical for embedded hardware systems.
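As a reference point for what must ultimately run inside the memory array, the core attention operation can be sketched in a few lines of NumPy (an illustrative scaled-dot-product example; the shapes and names are ours, not part of the project):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core attention operation of Transformer models.

    Q, K, V: (seq_len, d) arrays of queries, keys and values.
    Returns one output vector per query position.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, Q, Q)           # self-attention on 4 tokens
print(out.shape)                                      # (4, 8)
```

The two matrix products in this function are exactly the operations an analog in-memory fabric is expected to accelerate; the softmax in between is one of the blocks whose mapping the thesis would have to study.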

Non-volatile memristor technology offers a promising solution: it enables analog computing functions with minimal power consumption while serving as non-volatile storage for AI model parameters. Massive linear-algebra workloads can be executed faster, and at an ultra-low energy cost, compared with their fully digital implementation.
However, the technology comes with limitations, e.g., device variability, the limited number of bits available to encode model parameters (i.e., quantization), and the maximum size of the vectors processed in parallel.

This thesis focuses on overcoming these challenges in the context of embedded time-series analysis and prediction.
The key task is exploring the mapping of attention-based mechanisms to a spin-based memristor technology developed by the SPINTEC Laboratory.
This involves quantizing and partitioning AI models to align with the hardware architecture without compromising prediction performance, and exploring the implementation of specific AI blocks in the memristor analog fabric.
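As a toy illustration of the quantization constraint (the function, bit-width and tile size are ours, not SPINTEC's device model), one can emulate a weight tile stored on a limited number of conductance levels and measure the error this introduces into a matrix-vector product:

```python
import numpy as np

def quantize_uniform(W, n_bits):
    """Map weights to 2**n_bits uniformly spaced levels, mimicking
    a memristor cell that can only hold a few conductance states."""
    levels = 2 ** n_bits
    w_min, w_max = W.min(), W.max()
    step = (w_max - w_min) / (levels - 1)
    codes = np.round((W - w_min) / step)   # integer level indices
    return codes * step + w_min            # dequantized values

rng = np.random.default_rng(1)
W = rng.standard_normal((16, 16))          # a small weight tile
x = rng.standard_normal(16)

Wq = quantize_uniform(W, n_bits=3)         # e.g. 8 conductance levels
y_ref = W @ x                              # ideal digital result
y_anl = Wq @ x                             # same matvec with quantized weights
err = np.linalg.norm(y_ref - y_anl) / np.linalg.norm(y_ref)
print(f"relative error at 3 bits: {err:.3f}")
```

A real study would add device variability and read noise on top of this, and co-optimize the model so that prediction quality survives the constraint.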

This thesis is part of a collaboration between CEA List, Laboratoire d’Intelligence Intégrée Multi-Capteur, the Grenoble Institute of Engineering and Management and the SPINTEC Laboratory.
Joining this research presents a unique opportunity to work within an interdisciplinary and dynamic team at the forefront of the AI ecosystem in France, with strong connections to influential industrial players in the field.

Design of algorithms to optimize RADAR beam control

The arrival on the market of a new generation of 4D imaging radars brings new opportunities and challenges for the development of data-processing algorithms. These sensors, geared towards the autonomous-vehicle market, offer higher resolution thanks to a larger number of antennas. However, this increases the amount of data to be processed, which requires significant computing resources.
The aim of this thesis is to develop algorithms that optimize radar resolution while limiting computational cost, so that processing can be embedded as close as possible to the radar. To achieve this, beamforming techniques will be used to control the shape and direction of the radar beam and concentrate the energy in regions deemed relevant. One of the challenges is therefore to create a high-performance feedback loop that steers the radar antennas according to the scene observed in previous measurements.
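The steering principle behind such a loop can be sketched for an idealized uniform linear array (all names and parameters below are illustrative): phase-shift beamforming weights the antennas so that signals from a chosen direction add coherently.

```python
import numpy as np

def steering_vector(n_antennas, spacing_wl, theta_rad):
    """Phase shifts seen by a uniform linear array for a plane wave
    arriving from angle theta (element spacing given in wavelengths)."""
    k = np.arange(n_antennas)
    return np.exp(2j * np.pi * spacing_wl * k * np.sin(theta_rad))

def beam_power(weights, theta_rad, spacing_wl=0.5):
    """Array gain in direction theta for the given antenna weights."""
    a = steering_vector(len(weights), spacing_wl, theta_rad)
    return np.abs(np.vdot(weights, a)) ** 2   # vdot conjugates the weights

n = 8
target = np.deg2rad(20)
w = steering_vector(n, 0.5, target) / n       # steer the beam towards 20 degrees

on_target = beam_power(w, target)             # coherent sum: gain close to 1
off_target = beam_power(w, np.deg2rad(-40))   # sidelobe: much weaker
print(on_target, off_target)
```

Closing the loop then means choosing the next set of weights `w` from what the previous measurements revealed about the scene, which is where the computational budget matters.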
This thesis will take an experimental approach, using a radar owned by the laboratory. Simulation tools will also be used to test hypotheses and go beyond the possibilities offered by the equipment.

Graph Neural Network-based power prediction of digital architectures

Performing power analysis is a major step in digital architecture development. Power analysis is needed as soon as RTL (Register Transfer Level) coding starts, when the most rewarding changes can still be made. As designs grow larger, power analysis relies on ever longer simulation traces and becomes almost intractable: the process generates huge simulation files (gigabytes or even terabytes of data) and long power-analysis turnaround times (weeks or even months). Power models are therefore used to speed up this step.

There is a broad range of research on RTL power modeling, mainly based on analytical or learning-based approaches. Analytical power modeling attempts to correlate application profiles (memory behavior, branch behavior, and so on) with micro-architecture parameters to create a power model. In contrast, learning-based power modeling builds a model from the simulation trace of the design and a reference power obtained from sign-off tools. Learning-based power modeling is gaining popularity because it is easier to implement than the analytical approach and does not require in-depth design knowledge. These ML-based methods have shown impressive improvements over analytical methods. However, classical ML methods (linear regression, neural networks, etc.) are better suited to producing one model for one given architecture, which makes it hard to obtain a generalizable model. Thus, in the last couple of years, a few studies have started to use Graph Neural Networks (GNNs) to address model generalization in the field of electronic design automation (EDA). The advantage of a GNN over classical ML approaches is its ability to learn directly from graphs, making it more suitable for EDA problems.
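The source of this generalization ability can be sketched with a minimal, untrained message-passing layer (purely illustrative: the features, weights and readout below are random placeholders, not a real power model):

```python
import numpy as np

def gnn_layer(X, A, W):
    """One message-passing step: each node averages its neighbours'
    features and applies a shared linear transform plus ReLU.

    X: (n_nodes, d_in) node features (e.g. cell type, toggle rate)
    A: (n_nodes, n_nodes) adjacency matrix of the netlist graph
    W: (d_in, d_out) weights shared across ALL nodes
    """
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    H = (A @ X) / deg                  # aggregate neighbour features
    return np.maximum(H @ W, 0.0)      # transform + nonlinearity

# A tiny 5-cell netlist modelled as a chain graph.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 4))        # per-cell features
W1 = rng.standard_normal((4, 4))       # shared layer weights
w_out = rng.standard_normal(4)         # graph-level readout weights

H = gnn_layer(X, A, W1)
power_estimate = float(H.mean(axis=0) @ w_out)   # pooled power prediction
print(power_estimate)
```

Because the weights `W1` and `w_out` are shared across nodes and pooled over the graph, the same trained model can, in principle, be applied to a netlist of any size or topology, which is exactly what per-architecture regression models cannot do.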
The objective of this PhD is to develop a generalizable, GNN-based model of the power consumption of digital electronic architectures. The model should estimate not only the average power consumption but also the cycle-to-cycle power consumption of any digital architecture. Very few works [1,2] exist in the state of the art on the use of GNNs for power estimation, and the models they develop are limited to estimating the average power of an architecture. Moreover, several important research questions remain unaddressed, such as the amount of data (architectures) needed for the model to generalize, the impact of the graph structure during training, the selection of architectures used for training and testing, and the choice of features.
During this PhD, these questions will therefore be studied to understand their impact on model generation.
The work performed during this PhD thesis will be presented at international conferences and published in scientific journals. Some results may be patented.

Quantum Machine Learning in the era of NISQ: can QML provide an advantage for the learning part of Neural Networks?

Quantum computing is believed to offer a future advantage for a variety of algorithms, including problems that are challenging for traditional computers (e.g., prime factorization). However, in an era where Noisy Quantum Computers (QCs) are the norm, practical applications of QC will likely center on optimization approaches and energy efficiency rather than purely algorithmic performance.

In this context, this PhD thesis aims to address the use of QC to enhance the learning process of Neural Networks (NNs). The learning phase of an NN is arguably the most power-hungry part of traditional approaches. Leveraging quantum optimization techniques or quantum linear-system solving could yield an energy advantage, coupled with the ability to perform the learning phase with a smaller set of training examples.

In-physics artificial intelligence using emerging nanodevices

Recent breakthroughs in AI models are correlated with the energy burden required to define and run them. GPUs are the go-to hardware for these implementations, since they can perform configurable, highly parallelised matrix multiplications using digital circuits. To go beyond the energy limits of GPUs, however, it may be necessary to abandon the digital computing paradigm altogether.

A particularly elegant solution may be to exploit the intrinsic physics of electron devices in an analogue fashion. For example, early work has already proposed how the physical entropy of silicon devices can realise probabilistic learning algorithms, how voltage relaxation in resistive networks may approximate gradients, and how the activity of interconnected oscillators may converge to minima on energy surfaces.
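The last idea, computing by descending an energy surface, can be illustrated with a purely classical software analogue: a small Hopfield network whose asynchronous updates can only lower the energy, so a corrupted state relaxes back to a stored pattern (an illustrative sketch, not a model of any CEA device):

```python
import numpy as np

def hopfield_energy(s, J):
    """Energy of a binary (+/-1) state s under symmetric couplings J."""
    return -0.5 * s @ J @ s

def relax(s, J, steps=200, seed=0):
    """Asynchronous updates: each flip lowers (or keeps) the energy,
    so the state settles into a local minimum of the energy surface."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = 1 if J[i] @ s >= 0 else -1
    return s

# Store one pattern via a Hebbian coupling matrix (zero diagonal).
pattern = np.array([1, -1, 1, 1, -1, -1, 1, -1])
J = np.outer(pattern, pattern).astype(float)
np.fill_diagonal(J, 0.0)

noisy = pattern.copy()
noisy[:2] *= -1                       # corrupt two bits
recovered = relax(noisy, J)
print(hopfield_energy(noisy, J), hopfield_energy(recovered, J))
```

In-physics proposals replace these software update rules with actual device dynamics (oscillator phases, voltage relaxation), so the minimisation happens for free in the physics.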

The objective of this thesis will be to study existing in-physics computing primitives and to propose new ones. Furthermore, just as GPUs bias current AI towards matrix multiplications, the candidate must also consider how these new primitives will shape future AI algorithms. Particular attention will be paid to emerging nanodevice technologies under development at CEA Grenoble. Depending on the interests of the PhD student, it may be possible to design, tape out and test circuit concepts leveraging these in-house innovative technologies.

Formalization and Analysis of Countermeasures Against Fault Injection Attacks on Open-source Processors

Join our dynamic research team at CEA-List within the DSCIN division for a PhD opportunity in the field of hardware security and formal analysis of processor micro-architectures. The focus of this research is the formalization and analysis of countermeasures against fault injection attacks on open-source processors. Operating at the cutting edge of cyber-security for embedded systems, we aim to build formal guarantees for the robustness of these systems in the face of evolving security threats, particularly those arising from fault injection attacks.

As a PhD candidate, you will contribute to advancing the understanding of fault injection attacks and their impact on both the hardware and software aspects of open-source processors. The scientific challenge lies in devising methods and tools that can effectively analyze the robustness of embedded systems under fault injection. You will work on jointly considering the RTL model of the target processor and the executed program, addressing the limitations of current methods (whether simulation or formal analysis), and exploring innovative approaches to scale the analysis to larger programs and complex processor microarchitectures. The experimental work will be based on RTL simulators such as Verilator or QuestaSim, the formal analysis tool µARCHIFI developed at CEA-List, and open-source implementations of secured processors such as the RISC-V processor CV32E40S.
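As a toy software model of the problem (the function names are ours, and this is unrelated to the µARCHIFI tool itself), a single bit-flip injected into an intermediate value can be caught by a classic duplication-and-compare countermeasure, which is the kind of mechanism whose guarantees a formal analysis must establish:

```python
def compute(x, fault_bit=None):
    """Toy computation; fault_bit simulates a single bit-flip
    injected into the intermediate result."""
    acc = (x * 3 + 1) & 0xFF
    if fault_bit is not None:
        acc ^= 1 << fault_bit          # injected fault
    return (acc ^ 0x5A) & 0xFF

def protected_compute(x, fault_bit=None):
    """Duplication countermeasure: run twice, inject only into one
    copy, and compare -- any single-copy fault causes a mismatch."""
    r1 = compute(x, fault_bit)
    r2 = compute(x)                    # redundant, fault-free copy
    if r1 != r2:
        raise RuntimeError("fault detected")
    return r1

assert protected_compute(42) == compute(42)   # fault-free run passes through
try:
    protected_compute(42, fault_bit=3)
except RuntimeError:
    print("single-bit fault detected")
```

Real analyses must also consider fault models this sketch ignores, e.g. identical faults in both copies or faults in the comparison itself, which is precisely why joint RTL/program reasoning is needed.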

Upon the successful completion of this PhD thesis, you will have contributed to the development of formalized countermeasures against fault injection attacks. This research not only aligns with the broader goals of enhancing cyber-security for embedded systems but also has practical implications, such as contributing to the security verification of realistic secured architectures. Additionally, your work will pave the way for the design of efficient techniques and tools that have the potential to streamline the evaluation of secured systems, impacting fields like Common Criteria certification and reducing time-to-market during the design phase of secure systems.

Partitioning of spin-qubit control electronics: co-design of cryoCMOS and room-temperature hardware

Quantum algorithms capable of demonstrating a quantum advantage will require the use of quantum processors (QPU) with several thousands of qubits. The design of such a quantum computer is a multidisciplinary challenge at the heart of quantum engineering. Control electronics face particular constraints related to the cryogenic temperature at which qubits operate. Leveraging its expertise in silicon-based technologies, the CEA aims to integrate thousands of semiconductor qubits within a single QPU.

The primary objective of this thesis is to propose an innovative digital and analog qubit control architecture that scales to thousands of spin qubits, by distributing electronics between different stages of the cryostat and the exterior at ambient temperature. The second objective is to create prototypes of this control chain to demonstrate the feasibility and performance of such an architecture.

The work will build upon an existing room-temperature architecture and microelectronic blocks developed at cryogenic temperatures within the CEA. New blocks and the corresponding circuits will be developed to reach the targeted scaled-out quantum architecture. The circuits will be fabricated, tested and measured, and the results will be published in scientific publications.

CLIP approach for improving the energy efficiency of hardware-embedded networks

In a global context of task automation, artificial neural networks are now used in many domains requiring the processing of sensor data: vision, sound, vibration.
Depending on the constraints, information processing can be done in the Cloud (SIRI, AWS, TPU) or in an embedded way (NVidia's Jetson platform, Movidius, CEA-LIST's PNeuro/DNeuro). In the embedded case, many hardware constraints must be taken into account when dimensioning the algorithm. To improve porting to hardware platforms, LIST has developed innovative state-of-the-art methods for aggressively quantizing the parameters of a neural network and for modifying the coding of the activations so as to reduce the number of computations to be performed.
At equivalent technology, the energy efficiency of neuromorphic architectures is constrained by the classic flexibility-vs-efficiency trade-off: the more different tasks (and networks) an architecture can perform, the less energy-efficient it becomes. While this relationship cannot be circumvented for arbitrary algorithms, neural networks are parametric functions, learned for one task and therefore potentially adaptable to other tasks by partially modifying the topology and/or parameters.
One technique, CLIP, seems to provide an answer, with a strong capacity to adapt to a variety of tasks and the possibility of exploiting multimodality. In its original form, CLIP is a method for matching text and images, from which a classification task can be built.
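At inference time, this matching reduces to cosine similarity between an image embedding and one text embedding per candidate label (a sketch with random placeholder embeddings; a real CLIP model learns both encoders jointly from image-text pairs):

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def clip_classify(image_emb, text_embs, temperature=0.07):
    """CLIP-style zero-shot classification: score one image embedding
    against one text embedding per class label, then softmax."""
    sims = normalize(text_embs) @ normalize(image_emb)  # cosine similarities
    logits = sims / temperature
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

rng = np.random.default_rng(3)
d = 16
text_embs = rng.standard_normal((3, d))    # e.g. "a cat", "a dog", "a car"
image_emb = text_embs[1] + 0.1 * rng.standard_normal(d)  # image close to class 1

probs = clip_classify(image_emb, text_embs)
print(int(probs.argmax()))
```

For a hardware implementation, the interesting point is that swapping the task only changes the small set of text embeddings, while the heavy image encoder, the natural target for quantization and a dedicated architecture, stays fixed.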
The aim of this thesis is to study the hardware implementation of CLIP by proposing a dedicated architecture. The thesis is organized into three main phases: first, a study of CLIP's mechanisms, the operations to be performed and their consequences for embedding such networks; second, hardware optimizations applicable to CLIP, such as quantization, together with an estimation of the flexibility-vs-applicative-generality trade-off; finally, an architecture and implementation proposal with which to measure energy efficiency.