Non-volatile asynchronous magnetic SRAM design

In the applicative context of sensor nodes, as in the Internet of Things (IoT) and in Cyber-Physical Systems (CPS), normally-off systems are mostly in a sleep state, waiting for events such as timer alarms, sensor threshold crossings, RF signals, or variations in the energy environment to wake them up. To reduce power consumption, or because energy is unavailable, the system may power off most of its components while sleeping. To maintain coherent information in memory, we aim to develop an embedded non-volatile memory component. Magnetic technologies are promising candidates to reach both low power consumption and high speed. Moreover, because of the transient behavior, with the system switching back and forth between sleeping and running states, asynchronous logic is a natural candidate for the digital logic implementation. The position thus targets the design of an asynchronous magnetic SRAM in a 28nm technology process. The memory component will be developed down to the layout view in order to precisely characterize power and timing performance and to allow integration with an asynchronous processor. Designing such a component beyond the current state of the art will allow a substantial breakthrough in the field of autonomous systems.

Detection of cyber-attacks in a smart multi-sensor embedded system for soil monitoring

The post-doc is concerned with the application of machine learning methods to detect potential cyber-security attacks on a connected multi-sensor system. The application domain is agriculture, where CEA Leti has several projects, among which the H2020 project SARMENTI (Smart multi-sensor embedded and secure system for soil nutrient and gaseous emission monitoring). The objective of SARMENTI is to develop and validate a secure, low-power multi-sensor system connected to the cloud to perform in situ soil nutrient analysis and to provide decision support to farmers by monitoring soil fertility in real time. Within this topic, the postdoc is concerned with the cyber-security analysis that determines the main risks in our multi-sensor case and with the investigation of an attack detection module. The underlying detection algorithm will be based on anomaly detection, e.g., a one-class classifier. The work has three parts: implementing the probes that monitor selected events, the communication infrastructure that connects the probes with the detector, and the detector itself.
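As an illustration of the anomaly-detection idea, here is a minimal Python sketch of a detector trained only on normal events. It is a simple z-score stand-in for a one-class classifier, not the project's algorithm; the class name `ZScoreDetector`, the two probe features, and the threshold value are all hypothetical.

```python
import statistics


class ZScoreDetector:
    """Toy one-class detector: learns per-feature mean and spread from normal events only."""

    def __init__(self, threshold=4.0):
        # Hypothetical alarm threshold, expressed in standard deviations.
        self.threshold = threshold
        self.means = []
        self.stdevs = []

    def fit(self, normal_samples):
        # Transpose the samples so that each column holds one feature.
        columns = list(zip(*normal_samples))
        self.means = [statistics.fmean(c) for c in columns]
        # Guard against zero spread so scoring never divides by zero.
        self.stdevs = [statistics.pstdev(c) or 1e-9 for c in columns]
        return self

    def score(self, sample):
        # Anomaly score: largest absolute z-score over all features.
        return max(
            abs(x - m) / s for x, m, s in zip(sample, self.means, self.stdevs)
        )

    def is_anomaly(self, sample):
        return self.score(sample) > self.threshold


# Hypothetical probe features: (packets per minute, mean payload size in bytes).
normal_traffic = [(10, 64), (12, 60), (11, 66), (9, 62), (10, 65), (11, 63)]
detector = ZScoreDetector(threshold=4.0).fit(normal_traffic)

print(detector.is_anomaly((10, 64)))   # a reading close to the training data
print(detector.is_anomaly((200, 64)))  # a burst far outside the training data
```

In the three-part structure described above, the probes would produce the feature tuples, the communication infrastructure would deliver them, and a detector of this kind would sit at the receiving end.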

Hardening energy efficient security features for the IoT in FDSOI 28nm technology

The security of IoT connected objects must be energy efficient. But most of the work
on hardening by design shows an additional cost, a multiplying factor of 2 to 5, on
area, performance, power and energy, which does not meet the constraints of the IoT.
Over the last 5 years, research efforts on hardening have been guided by reducing silicon area or
power, which does not always imply a decrease in energy, the predominant criterion in autonomous
connected objects. The postdoc topic addresses the hardening and energy consumption
optimization of the implementation of security functions (attack detection sensors,
cryptographic accelerator, random number generator, etc.) in 28nm FDSOI technology.
Starting from a selection of existing security blocks, unhardened and in FPGA technology, the postdoc
will explore hardening solutions at each step of the design flow in order to propose and
to validate, in a silicon demonstrator, the most energy efficient countermeasures that
guarantee a targeted security level.
To achieve those goals, the postdoc can rely on existing methodologies of design and of
security evaluation thanks to test benches and attack tools.

Software and hardware combined acceleration solution for operations research algorithms

The purpose of the study is to prepare the next generation of OR solvers. We will study the possibility of FPGA-based hardware acceleration to run some or all of an OR algorithm. The blocks for which such a solution is not effective can be parallelized and executed on a standard computing platform. Thus, the proposed runtime corresponds to a standard computing platform integrating an FPGA. To use this platform we require a set of tools. These tools should provide features such as (a) analysis and pre-compilation of an input OR problem or sub-problem, (b) HW / SW partitioning and dedicated logic optimization, and finally (c) generation of a software executable and a bitstream.
The first step will be to find OR algorithms that are well suited for hardware acceleration. We will then propose HW / SW partitioning methodologies for different classes of algorithms.
The results will be implemented to lead to a compilation prototype starting from an OR instance and generating a software executable and a bitstream. These results will be implemented and executed on a computing platform integrating an FPGA to evaluate the performance gain and the impact on the energy consumption of the proposed solution.
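To make the HW / SW partitioning step concrete, the following Python sketch shows one very simple heuristic: a greedy selection that maps to the FPGA the blocks with the best time gain per unit of logic area, within an area budget. This is only an assumption about what a partitioner could look like, not the methodology to be developed; the block names, timings, and area figures are invented for illustration.

```python
def partition(blocks, area_budget):
    """Greedy HW/SW split.

    blocks: list of (name, sw_time, hw_time, area) tuples, area > 0.
    Returns (hw_names, sw_names).
    """
    # Only blocks that actually run faster in hardware are candidates.
    candidates = [b for b in blocks if b[2] < b[1]]
    # Rank by time saved per unit of FPGA area, best first.
    candidates.sort(key=lambda b: (b[1] - b[2]) / b[3], reverse=True)

    hw_names, used_area = [], 0
    for name, _sw_time, _hw_time, area in candidates:
        if used_area + area <= area_budget:
            hw_names.append(name)
            used_area += area

    # Everything not mapped to hardware stays on the standard platform.
    sw_names = [b[0] for b in blocks if b[0] not in hw_names]
    return hw_names, sw_names


# Hypothetical profile of three blocks of an OR solver (times in ms, area in %).
blocks = [
    ("pricing", 10.0, 2.0, 40),
    ("pivot", 8.0, 1.0, 70),
    ("io", 1.0, 3.0, 10),
]
hw, sw = partition(blocks, area_budget=100)
print(hw, sw)
```

A greedy ratio rule is not optimal in general (the underlying problem is a knapsack variant), which is one reason a dedicated methodology per class of algorithms is of interest.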

3D occupancy grid analysis with a deep learning approach

The context of this subject is the development of autonomous vehicles / drones / robots.
The vehicle environment is represented by a 3D occupancy grid, in which each cell contains the probability of presence of an object. This grid is refreshed over time, thanks to sensor data (Lidar, Radar, Camera).
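As a sketch of how such a grid can be refreshed, the Python snippet below fuses successive sensor readings into a cell using the classical log-odds form of Bayes' rule. It assumes a uniform 0.5 prior, independent readings, and a hypothetical inverse sensor model value `p_hit`; the sparse dictionary layout and function names are illustrative, not the representation used in the project.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def update_cell(prior_p, measurement_p):
    """Fuse one reading into a cell (uniform 0.5 prior assumed in the sensor model)."""
    log_odds = logit(prior_p) + logit(measurement_p)
    return 1.0 / (1.0 + math.exp(-log_odds))


def update_grid(grid, hit_cells, p_hit=0.7):
    """Update a sparse 3D grid stored as {(x, y, z): probability}."""
    for cell in hit_cells:
        # Cells never observed before start at the uninformed value 0.5.
        grid[cell] = update_cell(grid.get(cell, 0.5), p_hit)
    return grid


grid = {}
update_grid(grid, [(1, 2, 0)])             # first sensor sweep
update_grid(grid, [(1, 2, 0), (1, 3, 0)])  # second sweep sees one more cell
print(grid)
```

A cell observed twice ends up with a higher occupancy probability than a cell observed once, which is the accumulation behavior expected from a grid refreshed over time.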
Higher-level algorithms, like path planning or collision avoidance, reason in terms of objects described by their path, speed, and nature. It is thus mandatory to extract these objects from individual grid cells, through clustering, classification, and tracking.
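For the clustering step, a classical baseline is connected-component labelling of the occupied cells. The Python sketch below groups cells that touch through a face (6-connectivity) after thresholding the occupancy probability. It is given only as a reference point, not as the deep learning approach targeted here; the threshold and the sparse grid layout are assumptions.

```python
from collections import deque

# Face-adjacent neighbours in 3D (6-connectivity).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cluster_occupied(grid, threshold=0.6):
    """Group occupied cells of a sparse grid {(x, y, z): probability} into clusters."""
    occupied = {cell for cell, p in grid.items() if p >= threshold}
    clusters = []
    while occupied:
        seed = occupied.pop()
        cluster = {seed}
        queue = deque([seed])
        # Breadth-first flood fill from the seed cell.
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in occupied:
                    occupied.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        clusters.append(cluster)
    return clusters


# Hypothetical grid: two adjacent occupied cells, one isolated cell, one free cell.
example_grid = {(0, 0, 0): 0.9, (1, 0, 0): 0.8, (5, 5, 5): 0.95, (2, 0, 0): 0.2}
clusters = cluster_occupied(example_grid)
print(sorted(len(c) for c in clusters))
```

Each resulting cluster is a candidate object whose centroid and extent could then be passed on to classification and tracking.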
Many previous publications on this topic come from the context of vision processing, many of them using deep learning. They show high computational complexity, and do not benefit from the specific characteristics of occupancy grids (lack of textures, a priori knowledge of areas of interest…). We want to explore new techniques, tailored to occupancy grids, and more compatible with embedded and low-cost implementation.

The objective of the subject is to determine, from a series of 3D occupancy grids, the number and the nature of the different objects, their position and velocity vector, exploiting the recent advances of deep learning on unstructured 3D data.

Automatic generation of dynamic code generators from legacy code

Our laboratory is developing a technology for dynamic code generation around a tool called deGoal. deGoal is a tool designed to build specialized code generators (also known as compilettes) customized for each computing kernel we want to accelerate in an application. Such compilettes are designed with the aim to perform data- and architecture-dependent code optimizations and code generation at runtime. Furthermore, compilettes provide very fast code generation and low memory footprint. This approach is fundamentally different from the standard approach for dynamic compilation as used for example in Java Virtual Machines.
In order to target computing architectures that include domain-specific accelerators and to raise the level of abstraction of the source code of compilettes, deGoal uses a dedicated language. This language provides the best performance we can achieve from our technology, and has demonstrated its ability to achieve good performance improvements compared to highly optimised static code. However, the drawback is that one needs to rewrite the source code of a computing kernel from scratch in order to build a new compilette.
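The principle of data-dependent code generation at runtime can be illustrated with a small Python analogy. The sketch below is not deGoal and does not use its dedicated language: it only shows a hypothetical generator, `make_dot_product`, that emits a kernel specialized for a weight vector known at runtime, dropping the terms whose weight is zero.

```python
def make_dot_product(weights):
    """Generate, at run time, a dot product specialised for a fixed weight vector.

    Returns the generated function and its source text.
    """
    # Data-dependent optimisation: terms with a zero weight are never emitted.
    terms = [f"{w!r} * x[{i}]" for i, w in enumerate(weights) if w != 0]
    body = " + ".join(terms) if terms else "0"
    source = f"def dot(x):\n    return {body}\n"

    namespace = {}
    exec(source, namespace)  # compile the specialised kernel
    return namespace["dot"], source


# Weights are only known once the application is running.
dot, source = make_dot_product([2, 0, 0, 3])
print(source)
print(dot([1, 5, 7, 2]))
```

The generated kernel contains no loop and no multiplication by zero, which mirrors, at a very small scale, the kind of specialization a compilette performs on machine code.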

The goal of this PostDoc is to implement an automatic generator of compilettes able to work from existing source code (typically: ANSI C), and able to be integrated in an industry-grade code generation toolchain.

Scalable digital architecture for Qubits control in Quantum computer

Scaling Quantum Processing Units (QPU) to hundreds of Qubits leads to profound changes in the Qubits matrix control: this control will be split between its cryogenic part and its room temperature counterpart outside the cryostat. Multiple constraints coming from the cryostat (thermal or mechanical constraints for example) or coming from Qubits properties (number of Qubits, topology, fidelity, etc…) can affect architectural choices. Examples of these choices include Qubits control (digital/analog), instruction set, measurement storage, operation parallelism, and communication between the different accelerator parts. This postdoctoral research will focus on defining a mid- (100 to 1,000 Qubits) and long-term (more than 10,000 Qubits) architecture of Qubits control at room temperature by starting from existing QPU middleware (IBM QISKIT for example) and by taking into account specific constraints of the QPU developed at CEA-Leti using solid-state Qubits.

Modelling and control of voltage and frequency in GALS architecture subjected to Process-Voltage-Temperature variability

The evolution of sub-micron technologies has brought tremendous challenges that the designer has to face, namely Process-Voltage-Temperature variability and the reduction of power consumption for mobile applications.
The work to be done here concerns the DVFS (Dynamic Voltage and Frequency Scaling) policies for GALS (Globally Asynchronous, Locally Synchronous) architectures.
A fine-grain model of the voltage and frequency “actuators” must first be built in order to simulate the physical phenomena in a realistic way. In particular, the various parameters that may influence the system will be considered (process variation, supply voltage variation and noise, temperature variation, etc.).
Then, Non-Linear (NL) control laws that take into account the saturation of the actuators will be developed. These laws will be validated on the physical simulator and their regulation performance (i.e. the response of the closed-loop system to disturbances such as PVT variations) will be evaluated. Note that these laws will be designed in light of implementation constraints (mainly cost) in terms of complexity, area, etc.
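To illustrate why actuator saturation matters, the Python sketch below simulates a PI controller with a simple anti-windup rule (conditional integration) driving a first-order voltage actuator. It is only one elementary example of a saturation-aware law, not the control laws to be developed; the actuator model, gains, time constant, and voltage limits are hypothetical.

```python
def clamp(value, low, high):
    return max(low, min(high, value))


def simulate(reference, steps=200, dt=1e-3, tau=5e-3,
             kp=2.0, ki=400.0, u_min=0.0, u_max=1.2):
    """Closed-loop simulation; returns a list of (command, output) pairs.

    Hypothetical actuator model: first-order lag, dv/dt = (u - v) / tau.
    """
    v = 0.0         # actuator output voltage
    integral = 0.0  # integrator state of the PI controller
    trace = []
    for _ in range(steps):
        error = reference - v
        u_unsat = kp * error + ki * integral
        u = clamp(u_unsat, u_min, u_max)  # actuator saturation

        # Anti-windup: integrate only when the command is not saturated,
        # or when the error would pull the command back into range.
        pulls_back = (u_unsat > u_max and error < 0) or (u_unsat < u_min and error > 0)
        if u == u_unsat or pulls_back:
            integral += error * dt

        v += dt / tau * (u - v)  # forward-Euler step of the actuator model
        trace.append((u, v))
    return trace


trace = simulate(reference=1.0)
print(trace[-1])
```

Without the conditional integration, the integrator would keep growing while the command is clipped, and the output would take longer to settle once the actuator leaves saturation.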
In fact, the system considered here is intrinsically a Multi-Input Multi-Output (MIMO) one. Therefore, its control can be designed with NL techniques devoted to MIMO systems in order to meet the requirements and reject the disturbances.
The control of several Voltage and Frequency Islands (VFI) is usually done via a “central brain” that chooses the voltage and frequency references according to a computational workload deadline. For more advanced architectures, the capabilities of each processing element, especially its maximum frequency, can be taken into account. A disruptive approach would be to consider a more distributed control that, for instance, takes into account the particular state (e.g. temperature) of each VFI's neighbours. Control techniques that have been designed for distributed Networked Control Systems could be adapted to MPSoCs.

Design of a safe and secure hypervisor in the context of a manycore architecture

The TSUNAMY project aims to think through the design of future manycore chips with a collaborative hardware/software approach. It will investigate how crypto-processors can be incorporated into such a chip, turning it into a heterogeneous architecture, where scheduling, resource allocation, resource sharing, and resource isolation will be a concern.

The LaSTRE laboratory has designed Anaxagoros, a micro-kernel which ensures good properties in terms of safety and integration of mixed-criticality applications and is therefore well suited to the virtualization of operating systems. Making this virtualization software layer evolve in the context of the TSUNAMY project is the main goal of this post-doctoral proposal.

The first issue to address deals with the scalability of Anaxagoros on a manycore architecture. This system was designed with multicore scalability in mind: to help reach the highest level of parallelism in a lock-free fashion, innovative techniques were proposed to minimize the number of synchronization points within the system. This is the first step, but scaling to manycore architectures brings new topics, such as cache coherency or non-uniform memory access, that require a focus on data locality as well. The second challenge will be to incorporate genuine security features into Anaxagoros, e.g. regarding protection from covert channels, or confidentiality. The third and final challenge, which will be addressed through interactions with the partners of the project, is to devise techniques that could be implemented directly in hardware in order to ensure that even a breach in what is usually considered as trusted software will not allow an attacker to gain unprivileged data access or let information leak.