Date: 27 November 2025, 9:00 to 16:00
Location: IRT St Exupéry, Toulouse
Organizers: Karol DESNOS, Eric JENN, and Arthur PERAIS
Topics: Artificial Intelligence and Embedded Systems
Neural networks are at the heart of the artificial intelligence revolution, but their deployment remains constrained by power, speed, and energy-efficiency challenges. How can performance, power consumption, and cost be reconciled? Which technological advances make it possible to integrate AI at the edge while reducing its energy footprint?
This thematic day will offer elements of an answer through a series of presentations on emerging technologies designed to accelerate and optimize these models: neuromorphic architectures, in-memory computing, photonic processors, as well as techniques for accelerating and optimizing the deployment of neural networks on FPGAs and low-power microcontrollers.
https://evenium.events/0tchcwxx
Artificial intelligence (AI) offers transformative potential for edge applications, yet its high energy demands hinder deployment in domains such as medical implants and brain-machine interfaces. Conventional computing architectures are especially inefficient for AI workloads due to the energy cost of transferring data between memory and processing units.
This talk explores how in-memory and near-memory computing can address these limitations by tightly integrating logic and memory to drastically reduce energy consumption. We highlight recent progress in non-volatile memory technologies, including memristors, magnetic memory, and phase-change memory, that now enable fully functional in-memory computing systems. Both digital and analog in-memory computing paradigms are considered: the former supports robust low-power neural networks, while the latter performs computation inherently through fundamental physical laws. Using fabricated hybrid CMOS-memristor circuits, we compare the strengths and trade-offs of these two approaches.
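For readers new to the analog paradigm, here is a minimal sketch, assuming an idealized memristor crossbar (all names and values are illustrative, not taken from the speakers' fabricated systems): Ohm's law makes each cell contribute a current equal to its conductance times the applied voltage, and Kirchhoff's current law sums those currents along each column, so the array computes a matrix-vector product in a single physical step.

import numpy as np

rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))  # cell conductances in siemens (the "weights")
v = rng.uniform(0.0, 0.2, size=4)         # read voltages on the rows (the "inputs")

i_ideal = G.T @ v                         # Kirchhoff sums I = G*V down each column

# Device-to-device variability, the non-ideality addressed later in the talk
G_real = G * rng.normal(1.0, 0.05, size=G.shape)
print(i_ideal)
print(G_real.T @ v)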
New memory technologies often introduce variability. However, we show how Bayesian methods can mitigate—and sometimes even leverage—these imperfections to enable trustworthy and uncertainty-aware AI. These methods are demonstrated experimentally on arrhythmia detection with uncertainty evaluation and continual learning for cancer recognition, using real memristors embedded in prototype systems. We also discuss recent advances in local learning rules, such as forward-only approaches, which promise fully on-chip learning within in-memory architectures.
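As a hypothetical illustration of the Bayesian idea (not the speakers' actual method), the sketch below treats each noisy memory read as a sample from a weight distribution: averaging predictions over many reads yields a robust prediction, and their spread yields an uncertainty estimate. The noise model and sizes are assumptions.

import numpy as np

rng = np.random.default_rng(1)
w_mean = rng.normal(0.0, 1.0, size=(8, 2))    # programmed weight targets
w_std = 0.1 * np.abs(w_mean) + 0.01           # assumed read-to-read variability

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = rng.normal(size=8)                        # one input feature vector

# Each pass samples one plausible network from the device noise
probs = np.array([softmax(x @ (w_mean + w_std * rng.normal(size=w_mean.shape)))
                  for _ in range(100)])

print("prediction:", probs.mean(axis=0))      # Bayesian model average
print("uncertainty:", probs.std(axis=0))      # high spread flags unreliable inputs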
The talk concludes with an outlook on the architectural and algorithmic challenges ahead, and the opportunities for building a new class of energy-efficient, reliable AI systems at the extreme edge.
This work was supported by France 2030 government grants (ANR-22-PEEL-0010, ANR-22-PEEL-0013, ANR-23-PEIA-0002).
This presentation will give an overview of the architectural techniques used to design accelerators for inference in machine learning (ML) systems. A focus will be placed on accelerating matrix products with systolic arrays, a basic building block for ML. Several ML accelerator architectures (GPU, TPU, NPU) will then be presented.
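To make the systolic pattern concrete, here is a minimal, purely illustrative simulation of an output-stationary array (a teaching sketch, not any specific GPU/TPU/NPU design): each processing element (PE) keeps one accumulator, and with inputs skewed by their row and column indices, PE (i, j) sees A[i, k] and B[k, j] together at cycle t = i + j + k.

import numpy as np

def systolic_matmul(A, B):
    M, K = A.shape
    _, N = B.shape
    C = np.zeros((M, N))               # one accumulator per PE
    for t in range(M + N + K - 2):     # cycles until the last operands meet
        for i in range(M):
            for j in range(N):
                k = t - i - j          # operand index reaching PE (i, j) this cycle
                if 0 <= k < K:
                    C[i, j] += A[i, k] * B[k, j]
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)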
In highly constrained embedded systems where energy availability is limited, Field Programmable Gate Arrays (FPGAs) are often favored over the more power-hungry Graphics Processing Units (GPUs). However, the range of available solutions remains limited, with most relying on High-Level Synthesis (HLS)-to-bitstream approaches such as HLS4ML, or on vendor-specific IP offerings like Vitis AI from Xilinx or VectorBlox from Microchip. In this presentation, we will introduce our generic hardware accelerator IP named ENKI and the accompanying optimization and quantization tools. We will highlight the strengths of our accelerator and share a series of experiments showcasing its on-board performance across a range of FPGA-based embedded platforms.
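As a generic illustration of the quantization step such toolchains perform (not ENKI's actual algorithm), the following sketch applies symmetric post-training int8 quantization to a weight tensor, the kind of transformation done before mapping a network onto a fixed-point FPGA datapath.

import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0           # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127)
    return q.astype(np.int8), scale

w = np.random.default_rng(2).normal(0, 0.5, size=(16, 16)).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - q.astype(np.float32) * scale).max())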
Deep learning (DL) models are being deployed to solve various computer vision and natural language processing tasks at the edge. Integrating NNs into edge devices for IoT systems enables more efficient and responsive solutions, ushering in a new age of self-sustaining Edge AI. However, deploying NNs on resource-constrained edge devices presents a myriad of challenges:
1) The inherent complexity of neural network architectures, which requires significant computational and memory capabilities.
2) The limited power budget of IoT devices makes the NN inference prone to rapid energy depletion, drastically reducing system utility.
3) The hurdle of ensuring harmony between NN and HW designs as they evolve at different rates.
4) The lack of adaptability to the dynamic runtime environment and the intricacies of input data.
Hardware-aware Neural Architecture Search (HW-NAS) has recently gained steam by automating the design of efficient DL models for a variety of target hardware platforms.
However, HW-NAS requires excessive computational resources: thousands of GPU days are needed to evaluate and explore an architecture search space. In this talk, I will present state-of-the-art approaches for HW-NAS based on three components: i) surrogate models that quickly predict architecture accuracy and hardware performance, speeding up HW-NAS; ii) an efficient multi-objective search algorithm that explores only the promising hardware and software regions of the search space; and iii) new model compression techniques, such as computation reuse and dynamic NAS, that can be combined with HW-NAS to reduce processing and memory complexity.
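A minimal sketch of such a surrogate-driven, multi-objective loop follows; the two predictors are toy stand-ins for trained surrogates and a real hardware model, and the architecture encoding (depth, width) is an assumption for illustration only.

import numpy as np

rng = np.random.default_rng(3)

def predict_accuracy(arch):     # surrogate: stand-in for a trained accuracy predictor
    depth, width = arch
    return 1.0 - 1.0 / (depth * width)

def predict_latency(arch):      # surrogate: stand-in for a hardware latency model
    depth, width = arch
    return depth * width ** 2 * 1e-3

candidates = [(int(rng.integers(2, 16)), int(rng.integers(8, 256)))
              for _ in range(200)]
scored = [(predict_accuracy(a), predict_latency(a), a) for a in candidates]

# Keep the Pareto front: no other candidate is both more accurate and faster.
pareto = [s for s in scored
          if not any(o[0] > s[0] and o[1] < s[1] for o in scored)]
print(len(pareto), "Pareto-optimal architectures out of", len(candidates))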
The neuromorphic approach to signal and image sensing has recently gained substantial attention, as it offers increasingly mature solutions to the energy issues encountered when deep network topologies are implemented with conventional Artificial Intelligence technology. Brain-inspired spike encoding of information in neural networks enables both sparse activations and event-based computation, thus reducing the resulting computational complexity.
In this presentation, we explore how to combine these traditional features with multi-level, or graded, spiking neurons in order to achieve both low quantization error and minimal inference latency on neuromorphic hardware. Increasing the information capacity of single spikes requires both adapting the integration and generation phases of the neural model and optimizing the hardware to take advantage of the compression achieved during training when making inferences on the continuous stream of spikes.
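As a hypothetical sketch of the graded-spike idea (parameters are illustrative, not those of the SPLEAT design), the neuron below emits a small integer encoding how many thresholds its membrane potential crossed, instead of a binary event, packing more information into each spike.

import numpy as np

def graded_lif(inputs, threshold=1.0, levels=4, leak=0.9):
    v = 0.0
    spikes = []
    for x in inputs:
        v = leak * v + x                       # leaky integration
        level = min(int(v // threshold), levels - 1) if v >= threshold else 0
        if level > 0:
            v -= level * threshold             # reset by the emitted amount
        spikes.append(level)                   # 0 = silent, 1..levels-1 = graded spike
    return spikes

rng = np.random.default_rng(4)
print(graded_lif(rng.uniform(0.0, 1.5, size=10)))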
The experimental results obtained with popular datasets and network architectures, such as VGG16 or ResNet18, show that multi-level spiking neurons provide better information compression and thus allow a reduction in latency without degrading performance, approaching full-precision Artificial Neural Networks (ANNs) while significantly reducing energy consumption on the SPLEAT neuromorphic accelerator developed in collaboration between LEAT and the startup AICO.
Developing energy-efficient AI systems requires the integration of on-chip learning within physical neural networks. Stochastic spintronic neurons, particularly superparamagnetic tunnel junctions (SMTJs), provide a promising hardware platform due to their thermally induced stochastic magnetization fluctuations, which inherently emulate binary stochastic neurons at the nanoscale. In this work, we present the first experimental demonstration of a binary classification task performed using a network of SMTJs. Our approach leverages the intrinsic stochasticity of SMTJs to process binary inputs and produce classification outcomes. The network architecture is designed to exploit the probabilistic switching behavior of SMTJs. To facilitate network learning, we implement a local rule inspired by Equilibrium Propagation, allowing us to avoid the energy-intensive data shuffling typically required in neural network training. By interfacing SMTJ arrays with electronic control circuits, we demonstrate the network’s ability to learn and classify binary patterns in real time. Our experimental results showcase the robust performance of SMTJ-based networks in performing binary classification tasks, highlighting the potential of spintronic neurons for implementing on-chip learning in hardware-based neural networks. This advancement paves the way for scalable, low-power neuromorphic and Ising-based systems capable of real-time processing and training.
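As a hypothetical behavioral model (not the experimental device physics above), an SMTJ acting as a binary stochastic neuron can be sketched as a sigmoid-biased coin flip: the applied input sets the probability of reading the junction in its "up" state. The constant beta is an assumption.

import numpy as np

rng = np.random.default_rng(5)

def smtj_sample(bias, beta=2.0, n=1000):
    p_up = 1.0 / (1.0 + np.exp(-beta * bias))   # sigmoid activation from thermal physics
    return (rng.random(n) < p_up).astype(int)   # stochastic binary states

for bias in (-1.0, 0.0, 1.0):
    s = smtj_sample(bias)
    print(f"bias={bias:+.1f}  mean state={s.mean():.2f}")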
While the large "frontier" AI models drive the evolution of datacenter architectures, photonic technologies are claiming an ever-larger share of the interconnects between compute nodes, notably with the emergence of co-packaged optics. This trend brings photonics closer to the compute core, with promising prospects for advanced photonic integration in the continuity of 3D integration. Pushing the concept even further, upstream research is now investigating the potential of these photonic technologies to perform the computation itself, by exploiting the coherent interference that can be created in an optical subsystem. This presentation will cover these different research directions, their opportunities, and their major challenges for both cloud AI and embedded AI.
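To illustrate the coherent-interference principle, here is a minimal sketch, under idealized lossless assumptions, of a single 2x2 Mach-Zehnder interferometer: it applies a unitary matrix to two optical field amplitudes, and meshes of such devices compose larger linear transforms. The phase conventions are illustrative.

import numpy as np

def mzi(theta, phi):
    # Unitary of one MZI: an input phase shifter and two 50/50 couplers
    # around an internal phase shift.
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50/50 beam splitter
    inner = np.diag([np.exp(1j * theta), 1.0])       # internal phase shift
    outer = np.diag([np.exp(1j * phi), 1.0])         # input phase shift
    return bs @ inner @ bs @ outer

U = mzi(theta=0.7, phi=0.3)
assert np.allclose(U.conj().T @ U, np.eye(2))        # lossless => unitary
x = np.array([1.0, 0.5j])                            # input field amplitudes
print(U @ x)                                         # interfered outputs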
The proceedings of the thematic days, including for some of them the speakers' presentation slides, will only be available to members of the GDR SoC2. Information on the registration procedure for the GDR can be found on the Rejoindre le GDR page.