As artificial intelligence (AI) workloads continue to grow in compute and memory demands, novel computing architectures are essential to deliver high throughput at low energy cost. Analog in-memory computing (AIMC) has emerged as a promising paradigm for efficient data processing at the edge, where constraints on area, cost, and energy are critical.

In this talk, I will introduce the principles and technologies behind AIMC, highlighting memory options ranging from volatile to non-volatile devices. I will present recent integrated test chips developed by my group, which leverage high-density phase-change memory (PCM) and employ careful design–technology co-optimization (DTCO) to achieve both energy efficiency and the inference accuracy required by convolutional neural networks (CNNs). Finally, I will extend the discussion to transformer-based large language models (LLMs), which rely on matrix–matrix multiplications whose operands change with every input, in contrast with the static weights of CNNs. I will outline architectural opportunities for accelerating LLMs through dynamic random-access memory (DRAM) and advanced 3D transistor arrays.
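As a rough, self-contained illustration of the AIMC principle (not code from the talk), the Python sketch below models a crossbar matrix–vector multiplication: weights are encoded as device conductances, activations are applied as read voltages, and Ohm's and Kirchhoff's laws yield the column currents in a single step. The array size, voltage scale, and 5% conductance-noise model are illustrative assumptions, hinting at the PCM accuracy challenge that DTCO must address.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumption: an 8x4 weight matrix mapped onto a crossbar.
W = rng.standard_normal((8, 4))     # trained weights (unitless)
g_max = 1e-5                        # assumed maximum device conductance (S)
G = W / np.abs(W).max() * g_max     # weights encoded as conductances

x = rng.standard_normal(8)          # input activations
v_read = 0.2                        # assumed read-voltage scale (V)
V = x * v_read                      # activations applied as voltages

# Ohm's law per device, Kirchhoff's current law per column:
# each output current sums V_i * G_ij down a bitline, so the whole
# matrix-vector product is computed in one analog step.
I_ideal = V @ G

# PCM devices are noisy; add an illustrative 5% conductance variation
# to mimic the inference-accuracy challenge mentioned above.
G_noisy = G * (1 + 0.05 * rng.standard_normal(G.shape))
I_noisy = V @ G_noisy

print("ideal column currents:", I_ideal)
print("noisy column currents:", I_noisy)
```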

September 15 @ 11:30 – 12:00 (30 min)

Prof. Daniele Ielmini (Politecnico di Milano – IT)