Corsair: An In-memory Computing Chiplet Architecture for Inference-time Compute Acceleration
Author(s) -
Satyam Srivastava,
Akhil Arunkumar,
Nithesh Kurella,
Amrit Panda,
Gaurav Jain,
Purushotham Kamath,
Mark Wutzke,
Arun Tiruvur,
Mike Gupta,
Ilya Soloveychik,
Vamsi Darsi,
Malav Dalal,
Vinayak Patankar,
Sasidhar Dudyala,
Senthil Duraisamy,
Santhosh Ramchandran,
Raghav Venkatasubramanian,
Yuwei Qin,
Xin Wang,
Jayaprakash Balachandran,
Ali Gok,
Piotr Wojciechowski,
Saliya Ekanayake,
Chris Ng,
Ranju Sarma,
Shubhankit Rathore,
Tristan Trouwen,
Siwei Zhuang,
Chris Nicol,
Sudeep Bhoja
Publication year - 2025
Publication title - IEEE Micro
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.649
H-Index - 94
eISSN - 1937-4143
pISSN - 0272-1732
DOI - 10.1109/mm.2025.3593444
Subject(s) - computing and processing
Advances in Generative AI (GenAI) have reinvigorated research into novel computing architectures. The Transformer, characterized by low arithmetic intensity during most of inference, has become the cornerstone of GenAI, underlying Large Language Models (LLMs) and Reasoning Models (RMs). Numerous solutions to the resulting memory bandwidth bottleneck have been proposed. Corsair is an architecture that targets this need using a chiplet design, a digital in-memory computing-based matrix engine, efficient die-to-die interconnects, block floating point numerics, and large high-bandwidth on-chip memories. We describe the Corsair chiplet and the scaling approaches used to compose larger systems, and outline the software stack. We formulate the inference-time requirements of LLMs and RMs in terms of computation, memory bandwidth, memory capacity, and interconnect efficiency for scaling. We also show how the Corsair design fits these workloads. We present benchmark results from Corsair silicon that correlate strongly with the design, and preview estimates of the workload-level improvements expected with Corsair.
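To illustrate what "low arithmetic intensity" means here, the following is a minimal sketch that counts FLOPs per byte moved for a single dense layer. It is not taken from the paper: the function name, the layer size of 4096, the 2-byte element width, and the batch sizes are all assumptions chosen for illustration, and the model ignores caching, attention, and KV-cache traffic.

```python
def arithmetic_intensity(batch, d_in, d_out, bytes_per_elem=2):
    """FLOPs per byte moved for Y = X @ W.

    X has shape (batch, d_in), W has shape (d_in, d_out).
    Assumes every operand is read or written exactly once (hypothetical,
    idealized memory model) and that all elements use bytes_per_elem bytes.
    """
    # One multiply and one add per (batch, d_in, d_out) triple.
    flops = 2 * batch * d_in * d_out
    # Weights, input activations, and output activations.
    elems_moved = d_in * d_out + batch * d_in + batch * d_out
    return flops / (bytes_per_elem * elems_moved)


# Token-by-token generation: one new token per step, so batch = 1.
decode = arithmetic_intensity(batch=1, d_in=4096, d_out=4096)

# Prompt processing: many tokens share one pass over the weights.
prefill = arithmetic_intensity(batch=2048, d_in=4096, d_out=4096)

print(f"decode : {decode:.3f} FLOPs/byte")
print(f"prefill: {prefill:.1f} FLOPs/byte")
```

Under these assumptions the single-token case performs roughly one FLOP per byte of weight traffic, while the batched case reuses each weight across many tokens. This is why generation tends to be limited by memory bandwidth rather than by compute.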
