Open Access

3D Stacked HBM and Compute Accelerators for LLM: Optimizing Thermal Management and Power Delivery Efficiency

Author(s) - Janak Sharda, Madison Manley, Jungyoun Kwak, Chinsung Park, Muhannad Bakir, Shimeng Yu

Publication year - 2025

Publication title - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 0.545

H-Index - 16

eISSN - 2329-9231

DOI - 10.1109/jxcdc.2025.3617298

Subject(s) - Components, Circuits, Devices and Systems; Computing and Processing

Advanced packaging is becoming essential for designing hardware accelerators for large language models (LLMs). Architectures such as 2.5D integration of memory with logic have been proposed; however, the bandwidth limits the throughput of the complete system. Recent works have proposed memory-on-logic systems, in which high-bandwidth memory (HBM) is 3D stacked on top of logic to improve throughput by 64× and energy efficiency by 3×. However, the high power consumption of logic dies and the high thermal resistance of HBM can create thermal and power delivery challenges in such heterogeneously integrated stacks. In this work, we explore design configurations such as logic-on-memory and memory-on-logic, as well as some hybrid configurations. Further, we accurately model the DRAM dies and propose mitigation strategies that further improve the throughput of the memory-on-logic system by 16%, reduce the IR drop of the logic-on-memory system by 640 mV, and achieve 4× higher throughput for a hybrid system compared to the 2.5D integrated system.
