Open Access
Efficient DNN Execution on Intermittently-Powered IoT Devices With Depth-First Inference
Author(s) -
Mingsong Lv,
Enyu Xu
Publication year - 2022
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2022.3203719
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Program execution on intermittently powered Internet-of-Things (IoT) devices must ensure forward progress in the presence of frequent power failures. A general solution is intermittent computing, in which program state is frequently checkpointed to non-volatile memory (NVM) so that, after a power failure, the program can restart from the latest checkpoint once the system regains energy. However, executing a deep neural network (DNN) inference program intermittently poses a significant problem: during execution, an inference program generates large feature maps (as part of the program state), and checkpointing these feature maps to NVM incurs substantial time and energy overhead, reducing inference efficiency. This paper proposes an approach to reduce the amount of feature-map writing in intermittent DNN inference. The main idea is to partition the inference task into several slices and execute each slice in a depth-first manner, so that the intermediate feature maps produced during the inference of each slice do not need to be written to NVM. Extensive experiments show that the proposed approach significantly reduces the amount of NVM writing and achieves a speedup of up to 1.965x in total inference time compared to the state-of-the-art approach.
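To make the depth-first idea concrete, the following is a minimal Python/NumPy sketch, not the authors' implementation. The names run_slice_depth_first, nvm_checkpoint, and tile_size are hypothetical, and the layers are simple row-independent matmul-plus-ReLU operations so that tiling along the first axis is exact; real slices contain convolutions whose overlapping receptive fields require halo (overlap) handling that is omitted here.

import numpy as np

def layer(x, w, b):
    """One layer: channel mixing via matmul, then ReLU. Rows are independent,
    so processing the input tile-by-tile along axis 0 is exact."""
    return np.maximum(x @ w + b, 0.0)

def run_slice_depth_first(x, params, tile_size, nvm_checkpoint):
    """Run all layers of one slice on one input tile at a time (depth first).

    Intermediate activations of each tile live only in volatile RAM and are
    never written to NVM; only the slice's output tiles are persisted via
    `nvm_checkpoint` (a hypothetical stand-in for an NVM write)."""
    out_tiles = []
    for start in range(0, x.shape[0], tile_size):
        tile = x[start:start + tile_size]
        for w, b in params:              # depth first: all layers, one tile
            tile = layer(tile, w, b)
        nvm_checkpoint(start, tile)      # persist only the slice output
        out_tiles.append(tile)
    return np.concatenate(out_tiles)

# Usage sketch: three layers, 64 input rows, tiles of 16 rows.
rng = np.random.default_rng(0)
params = [(rng.standard_normal((8, 8)), np.zeros(8)) for _ in range(3)]
x = rng.standard_normal((64, 8))

nvm = {}  # stand-in for non-volatile storage
y = run_slice_depth_first(x, params, tile_size=16,
                          nvm_checkpoint=lambda i, t: nvm.update({i: t.copy()}))

# Depth-first tiled execution matches conventional layer-by-layer execution.
ref = x
for w, b in params:
    ref = layer(ref, w, b)
assert np.allclose(y, ref)

In this sketch only one NVM write is issued per tile, at the slice boundary; under the paper's model, a power failure therefore costs at most the recomputation of the tile in flight, while the large intermediate feature maps never reach NVM at all.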
