Open Access
Toward Large-Scale Image Segmentation on Summit
Author(s) -
Sudip K. Seal,
Seung-Hwan Lim,
Dali Wang,
Jacob Hinkle,
Dalton Lunga,
Aristeidis Tsaris
Publication year - 2020
Publication title -
OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/3404397.3404468
Subject(s) - computer science , speedup , supercomputer , scalability , deep learning , parallel computing , benchmark (surveying) , artificial intelligence , artificial neural network , task (project management) , segmentation , computer engineering , management , geodesy , database , economics , geography
Semantic segmentation of images is an important computer vision task that arises in a variety of application domains, such as medical imaging, robotic vision, and autonomous vehicles. While these domain-specific image analysis tasks involve relatively small image sizes (∼ 10² × 10²), many applications need to train machine learning models on image data with extents that are orders of magnitude larger (∼ 10⁴ × 10⁴). Training deep neural network (DNN) models on large-extent images is extremely memory-intensive and often exceeds the memory limits of a single graphics processing unit (GPU), the hardware accelerator of choice for computer vision workloads. Here, an efficient, sample-parallel approach to training U-Net models on large-extent image data sets is presented. Its advantages and limitations are analyzed, and near-linear strong-scaling speedup is demonstrated on 256 nodes (1536 GPUs) of the Summit supercomputer. On a single node of Summit, an early evaluation of a recently released model-parallel framework called GPipe is shown to deliver ∼ 2X speedup in executing a U-Net model with an order of magnitude more trainable parameters than previously reported. Performance bottlenecks for pipelined training of U-Net models are identified, and mitigation strategies to improve the speedups are discussed. Together, these results open up the possibility of combining both approaches into a unified, scalable, pipelined and data-parallel algorithm to efficiently train U-Net models with very large receptive fields on data sets of ultra-large-extent images.
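The memory pressure described in the abstract can be illustrated with a rough back-of-the-envelope sketch. The values below are assumptions for illustration and are not taken from the paper: a single float32 feature map with 64 channels (the first-layer width of the original U-Net design) and a GPU with 16 GiB of memory. A real training run stores many such activations plus weights and gradients, so this estimate is a lower bound.

```python
# Rough, illustrative estimate of activation memory for one feature map.
# Assumptions (not from the paper): float32 values, 64 channels, 16 GiB per GPU.

GIB = 1024 ** 3
GPU_MEMORY_GIB = 16      # assumed per-GPU memory
CHANNELS = 64            # assumed first-layer width of a U-Net
BYTES_PER_FLOAT32 = 4


def feature_map_bytes(height, width, channels=CHANNELS,
                      bytes_per_value=BYTES_PER_FLOAT32):
    """Bytes needed to hold one (channels, height, width) feature map."""
    return height * width * channels * bytes_per_value


def fits_on_gpu(nbytes, gpu_gib=GPU_MEMORY_GIB):
    """True if a buffer of nbytes fits in the assumed GPU memory."""
    return nbytes <= gpu_gib * GIB


small = feature_map_bytes(10**2, 10**2)   # typical small image extent
large = feature_map_bytes(10**4, 10**4)   # large-extent image

# GPUs per node implied by the abstract's "256 nodes (1536 GPUs)".
gpus_per_node = 1536 // 256

print(f"small image feature map: {small / GIB:.4f} GiB")
print(f"large image feature map: {large / GIB:.2f} GiB")
print(f"large fits on one GPU:   {fits_on_gpu(large)}")
print(f"GPUs per node:           {gpus_per_node}")
```

Under these assumptions, a single 64-channel feature map for a 10⁴ × 10⁴ image already exceeds the assumed memory of one GPU, which is the kind of limit that motivates partitioning the work across GPUs.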
