Discovering anomalous patterns in large digital pathology images
Author(s) - Somanchi Sriram, Neill Daniel B., Parwani Anil V.
Publication year - 2018
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7828
Subject(s) - digital pathology, computer science, virtual microscopy, pixel, artificial intelligence, pattern recognition (psychology), digital image, precision and recall, scale (ratio), computer vision, image processing, image (mathematics), pathology, cartography, medicine, geography
Advances in medical imaging technology have created opportunities for computer-aided diagnostic tools to assist human practitioners in identifying relevant patterns in massive, multiscale digital pathology slides. This work presents Hierarchical Linear Time Subset Scanning, a novel statistical method for pattern detection. Hierarchical Linear Time Subset Scanning exploits the hierarchical structure inherent in data produced through virtual microscopy to identify regions of interest accurately and quickly for pathologists to review. We take a digital image at various resolution levels, identify the most anomalous regions at a coarse level, and continue to analyze the data at increasingly fine resolutions until we accurately identify the most anomalous subregions. We demonstrate the performance of our method in identifying cancerous locations on digital slides of prostate biopsy samples and show that it detects regions of cancer in minutes with high accuracy, as measured both by the ROC curve (the ability to distinguish between benign and cancerous slides) and by the spatial precision-recall curve (the ability to pick out the malignant areas on a slide that contains cancer). Existing methods require small-scale images (small areas of a slide preselected by the pathologist for analysis, e.g., 32 × 32 pixels) and may not work effectively on large, raw digitized images of 100K × 100K pixels. In this work, we provide a methodology to fill this significant gap by analyzing large digitized images and identifying regions of interest that may be indicative of cancer.
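The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Hierarchical Linear Time Subset Scanning algorithm: the abstract does not give its scoring function or subset search, so the sketch assumes a simple stand-in in which each cell's score is the mean of a per-pixel anomaly value and the top few cells are kept at every level. The function names (`downsample`, `build_pyramid`, `coarse_to_fine`) and the `levels` and `keep` parameters are hypothetical.

```python
def downsample(grid):
    """Halve the resolution by averaging each 2x2 block (assumes even side lengths)."""
    h, w = len(grid), len(grid[0])
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w // 2)
        ]
        for r in range(h // 2)
    ]


def build_pyramid(finest, levels):
    """Return the resolution levels ordered from coarsest to finest."""
    pyramid = [finest]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


def coarse_to_fine(finest, levels=3, keep=2):
    """Keep the `keep` highest-scoring cells per level, refining only inside them.

    Assumption: mean pooling stands in for the paper's anomaly score.
    Returns (row, col) coordinates at the finest level.
    """
    pyramid = build_pyramid(finest, levels)
    coarsest = pyramid[0]
    candidates = [(r, c) for r in range(len(coarsest)) for c in range(len(coarsest[0]))]
    for depth, grid in enumerate(pyramid):
        ranked = sorted(candidates, key=lambda rc: grid[rc[0]][rc[1]], reverse=True)
        kept = ranked[:keep]
        if depth == len(pyramid) - 1:
            return kept
        # Each kept cell maps to a 2x2 block of children at the next finer level.
        candidates = [
            (2 * r + dr, 2 * c + dc) for r, c in kept for dr in (0, 1) for dc in (0, 1)
        ]


# Toy 8x8 "slide" of per-pixel anomaly values with two adjacent hot pixels.
slide = [[0.0] * 8 for _ in range(8)]
slide[5][6] = 1.0
slide[5][7] = 0.9
hits = coarse_to_fine(slide, levels=3, keep=2)
print(hits)  # [(5, 6), (5, 7)]
```

Only the children of retained cells are scored at each finer level, so the number of cells examined grows with the number of levels and the `keep` budget instead of with the full pixel count. That is the property that makes this kind of search plausible for very large images, though a fixed top-k rule like this one can miss a region whose signal is diluted at coarse resolution.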