Validation of algorithmic CT image quality metrics with preferences of radiologists
Author(s) -
Yuan Cheng,
Ehsan Abadi,
Taylor Brunton Smith,
Francesco Ria,
Mathias Meyer,
Daniele Marin,
Ehsan Samei
Publication year - 2019
Publication title -
Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1002/mp.13795
Subject(s) - image quality, imaging phantom, computer science, artificial intelligence, medical imaging, computer vision, image noise, data mining, medicine, radiology
Purpose: Automated assessment of perceptual image quality in clinical computed tomography (CT) data by computer algorithms has the potential to greatly facilitate data-driven monitoring and optimization of CT image acquisition protocols. Applying these techniques in clinical operation requires knowing how the output of the computer algorithms corresponds to clinical expectations. This study addressed the need to validate algorithmic image quality measurements on clinical CT images against the preferences of radiologists and to determine the clinically acceptable range of algorithmic measurements for abdominal CT examinations.

Materials and methods: Algorithmic measurements of image quality metrics (organ HU, noise magnitude, and clarity) were performed on a clinical CT image dataset, with supplemental measurements of the noise power spectrum from phantom images using previously developed techniques. The algorithmic measurements were compared to clinical expectations of image quality in an observer study with seven radiologists. Sets of CT liver images were selected from the dataset such that images in the same set varied in one metric at a time. These sets of images were shown via a web interface to one observer at a time. First, the observer rank-ordered the CT images in a set according to his/her preference for the varying metric. The observer then selected his/her preferred acceptable range of the metric within the ranked images. The agreement between the algorithmic and observer rankings of image quality was investigated, and the clinically acceptable image quality in terms of algorithmic measurements was determined.

Results: The overall rank-order agreements between algorithmic and observer assessments were 0.90, 0.98, and 1.00 for noise magnitude, liver parenchyma HU, and clarity, respectively, indicating strong agreement between the algorithmic and observer assessments of image quality. Clinically acceptable thresholds (median) of algorithmic metric values were (17.8, 32.6) HU for noise magnitude, (92.1, 131.9) HU for liver parenchyma, and (0.47, 0.52) for clarity.

Conclusions: The observer study results indicated that these algorithms can robustly assess the perceptual quality of clinical CT images in an automated fashion, and clinically acceptable ranges of the algorithmic measurements were determined. The correspondence of these image quality assessment algorithms to clinical expectations paves the way toward establishing diagnostic reference levels in terms of clinically acceptable perceptual image quality and toward data-driven optimization of CT image acquisition protocols.
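The abstract reports rank-order agreement between algorithmic and observer rankings but does not name the statistic used. As an illustration only, the sketch below computes Kendall's tau (tau-a, no tie correction), one common rank-order agreement measure, on a hypothetical set of five liver images; the data and the choice of statistic are assumptions, not taken from the paper.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation (tau-a, no tie correction) between two
    rankings of the same items. Assumes each ranking is a permutation
    of 1..n with no ties."""
    n = len(rank_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant if both rankings order items i and j
        # the same way, discordant otherwise.
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical set of five liver images: algorithmic ranking by noise
# magnitude vs. one radiologist's ranking (the radiologist swapped the
# two middle images).
algorithmic = [1, 2, 3, 4, 5]
observer = [1, 3, 2, 4, 5]
print(kendall_tau(algorithmic, observer))  # → 0.8
```

A value of 1.00 (as reported for clarity) would mean every image pair is ordered identically by the algorithm and the observers, while values near 0.90 allow a small number of swapped pairs.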