Dilated conditional GAN for bone suppression in chest radiographs with enforced semantic features
Author(s) - Zhou Zhizhen, Zhou Luping, Shen Kaikai
Publication year - 2020
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1002/mp.14371
Subject(s) - discriminator , computer science , upsampling , artificial intelligence , benchmark , similarity , generator , shadow , chest radiograph , radiography , pattern recognition , computer vision , radiology , medicine
Purpose The purpose of this study is to improve computer‐aided diagnosis of lung diseases by suppressing bone structures such as ribs and clavicles in chest radiographs, which may overshadow lesions and obscure the clinical view. This paper develops an algorithm that suppresses bone structures in clinical x‐ray images while preserving the soft tissue of the lung, so that the resulting images can better serve downstream applications such as lung nodule detection or pathology screening based on the radiological reading of chest x rays.

Methods We propose a conditional generative adversarial network (cGAN) model (Mirza and Osindero, Conditional generative adversarial nets, 2014), consisting of a generator and a discriminator, for the task of bone shadow suppression. The generator uses dilated convolutions to enlarge its receptive field without the loss of contextual information incurred by downsampling the image. The model is trained by enforcing both pixel‐wise intensity similarity and semantic‐level visual similarity between the generated x‐ray images and the ground truth, via optimizing an L1 loss on pixel intensity values on the generator side and a feature‐matching loss on the discriminator side, respectively (a schematic code sketch of this objective is given below).

Results The proposed framework is trained and tested on an open‐access chest radiograph dataset for benchmarking. Results show that our model is capable of generating bone‐suppressed images of outstanding quality with a limited number of training samples (N = 272).

Conclusions Our approach outperforms current state‐of‐the‐art bone suppression methods on x‐ray images. Instead of simply downsampling images at multiple scales, the proposed method mitigates the loss of contextual information by utilizing dilated convolutions, which yields a noticeable quality improvement in the outputs. Moreover, our experiments show that enforcing semantic similarity between the generated and ground‐truth images assists the adversarial training process and achieves better perceptual quality.
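The following is a minimal PyTorch sketch of the kind of generator and training objective the abstract describes: a dilated‐convolution generator, a pixel‐wise L1 loss, and a feature‐matching loss over discriminator activations. The block structure, dilation rates, and loss weights (lambda_l1, lambda_fm) are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Sketch, assuming a 1-channel chest x-ray as generator input.
# Module names, dilation rates, and loss weights are hypothetical.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Conv block that enlarges the receptive field via dilation
    instead of downsampling, so spatial resolution and context
    are preserved."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=dilation, dilation=dilation),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x) + x  # residual connection (an assumption)

class Generator(nn.Module):
    """Toy generator: increasing dilation rates grow the receptive
    field roughly exponentially without any pooling layers."""
    def __init__(self, base=64):
        super().__init__()
        self.head = nn.Conv2d(1, base, kernel_size=3, padding=1)
        self.body = nn.Sequential(*[DilatedBlock(base, d) for d in (1, 2, 4, 8)])
        self.tail = nn.Conv2d(base, 1, kernel_size=3, padding=1)

    def forward(self, x):
        return self.tail(self.body(self.head(x)))

def feature_matching_loss(disc_feats_real, disc_feats_fake):
    """Semantic-level similarity: L1 distance between discriminator
    features of real and generated images, averaged over layers."""
    loss = 0.0
    for f_real, f_fake in zip(disc_feats_real, disc_feats_fake):
        loss = loss + nn.functional.l1_loss(f_fake, f_real.detach())
    return loss / len(disc_feats_real)

def generator_loss(fake, target, disc_feats_real, disc_feats_fake,
                   adv_term, lambda_l1=100.0, lambda_fm=10.0):
    """Total generator objective: adversarial term + pixel-wise L1
    + feature matching. Weighting factors are placeholder values."""
    l1 = nn.functional.l1_loss(fake, target)
    fm = feature_matching_loss(disc_feats_real, disc_feats_fake)
    return adv_term + lambda_l1 * l1 + lambda_fm * fm
```

The design point this sketch illustrates is that a 3x3 convolution with dilation d covers the same neighborhood as a (2d+1)x(2d+1) window at no extra parameter cost, so stacking blocks with rates 1, 2, 4, 8 gives the generator wide context without discarding spatial detail the way strided downsampling would.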