Optimization of reference library used in content‐based medical image retrieval scheme
Author(s) -
Park Sang Cheol,
Sukthankar Rahul,
Mummert Lily,
Satyanarayanan Mahadev,
Zheng Bin
Publication year - 2007
Publication title -
Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1118/1.2795826
Subject(s) - cad , computer science , receiver operating characteristic , content based image retrieval , set (abstract data type) , image retrieval , scheme (mathematics) , information retrieval , pattern recognition (psychology) , image (mathematics) , artificial intelligence , data mining , mathematics , machine learning , engineering drawing , mathematical analysis , engineering , programming language
Building an optimal image reference library is a critical step in developing interactive computer‐aided detection and diagnosis (I‐CAD) systems for medical images using content‐based image retrieval (CBIR) schemes. In this study, the authors conducted two experiments to investigate (1) the relationship between I‐CAD performance and the size of the reference library and (2) a new reference selection strategy to optimize the library and improve I‐CAD performance. The authors assembled a reference library that includes 3153 regions of interest (ROIs) depicting either malignant masses (1592) or CAD‐cued false‐positive regions (1561), and an independent testing data set including 200 masses and 200 false‐positive regions. A CBIR scheme using a distance‐weighted K‐nearest neighbor algorithm is applied to retrieve references that are considered similar to the testing sample from the library. The area under the receiver operating characteristic curve (A_z) is used as an index to evaluate I‐CAD performance. In the first experiment, the authors systematically increased the reference library size and tested I‐CAD performance. The result indicates that scheme performance improves initially from A_z = 0.715 to 0.874 and then plateaus when the library size reaches approximately half of its maximum capacity. In the second experiment, based on the hypothesis that an ROI should be removed if it performs poorly compared to a group of similar ROIs in a large and diverse reference library, the authors applied a new strategy to identify “poorly effective” references. By removing 174 identified ROIs from the reference library, I‐CAD performance significantly increases to A_z = 0.914 (p < 0.01). The study demonstrates that increasing reference library size and removing poorly effective references can significantly improve I‐CAD performance.
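
The abstract names two computational pieces: a distance‐weighted K‐nearest neighbor retrieval over the reference library, and A_z as the performance index. The Python sketch below illustrates the general idea only; it is not the authors' implementation. The inverse‐squared‐distance weighting, the Euclidean metric, the two‐dimensional toy features, the value of k, and the names `knn_score` and `auc` are all assumptions made for illustration, and the empirical (rank‐based) area under the ROC curve stands in for A_z, which the paper may have estimated differently.

```python
import math

def knn_score(query, library, k=3, eps=1e-9):
    """Distance-weighted k-NN score in [0, 1] that `query` depicts a mass.

    `library` is a list of (feature_vector, label) pairs, where label 1 is a
    mass and label 0 is a false-positive region. The 1/d^2 weighting is an
    assumed choice; the abstract does not state the weighting function.
    """
    # Retrieve the k references closest to the query (Euclidean distance).
    nearest = sorted(
        (math.dist(query, feats), label) for feats, label in library
    )[:k]
    # Closer references contribute more to the score.
    w_pos = sum(1.0 / (d * d + eps) for d, label in nearest if label == 1)
    w_all = sum(1.0 / (d * d + eps) for d, _ in nearest)
    return w_pos / w_all

def auc(scores, labels):
    """Empirical area under the ROC curve (Mann-Whitney statistic)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 2-D features, not taken from the study.
library = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1),
]
test_set = [
    ((0.05, 0.05), 0), ((0.95, 1.0), 1),
    ((0.15, 0.1), 0), ((1.0, 0.95), 1),
]

scores = [knn_score(feats, library) for feats, _ in test_set]
labels = [label for _, label in test_set]
print("AUC on toy data:", auc(scores, labels))
```

In the same spirit, the first experiment could be mimicked by calling `knn_score` with growing subsets of `library` and recording `auc` for each size. The reference‐removal strategy of the second experiment is not specified in enough detail here to sketch.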