Validation of a decision support system for the cytodiagnosis of fine needle aspirates of the breast using a prospectively collected dataset from multiple observers in a working clinical environment
Author(s) - Cross S. S., Stephenson T. J., Mohammed T., Harrison R. F.
Publication year - 2000
Publication title - Cytopathology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.512
H-Index - 48
eISSN - 1365-2303
pISSN - 0956-5507
DOI - 10.1046/j.1365-2303.2000.00290.x
Subject(s) - logistic regression , medicine , false positive paradox , observer (physics) , artificial intelligence , confidence interval , multilayer perceptron , test set , perceptron , predictive value , artificial neural network , statistics , computer science , mathematics , physics , quantum mechanics
We have used a 692-case dataset, collected retrospectively by a single observer, to develop decision support systems for the cytodiagnosis of fine needle aspirates of breast lesions. In this study, we use a 322-case dataset that was prospectively collected by multiple observers in a working clinical environment to test two predictive systems, one using logistic regression and the other using the multilayer perceptron (MLP) type of neural network. Ten observed features and the patient age were used as input features. The systems were developed using a training set and a test set from the single-observer dataset and then applied to the multiple-observer dataset. For the independent test cases from the single-observer dataset, with a threshold set for no false positives on the training set, logistic regression produced a sensitivity of 82% (95% confidence interval 73–91) and a predictive value of a positive result (PV+) of 98% (95–99); the values for the MLP were 79% (69–89) and 100%, respectively. However, the performance on the prospective multiple-observer dataset was much worse, with a sensitivity of 72% (65–80) and a PV+ of 97% (94–99) for logistic regression, and 67% (60–75) and 91% (85–97), respectively, for the MLP. These results suggest that there is considerable interobserver variability for the defined features and that this system is unsuitable for further development in the clinical environment unless this problem can be overcome.
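The evaluation described in the abstract (a decision threshold chosen so that the training set yields no false positives, followed by sensitivity and PV+ on separate cases) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy scores and labels, and the normal-approximation (Wald) confidence interval are all assumptions made for illustration; the abstract does not state how its intervals were computed.

```python
import math

def zero_fp_threshold(scores, labels):
    """Return the highest score among negative (label 0) training cases.

    Calling a case positive only when its score is strictly above this
    value gives zero false positives on the training set.
    """
    return max(s for s, y in zip(scores, labels) if y == 0)

def sensitivity_and_ppv(scores, labels, threshold):
    """Sensitivity and predictive value of a positive result (PV+)."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv

def wald_interval(successes, n, z=1.96):
    """Approximate 95% interval for a proportion (assumed method), clipped to [0, 1]."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

# Toy data, invented for illustration only (not from the study).
train_scores = [0.1, 0.3, 0.45, 0.6, 0.8, 0.9]
train_labels = [0, 0, 0, 1, 1, 1]
test_scores = [0.2, 0.5, 0.4, 0.7, 0.95, 0.3]
test_labels = [0, 0, 1, 1, 1, 0]

threshold = zero_fp_threshold(train_scores, train_labels)
sens, ppv = sensitivity_and_ppv(test_scores, test_labels, threshold)
```

In this toy example the threshold guarantees no false positives on the training cases but not on the test cases, which mirrors how PV+ can fall below 100% once a system is applied to data it was not tuned on.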