
ON EVALUATING AI SYSTEMS FOR MEDICAL DIAGNOSIS
Author(s) - Chandrasekaran B.
Publication year - 1983
Publication title - AI Magazine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.597
H-Index - 79
eISSN - 2371-9621
pISSN - 0738-4602
DOI - 10.1609/aimag.v4i2.397
Subject(s) - artificial intelligence, fallacy, computer science, scale (ratio), measure (data warehouse), machine learning, data science, data mining, epistemology, geography, cartography, philosophy
Among the difficulties in evaluating AI-type medical diagnosis systems are: the intermediate conclusions of the AI system need to be looked at in addition to the “final” answer; the “superhuman human” fallacy must be resisted; both pro- and anti-computer biases during evaluation must be guarded against; and methods for estimating how the approach will scale upwards to larger domains are needed. We propose a type of Turing test for the evaluation problem, designed to provide some protection against the problems listed above. We propose to measure both the accuracy of diagnosis and the structure of reasoning, the latter with a view to gauging how well the system will scale up.