Open Access
THE USE OF SYNTHETIC IMAGES FOR SOLVING THE CLASSIFICATION PROBLEM BY THE EXAMPLE OF LUNG CANCER DIAGNOSIS
Author(s) -
Ivan A. Gundyrev (Гундырев Иван Анатольевич),
Lyudmila V. Bel’skaya (Бельская Людмила Владимировна),
Victor K. Kosenok (Косенок Виктор Константинович),
Elena A. Sarf (Сарф Елена Александровна)
Publication year - 2018
Publication title - Vestnik Rossijskoj Akademii Medicinskih Nauk
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.122
H-Index - 15
eISSN - 2414-3545
pISSN - 0869-6047
DOI - 10.15690/vramn946
Subject(s) - classifier (uml) , artificial intelligence , random forest , lung cancer , cut point , pattern recognition (psychology) , diagnostic model , mathematics , machine learning , computer science , medicine , statistics , data mining , pathology
Background: From a mathematical point of view, medical diagnosis is a data classification problem. It is therefore important to understand how strongly errors made when collecting primary diagnostic information, in particular the results of biochemical tests, distort the classification result.

Aims: To determine how the prediction result depends on the variability of the primary diagnostic information, using a model classifier as an example.

Materials and methods: The case-control study enrolled patients divided into two groups: the main group (diagnosed with lung cancer, n=200) and the control group (conditionally healthy, n=500). All participants completed a questionnaire and underwent a biochemical study of saliva. Patients of the main group and the comparison group were hospitalized for surgical treatment, after which the diagnosis was verified histologically. The biochemical composition of saliva was determined spectrophotometrically. Based on the data obtained, a model classifier for the diagnosis of lung cancer (a random forest) was constructed. Each parameter underlying the classifier was then perturbed within a specified range (±1–5%, ±5–10%, ±10–15%) to create synthetic images. The classification results were then evaluated by cross-validation.

Results: The baseline diagnostic characteristics of the model classifier were determined (sensitivity 72.5%, specificity 86.0%). As the deviation of the synthetic images from the baseline increases, the diagnostic characteristics of the general classification deteriorate. The confident classification, in contrast, gives higher values (sensitivity 81.8%, specificity 93.1%). In the confident classification, similar images that the classifier assigns to different classes are deleted, whereas in the general classification they are taken into account. The difference between the two classification methods is associated with images for which the classifier returns a class-membership value in the range 0.45–0.55. It is therefore necessary to introduce a third class into the classifier, the so-called gray zone (0.4–0.6), since the probability of an erroneous diagnosis is significantly higher in this region.

Conclusions: The results obtained allow us to conclude that a measurement error in the range ±1–15% does not significantly affect the quality of the classification.
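
The perturbation step and the proposed gray zone can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the deviations were drawn, so a uniformly distributed magnitude with a random sign is an assumption here. The function names (`perturb`, `classify`, `sensitivity_specificity`) and the toy probabilities are hypothetical. Excluding gray-zone samples illustrates the proposed third class only; it is not the paper's confident-classification procedure, which deletes similar images that fall into different classes.

```python
import random


def perturb(sample, low_pct, high_pct, rng):
    """Return a synthetic copy of `sample` in which every parameter is shifted
    by a relative deviation whose magnitude lies in [low_pct, high_pct] percent.
    Assumption: uniform magnitude, random sign (not stated in the abstract)."""
    synthetic = []
    for value in sample:
        magnitude = rng.uniform(low_pct, high_pct) / 100.0
        sign = rng.choice((-1.0, 1.0))
        synthetic.append(value * (1.0 + sign * magnitude))
    return synthetic


def classify(prob, gray_low=0.4, gray_high=0.6):
    """Map a predicted probability of the cancer class to one of three labels.
    The 0.4-0.6 gray zone is the interval proposed in the abstract."""
    if prob < gray_low:
        return "control"
    if prob > gray_high:
        return "cancer"
    return "gray"


def sensitivity_specificity(labels, probs, use_gray_zone):
    """Compute (sensitivity, specificity). With use_gray_zone=True, samples
    that land in the gray zone are skipped instead of being forced into a class."""
    tp = fn = tn = fp = 0
    for label, prob in zip(labels, probs):
        if use_gray_zone:
            decision = classify(prob)
            if decision == "gray":
                continue
        else:
            decision = "cancer" if prob >= 0.5 else "control"
        if label == 1:
            if decision == "cancer":
                tp += 1
            else:
                fn += 1
        else:
            if decision == "control":
                tn += 1
            else:
                fp += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data (hypothetical): 1 = lung cancer, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.45, 0.1, 0.2, 0.55]

general = sensitivity_specificity(labels, probs, use_gray_zone=False)
with_gray = sensitivity_specificity(labels, probs, use_gray_zone=True)

rng = random.Random(0)
synthetic = perturb([10.0, 20.0, 30.0], 5.0, 10.0, rng)
```

In this toy data the two borderline predictions (0.45 and 0.55) are the only errors, so setting them aside raises both metrics. With real classifier output the size of the effect depends on how many samples fall into the gray zone.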
