Identificação automática de unidades de informação em testes de reconto de narrativas usando métodos de similaridade semântica

Leandro Dos Borges dos Santos; Sandra Maria Aluísio

Open Access

Author(s) - Leandro Dos Borges dos Santos, Sandra Maria Aluísio

Publication year - 2020

Publication title - Linguamática

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.121

H-Index - 7

ISSN - 1647-0818

DOI - 10.21814/lm.11.2.304

Subject(s) - philosophy, gynecology, medicine

Diagnoses of Alzheimer's Disease (AD) and Mild Cognitive Impairment (MCI) are based on an analysis of the patient's cognitive functions, obtained by administering cognitive and neuropsychological assessment batteries. Narrative retelling tests are commonly used to help identify and quantify the degree of dementia. In general, one point is awarded for each information unit recalled, and the final score is the number of units recalled. In this paper, we evaluate two clinical tasks: the automatic identification of which elements of a retold narrative were recalled; and the binary classification of the narrative produced by a patient, using the identified units as features, with the aim of automatically screening patients for cognitive impairment. We used two data sets of transcribed retellings in which sentences were segmented and manually annotated with the information units; both data sets were then made publicly available. They are the Arizona Battery for Communication Disorders of Dementia (ABCD), which contains narratives of patients with MCI and of healthy controls, and the Bateria de Avaliação da Linguagem no Envelhecimento (BALE), which includes narratives of patients with AD and MCI as well as healthy controls. We evaluated two methods based on semantic similarity, referred to here as STS and Chunking, and transformed the multi-label problem of identifying elements of a retold narrative into binary classification problems by finding a cutoff point for the similarity value of each information unit. In this way, we outperformed two baselines on both data sets in the SubsetAccuracy metric, which is the most punitive metric for the multi-label scenario. In binary classification, however, not all six machine learning methods evaluated performed better than the baseline methods. For ABCD, the best methods were Decision Trees and KNN; for BALE, SVM with an RBF kernel stood out.
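
The thresholding step and the SubsetAccuracy metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the unit names (`u1`, `u2`, `u3`), the similarity values, the cutoffs, and the function names are all hypothetical, and the similarity scores are assumed to come from some upstream semantic similarity model that is not shown here.

```python
from typing import Dict, List, Set


def identify_units(similarities: Dict[str, float],
                   cutoffs: Dict[str, float]) -> Set[str]:
    """Treat each information unit as its own binary decision:
    a unit counts as recalled when its similarity reaches its own cutoff."""
    return {unit for unit, score in similarities.items()
            if score >= cutoffs[unit]}


def subset_accuracy(predicted: List[Set[str]], gold: List[Set[str]]) -> float:
    """Fraction of sentences whose predicted unit set matches the gold set exactly."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    exact_matches = sum(1 for p, g in zip(predicted, gold) if p == g)
    return exact_matches / len(gold)


# Hypothetical per-unit cutoffs (in practice these would be tuned on training data).
cutoffs = {"u1": 0.6, "u2": 0.5, "u3": 0.7}

# Hypothetical similarity of each retold sentence to each information unit.
sentence_similarities = [
    {"u1": 0.82, "u2": 0.31, "u3": 0.10},
    {"u1": 0.40, "u2": 0.55, "u3": 0.75},
    {"u1": 0.65, "u2": 0.20, "u3": 0.30},
]

# Hypothetical manual annotation for the same three sentences.
gold = [{"u1"}, {"u2", "u3"}, set()]

predicted = [identify_units(sims, cutoffs) for sims in sentence_similarities]
accuracy = subset_accuracy(predicted, gold)

# One point per distinct unit recalled anywhere in the narrative.
narrative_score = len(set().union(*predicted))
```

In this toy example the third sentence is predicted to contain `u1` although the annotation marks no unit, so only two of the three sentences match exactly. This shows why SubsetAccuracy is punitive: a single wrong unit makes the whole sentence count as an error.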
