Open Access
A multinomial logistic regression model for text in Albanian language
Author(s) - Denisa Salillari, Luela Prifti
Publication year - 2016
Publication title - Journal of Advances in Mathematics
Language(s) - English
Resource type - Journals
ISSN - 2347-1921
DOI - 10.24297/jam.v12i7.5486
Subject(s) - multinomial logistic regression, categorical variable, mathematics, logistic regression, multinomial distribution, statistics, variables, set (abstract data type), variable (mathematics), regression analysis, identification (biology), computer science, mathematical analysis, botany, biology, programming language
In this paper we present a multinomial logistic regression model for authorship identification in Albanian-language texts. In the fitted model, the dependent variable is categorical, taking a value from 1 to 10 that identifies the author, and the independent variables are the number of words, letters, vowels, consonants, punctuation marks and sentences in each text. The model was applied successfully to a set of ten authors, each represented by one hundred texts they authored. The first, second and third authors have the highest percentages of correct predictions, and the highest overall correct prediction rate obtained was 0.738. We conclude that adding the number of consonants, the number of punctuation marks and the number of sentences to the model as independent variables increases the overall percentage of correct predictions.
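As an illustration of the six independent variables named in the abstract, the following is a minimal sketch of how they might be computed for a single text. The function name `extract_features`, the tokenisation rules and the sentence-splitting rule are assumptions made for this example, not the authors' procedure. In particular, Albanian digraphs such as "dh" or "sh" are counted here as two characters, which may differ from how the paper counts letters.

```python
import re
import string

# Assumption: the seven vowels of the Albanian alphabet.
VOWELS = set("aeëiouy")


def extract_features(text: str) -> dict:
    """Count the six surface features listed in the abstract (hypothetical helper)."""
    lowered = text.lower()
    # Every alphabetic character counts as one letter (digraphs are not merged).
    letters = [ch for ch in lowered if ch.isalpha()]
    vowels = sum(1 for ch in letters if ch in VOWELS)
    # A word is taken to be a maximal run of Unicode letters.
    words = re.findall(r"[^\W\d_]+", lowered)
    # Only ASCII punctuation marks are counted in this sketch.
    punctuation = sum(1 for ch in text if ch in string.punctuation)
    # Sentences are approximated by splitting on '.', '!' and '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "words": len(words),
        "letters": len(letters),
        "vowels": vowels,
        "consonants": len(letters) - vowels,
        "punctuation": punctuation,
        "sentences": len(sentences),
    }


sample_features = extract_features("Unë jam këtu. Ku je ti?")
```

Feature vectors of this kind, one per text and labelled with an author index from 1 to 10, could then be passed to any multinomial logistic regression routine. Which software the authors used is not stated in the abstract.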
