Influence of Data Discretization on Efficiency of Bayesian Classifier for Authorship Attribution
Author(s) - Grzegorz Baron
Publication year - 2014
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2014.08.201
Subject(s) - discretization, computer science, naive bayes classifier, data mining, classifier (uml), artificial intelligence, machine learning, binary classification, bayesian probability, authorship attribution, bayes classifier, bayes' theorem, pattern recognition (psychology), support vector machine, mathematics, mathematical analysis
Authorship attribution is one of the research areas in the data mining domain, and various methods can be employed to perform that task. The paper presents the results of research on the influence of data discretization on the efficiency of the Naive Bayes classifier. The analysis was carried out on datasets based on texts of two male and two female authors, using the WEKA data mining software framework. Binary classification was performed separately for both datasets over a wide range of discretization parameters in order to investigate the dependency between the discretization method and the quality of classification with the Naive Bayes method. The numerical results of the tests are compared and discussed, and some observations and conclusions are formulated.
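To make the pipeline named in the abstract concrete, the following is a minimal sketch of discretizing numeric features and then classifying them with Naive Bayes. It is not the author's WEKA setup: equal-width binning, Laplace smoothing, the function names, and the toy feature values are all assumptions made for illustration only.

```python
import math
from collections import Counter

def equal_width_bins(values, n_bins):
    """Build a mapper from a numeric value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    def to_bin(x):
        idx = int((x - lo) / width)
        return max(0, min(n_bins - 1, idx))  # clamp values outside the range
    return to_bin

def fit_discretizers(X, n_bins):
    """One equal-width mapper per feature column, learned from training data."""
    return [equal_width_bins([row[j] for row in X], n_bins)
            for j in range(len(X[0]))]

def discretize(X, binners):
    return [[b(v) for b, v in zip(binners, row)] for row in X]

def train_nb(Xd, y, n_bins):
    """Count class frequencies and per-class bin frequencies."""
    class_counts = Counter(y)
    feat_counts = Counter()
    for row, label in zip(Xd, y):
        for j, b in enumerate(row):
            feat_counts[(label, j, b)] += 1
    return class_counts, feat_counts, n_bins

def predict_nb(model, row):
    """Pick the class with the highest log-posterior (Laplace-smoothed)."""
    class_counts, feat_counts, n_bins = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for j, b in enumerate(row):
            seen = feat_counts.get((label, j, b), 0)
            score += math.log((seen + 1) / (count + n_bins))
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical data: two made-up stylometric features per text, two authors.
X_train = [[0.10, 0.80], [0.15, 0.75], [0.12, 0.90],
           [0.70, 0.20], [0.65, 0.25], [0.80, 0.10]]
y_train = ["A", "A", "A", "B", "B", "B"]
N_BINS = 3  # the discretization parameter one would vary in an experiment

binners = fit_discretizers(X_train, N_BINS)
model = train_nb(discretize(X_train, binners), y_train, N_BINS)
X_test = discretize([[0.11, 0.85], [0.75, 0.15]], binners)
predictions = [predict_nb(model, row) for row in X_test]
```

Varying `N_BINS`, or swapping `equal_width_bins` for another binning rule, is one way to explore how the choice of discretization affects classification quality on a given dataset.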