Open Access
An Enhanced Performance of K-Nearest Neighbor (K-NN) Classifier to Meet New Big Data Necessities
Author(s) -
Ihab L. Hussein Alsammak,
Humam M. Abdul Sahib,
Wasan H. Itwee
Publication year - 2020
Publication title -
IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/928/3/032013
Subject(s) - computer science, data mining, big data, k nearest neighbors algorithm, logarithm, classifier (uml), computation, algorithm, artificial intelligence, pattern recognition (psychology), machine learning, mathematics, mathematical analysis
The rapid growth of text information over the past two decades has created a need for text classification techniques, particularly in information retrieval, data mining, and data management. The K-Nearest Neighbor (K-NN) classification algorithm is accurate and simple, which has made it one of the most important classification algorithms for tasks such as pattern recognition, regression, and text classification. Experiments with the traditional K-NN algorithm, and analysis of their results, reveal several performance deficiencies, especially on large data sets: the algorithm cannot process big data quickly with minimal storage space, it spends computation on useless samples, and it runs into probability problems. In this paper, we develop an enhanced algorithm that performs better than the traditional one. The main improvement comes from removing the samples that cause unnecessary computation in the traditional algorithm. Performance is further improved by a lost-value (missing-value) computation method applied as a preliminary step, which avoids wasted time by correcting and filtering noise, examining the database, and eliminating unwanted records. Additionally, an inverse logarithmic function is used to solve the probability problems the algorithm encounters. The experimental results show that the modified algorithm reduces the sample size and speeds up the search for the required data.
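The abstract does not give the details of the enhanced algorithm, so the following is only a minimal sketch of the three ideas it names, under stated assumptions: missing values are filled with the column mean, "unwanted records" are taken to be exact duplicates, and the inverse logarithmic function is read as a vote weight of 1 / ln(1 + rank). The function names `impute_missing`, `drop_duplicates`, and `knn_predict` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def impute_missing(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumption: column-mean imputation stands in for the paper's
    lost-value computation method, which the abstract does not specify.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def drop_duplicates(X, y):
    """Keep only the first occurrence of each (features, label) pair.

    Assumption: exact duplicates stand in for the unnecessary samples.
    """
    seen, X_out, y_out = set(), [], []
    for features, label in zip(X, y):
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by a rank-weighted vote among its k nearest neighbors."""
    by_distance = sorted(
        zip(X_train, y_train),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = defaultdict(float)
    for rank, (_, label) in enumerate(by_distance[:k], start=1):
        # Hypothetical inverse-log weight: closer neighbors count for more.
        votes[label] += 1.0 / math.log(1 + rank)
    return max(votes, key=votes.get)


# Usage: clean the training set once, then classify a query point.
X_raw = [[1.0, 1.0], [1.0, None], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y_raw = ["a", "a", "a", "b", "b"]
X_clean, y_clean = drop_duplicates(impute_missing(X_raw), y_raw)
print(knn_predict(X_clean, y_clean, [1.5, 1.5], k=3))
```

Cleaning the training set before any query is made means every later distance computation runs over fewer records, which is the kind of saving the abstract attributes to removing unnecessary samples.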