An efficient secure k nearest neighbor classification protocol with high‐dimensional features
Author(s) - Sun Maohua, Yang Ruidi
Publication year - 2020
Publication title - International Journal of Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.22272
Subject(s) - computer science , homomorphic encryption , overhead (engineering) , feature (linguistics) , data mining , protocol (science) , k nearest neighbors algorithm , dimension (graph theory) , euclidean distance , secure multi party computation , feature vector , artificial intelligence , algorithm , computation , pattern recognition (psychology) , encryption , mathematics , medicine , philosophy , linguistics , alternative medicine , pathology , pure mathematics , operating system
The k nearest neighbor (kNN) classification algorithm is a prediction model widely used in real-life applications such as healthcare, finance, computer vision, personalized recommendation, and precision marketing. The data explosion era has significantly increased feature dimensionality, which in turn heightens privacy concerns over the available samples and unlabeled data in machine learning applications. In this paper, we present a secure kNN classification protocol with low communication overhead that handles high-dimensional features given as real numbers. First, to handle real-valued features, we develop a data conversion algorithm for use with the chosen fully homomorphic encryption scheme. This conversion algorithm is generic and applies to other algorithms that need to handle real numbers under the fully homomorphic scheme. Second, we present a privacy-preserving Euclidean distance protocol (PPEDP), which computes the Euclidean distance between two points given as real numbers in a high-dimensional space. Then, based on the novel PPEDP and oblivious transfer, we propose a new classification approach, the efficient secure kNN classification protocol (ESkNN), which has low communication overhead and suits sample sets with high-dimensional, real-valued features. Moreover, we implement ESkNN in C++. Experimental results show that ESkNN is several orders of magnitude faster than existing works and scales up to 18,000 feature dimensions in a memory-limited environment.
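
The abstract does not spell out the data conversion algorithm or PPEDP, so the following is only a plaintext sketch of the general idea, not the authors' method. It assumes a simple fixed-point encoding (real values scaled by a hypothetical `SCALE` factor and rounded to integers), so that the squared Euclidean distance needs only integer addition, subtraction, and multiplication, the operations homomorphic schemes typically support. All names (`SCALE`, `to_fixed`, `squared_distance_fixed`, `knn_classify`) are illustrative, and no encryption or oblivious transfer is performed here.

```python
from collections import Counter

# Hypothetical fixed-point precision: keep 4 decimal places.
SCALE = 10**4


def to_fixed(x, scale=SCALE):
    """Map a real number to an integer by scaling and rounding."""
    return round(x * scale)


def squared_distance_fixed(a, b, scale=SCALE):
    """Squared Euclidean distance using integer arithmetic only.

    The result is scaled by scale**2 relative to the real-valued distance.
    """
    return sum((to_fixed(p, scale) - to_fixed(q, scale)) ** 2
               for p, q in zip(a, b))


def knn_classify(samples, labels, query, k=3, scale=SCALE):
    """Plaintext kNN: rank samples by squared distance, then majority vote."""
    ranked = sorted(
        (squared_distance_fixed(s, query, scale), i)
        for i, s in enumerate(samples)
    )
    votes = Counter(labels[i] for _, i in ranked[:k])
    return votes.most_common(1)[0][0]


samples = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]]
labels = ["a", "a", "b", "b"]
predicted = knn_classify(samples, labels, [0.2, 0.1], k=3)
```

Ranking by squared distance gives the same neighbors as ranking by distance, so the square root, which is awkward to evaluate on encrypted integers, can be skipped. In a secure protocol the subtraction, squaring, and summation would run on ciphertexts rather than on the plain integers shown here.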
