Averaging versus voting: A comparative study of strategies for distributed classification
Author(s) - Donglin Wang, Honglan Xu, Qiang Wu
Publication year - 2020
Publication title - Mathematical Foundations of Computing
Language(s) - English
Resource type - Journals
ISSN - 2577-8838
DOI - 10.3934/mfc.2020017
Subject(s) - voting, computer science, divide and conquer algorithms, majority rule, set (abstract data type), data mining, big data, machine learning, data set, artificial intelligence, algorithm, politics, political science, law, programming language
In this paper we propose two strategies, averaging and voting, for implementing distributed classification via the divide-and-conquer approach. When a data set is too large to be processed by a single processor, or is naturally stored in different locations, the method partitions the data into multiple subsets, either randomly or according to their locations. A base classification algorithm is then applied to each subset to produce a local classification model. Finally, averaging or voting is used to combine the local models into the final classification model. We performed thorough empirical studies to compare the two strategies. The results show that averaging is more effective in most scenarios.
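The pipeline described in the abstract (partition, fit local models, combine) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the base learner is a hypothetical nearest-centroid linear scorer, labels are taken to be in {-1, +1}, "averaging" is read as averaging the real-valued local outputs before taking the sign, and "voting" as a majority vote over the local signs. All function names are hypothetical.

```python
import random


def fit_local(points, labels):
    """Hypothetical base learner: nearest-centroid linear score, labels in {-1, +1}.

    Assumes the subset contains at least one example of each class.
    """
    dim = len(points[0])

    def centroid(cls):
        members = [p for p, y in zip(points, labels) if y == cls]
        return [sum(p[j] for p in members) / len(members) for j in range(dim)]

    pos, neg = centroid(1), centroid(-1)
    w = [a - b for a, b in zip(pos, neg)]           # direction between centroids
    mid = [(a + b) / 2 for a, b in zip(pos, neg)]   # decision boundary passes here
    bias = -sum(wj * mj for wj, mj in zip(w, mid))
    return w, bias


def score(model, x):
    """Real-valued output of one local model."""
    w, bias = model
    return sum(wj * xj for wj, xj in zip(w, x)) + bias


def sign(value):
    """Map a real value to a label; ties go to +1 (an arbitrary choice)."""
    return 1 if value >= 0 else -1


def predict_averaging(models, x):
    """Average the real-valued local outputs, then take the sign."""
    return sign(sum(score(m, x) for m in models) / len(models))


def predict_voting(models, x):
    """Take the sign of each local output, then a majority vote."""
    return sign(sum(sign(score(m, x)) for m in models))


def divide_and_conquer(points, labels, n_subsets, rng):
    """Randomly partition the data and fit one local model per subset."""
    idx = list(range(len(points)))
    rng.shuffle(idx)
    parts = [idx[i::n_subsets] for i in range(n_subsets)]
    return [fit_local([points[i] for i in part], [labels[i] for i in part])
            for part in parts]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic two-class data: Gaussian clouds centred at (+1, +1) and (-1, -1).
    labels = [1 if i % 2 == 0 else -1 for i in range(200)]
    points = [[rng.gauss(y, 1.0), rng.gauss(y, 1.0)] for y in labels]
    models = divide_and_conquer(points, labels, n_subsets=5, rng=rng)
    for name, predict in [("averaging", predict_averaging), ("voting", predict_voting)]:
        correct = sum(predict(models, x) == y for x, y in zip(points, labels))
        print(name, "training accuracy:", correct / len(points))
```

The two rules can disagree when a minority of local models is very confident: averaging lets the magnitude of each local output count, while voting treats every local model equally. Using an odd number of subsets avoids ties under voting.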