Text Classification Using SVM Enhanced by Multithreading and CUDA
Author(s) - Soumick Chatterjee, Pramod George Jose, Debabrata Datta
Publication year - 2019
Publication title - International Journal of Modern Education and Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2075-017X
pISSN - 2075-0161
DOI - 10.5815/ijmecs.2019.01.02
Subject(s) - computer science, multithreading, support vector machine, preprocessor, cuda, process (computing), kernel (algebra), artificial intelligence, task (project management), class (philosophy), the internet, machine learning, parallelism (grammar), information retrieval, data mining, natural language processing, parallel computing, world wide web, thread (computing), programming language, mathematics, management, combinatorics, economics
With the rapid growth of the internet and of digital documents available on the web, organizing text data has become a major problem. Text classification, which assigns a given piece of text to a predefined class or category, has become one of the main techniques for organizing such data. In the present research work, an SVM with a linear kernel is used with the One-vs-Rest strategy. The SVM is trained on data sets collected from various sources. Some words that were uncommon 5-6 years ago may now be prevalent because of recent trends, and new discoveries may lead to the coinage of new words. The process can also be applied to text blogs, which can be crawled and then analyzed. In theory, the technique should be able to classify blogs, tweets, or any other document with a significant degree of accuracy. In any text classification process, the preprocessing phase (cleaning, stemming, lemmatization, etc.) takes the most time. Hence, the authors have used a multithreading approach to speed up this phase. The authors further tried to improve the processing time of the algorithm through GPU parallelism using CUDA.
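The abstract does not give implementation details, so the following is only a minimal Python sketch of what a multithreaded preprocessing phase might look like, not the authors' code. The names `stem`, `preprocess`, `preprocess_corpus`, and the toy `SUFFIXES` list are hypothetical, and the crude suffix stripping stands in for a real stemmer or lemmatizer.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy suffix list (an assumption for illustration);
# a real pipeline would use a proper stemmer or lemmatizer.
SUFFIXES = ("ing", "ed", "s")


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(document):
    """Clean one document: lowercase, keep alphabetic tokens only, then stem."""
    tokens = re.findall(r"[a-z]+", document.lower())
    return [stem(t) for t in tokens]


def preprocess_corpus(documents, max_workers=4):
    """Preprocess documents concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(preprocess, documents))


if __name__ == "__main__":
    corpus = ["Crawling blogs", "Classified tweets!"]
    print(preprocess_corpus(corpus))
```

Each document is processed independently, which is what makes this phase easy to parallelize. In CPython, the global interpreter lock limits the speedup threads can give for pure-Python, CPU-bound work, so `concurrent.futures.ProcessPoolExecutor` may be a better fit there; it offers the same `map` interface.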
