Open Access
Content-based prediction: big data sampling perspective
Author(s) -
Waleed Albattah,
Saleh Albahli
Publication year - 2019
Publication title -
International Journal of Engineering and Technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v8i4.30150
Subject(s) - terabyte , petabyte , computer science , sampling (signal processing) , machine learning , big data , support vector machine , artificial intelligence , data mining , process (computing) , random forest , perspective (graphical) , filter (signal processing) , computer vision , operating system
Today, large volumes of data are actively generated on the order of terabytes or even petabytes, and processing data at such a scale efficiently and effectively is extremely challenging. Yet most existing studies apply machine learning algorithms by loading the entire training dataset into the computer's main memory (RAM). This becomes a problem as the data grows over time and can no longer fit within a single machine's memory under conventional models or hardware. Inspired by current research, this paper discusses the benefits of two sampling techniques for machine learning models: (1) sampling with replacement and (2) reservoir sampling. In this study, 40 experiments were performed in which random sampling reduced the number of data instances to 50% of the original video dataset, which was more than 40 GB in size. The results show that SVM and random forest are very competitive classifiers; accuracy and importance scores are reported for all ten repeated rounds of the process for each of the four combinations of sampling technique and machine learning classifier.
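The two sampling techniques the abstract names can be sketched as follows. This is a minimal illustration under common definitions (uniform sampling with replacement, and Vitter's Algorithm R for reservoir sampling), not the authors' actual implementation; the function names are hypothetical:

```python
import random

def sample_with_replacement(data, k, seed=None):
    # Draw k items uniformly and independently; the same item may
    # appear more than once. Requires random access to the dataset.
    rng = random.Random(seed)
    return [data[rng.randrange(len(data))] for _ in range(k)]

def reservoir_sample(stream, k, seed=None):
    # Algorithm R: maintain a k-item reservoir while scanning the
    # stream once, so the full dataset never has to fit in memory --
    # the property that matters for terabyte-scale inputs.
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i replaces a reservoir slot with probability k/(i+1),
            # which keeps every item's inclusion probability uniform.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Reducing a dataset to 50% of its instances, as in the experiments described above, would correspond to calling either function with k equal to half the number of instances; reservoir sampling is the natural fit when the 40+ GB dataset is read as a stream rather than loaded into RAM.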
