Open Access
Data Cleaning Techniques for Large Data Sets
Author(s) -
Yogesh Bansal,
Anil K. Chopra
Publication year - 2020
Publication title -
international journal of recent technology and engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.e6938.038620
Subject(s) - popularity, computer science, process (computing), data set, data processing, data mining, set (abstract data type), data extraction, data discovery, missing data, data science, knowledge extraction, big data, data warehouse, data analysis, information retrieval, database, artificial intelligence, machine learning, world wide web, metadata, psychology, social psychology, medline, political science, law, programming language, operating system
In today’s era of data science, where data plays a central role in accurate decision making, it is essential to work on clean, non-redundant data. Because data is gathered from multiple sources, it may contain anomalies, missing values, and other defects that must be removed; this process is called data pre-processing. In this paper we perform data pre-processing on a news popularity data set, applying extraction, transformation, and loading (ETL). The outcome of the process is a cleaned and refined news data set that can be used for further analysis and knowledge discovery on the popularity of news. Refined data yields more accurate predictions and can be better utilized in the decision-making process.
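The pre-processing pipeline described above (remove anomalies and duplicates, handle missing values, then load the refined set) can be sketched with pandas. The paper does not publish its code, so this is a minimal illustrative sketch: the toy frame, the column names (`title`, `shares`), and the median-imputation choice are all assumptions, not the authors' actual procedure.

```python
import pandas as pd
import numpy as np

# Hypothetical toy sample standing in for the news popularity data set.
raw = pd.DataFrame({
    "title": ["A", "A", "B", "C", None],
    "shares": [100, 100, np.nan, 250, 40],
})

# Extract: start from the raw frame (in practice, read from the source file).
df = raw.copy()

# Transform: drop exact duplicate rows and rows missing the key field,
# then impute remaining numeric gaps with the column median (an
# illustrative choice; other imputation strategies are possible).
df = df.drop_duplicates()
df = df.dropna(subset=["title"])
df["shares"] = df["shares"].fillna(df["shares"].median())

# Load: the cleaned frame is ready to be written out or analyzed.
cleaned = df.reset_index(drop=True)
```

After these steps the duplicate row and the row with no title are gone, and the single missing `shares` value is filled with the median of the remaining values, leaving a frame with no gaps.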
