Open Access

A Novel Approach for Data Cleaning by Selecting the Optimal Data to Fill the Missing Values for Maintaining Reliable Data Warehouse

Author(s) - Raju Dara, Ch. Satyanarayana, A. Govardhan

Publication year - 2016

Publication title - International Journal of Modern Education and Computer Science

Language(s) - English

Resource type - Journals

eISSN - 2075-017X

pISSN - 2075-0161

DOI - 10.5815/ijmecs.2016.05.08

Subject(s) - computer science, byte, data warehouse, jaccard index, quality (philosophy), data mining, cluster analysis, function (biology), upgrade, data quality, information retrieval, pipeline (software), haystack, data science, database, world wide web, artificial intelligence, metric (unit), philosophy, operations management, epistemology, evolutionary biology, biology, economics, programming language, operating system

Trillions of bytes of information are now being created by enterprises, particularly on the web. Business executives and managers have long wanted access to that information in a well-organized, interactive form so that they can make the best decisions for the business. A data warehouse is the main feasible solution for making this a reality. Better future decision-making depends on the availability of correct information, which in turn depends on the quality of the underlying data. Because data gathered from diverse sources is dirty, quality data can only be produced by cleaning it before it is loaded into the data warehouse. Once the data has been pre-processed and cleansed, data mining queries applied to it produce accurate results. In many cases, however, the data is sparse, and obtaining accurate results from sparse data is hard. The main goal of this paper is to fill the missing values in acquired data that is sparse in nature. Care must be taken to choose the minimum number of text pieces to fill the gaps; for this purpose, the Jaccard dissimilarity function is used to cluster the data that occurs frequently.
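
For readers unfamiliar with the measure named in the abstract, the Jaccard dissimilarity between two sets A and B is 1 - |A ∩ B| / |A ∪ B|. The following is a minimal sketch of how it might be used to fill a missing value. It is not the authors' algorithm: it uses a simple nearest-record fill rather than the clustering described in the paper, and the function names (`jaccard_dissimilarity`, `fill_missing`) and the toy records are assumptions made up for illustration.

```python
def jaccard_dissimilarity(a: set, b: set) -> float:
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def fill_missing(records: list[dict], field: str) -> list[dict]:
    """Return copies of `records` with `field` filled where it is None.

    Hypothetical helper: the value is borrowed from the complete record
    whose other (key, value) pairs are least dissimilar under Jaccard.
    """
    def features(rec: dict) -> set:
        # Describe a record by its known values, ignoring the target field.
        return {(k, v) for k, v in rec.items() if k != field and v is not None}

    donors = [r for r in records if r.get(field) is not None]
    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the input list is left untouched
        if rec.get(field) is None and donors:
            nearest = min(
                donors,
                key=lambda d: jaccard_dissimilarity(features(rec), features(d)),
            )
            rec[field] = nearest[field]
        filled.append(rec)
    return filled


# Toy records invented for this example only.
records = [
    {"city": "Hyderabad", "state": "Telangana", "zip": "500001"},
    {"city": "Hyderabad", "state": "Telangana", "zip": None},
    {"city": "Kakinada", "state": "Andhra Pradesh", "zip": "533001"},
]
cleaned = fill_missing(records, "zip")
```

Here the second record shares both of its known fields with the first, so their dissimilarity is 0 and the missing `zip` is taken from the first record. A clustering-based variant would group records by this same distance first and fill each gap from within its cluster.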
