DATA PREPARATION ON LARGE DATASETS FOR DATA SCIENCE
Author(s) - Darshan Barapatre, A. Vijayalakshmi
Publication year - 2017
Publication title - Asian Journal of Pharmaceutical and Clinical Research
Language(s) - English
Resource type - Journals
eISSN - 2455-3891
pISSN - 0974-2441
DOI - 10.22159/ajpcr.2017.v10s1.20526
Subject(s) - computer science , data science , structuring , profiling (computer programming) , task (project management) , process (computing) , flexibility (engineering) , data analysis , unstructured data , data mining , analytics , big data , engineering , systems engineering , statistics , mathematics , finance , economics , operating system
According to interviews and expert estimates, data scientists spend 50-80% of their time on the mundane task of collecting and preparing structured or unstructured data before it can be explored for useful analysis. Restructuring and refining that data into more meaningful datasets, which can then be used for further analytics, is therefore very valuable to a data scientist. Hence, the idea is to build a tool that brings together the required data preparation techniques to make data well structured, while offering greater flexibility and an easy-to-use UI. The tool will cover data cleaning, data structuring, data transformation, data compression, and data profiling, along with implementations of related machine learning algorithms.
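
As a rough illustration of the preparation steps the abstract lists (cleaning, structuring and transformation, profiling, compression), the following is a minimal Python sketch using only the standard library. It is not the authors' tool: the sample data, the column names (`id`, `name`, `age`), and every function name here are hypothetical and chosen only for the example.

```python
import csv
import io
import json
import zlib

# Hypothetical raw input: stray whitespace, a missing age, a duplicate row.
RAW_CSV = """id,name,age
1, Alice ,34
2,Bob,
2,Bob,
3,carol,29
"""


def clean_rows(text):
    """Cleaning: trim whitespace, map empty cells to None, drop exact duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        normalized = {k: ((v or "").strip() or None) for k, v in row.items()}
        key = tuple(normalized.items())
        if key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


def transform_rows(rows):
    """Structuring/transformation: cast numeric columns, normalize name casing."""
    return [
        {
            "id": int(r["id"]),
            "name": r["name"].title() if r["name"] else None,
            "age": int(r["age"]) if r["age"] else None,
        }
        for r in rows
    ]


def profile_rows(rows):
    """Profiling: per column, count missing values and distinct non-missing values."""
    columns = list(rows[0].keys()) if rows else []
    return {
        c: {
            "missing": sum(r[c] is None for r in rows),
            "distinct": len({r[c] for r in rows if r[c] is not None}),
        }
        for c in columns
    }


def compress_rows(rows):
    """Compression: serialize to JSON and deflate with zlib (lossless)."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def decompress_rows(blob):
    """Inverse of compress_rows."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


if __name__ == "__main__":
    rows = transform_rows(clean_rows(RAW_CSV))
    print(rows)
    print(profile_rows(rows))
    print(len(compress_rows(rows)), "bytes compressed")
```

Each step is kept as a separate function so that steps can be reordered or swapped out, which is one simple way to get the flexibility the abstract asks for. A real tool for large datasets would need streaming or chunked processing rather than in-memory lists.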
