
Extensive data set analysis & prediction using R
Author(s) -
Padmaja Grandhe,
Vishnu Priya Damarla,
Shaziya Mohammad
Publication year - 2019
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1228/1/012048
Subject(s) - decision tree , data set , computer science , predictive analytics , big data , set (abstract data type) , data mining , predictive modelling , process (computing) , random forest , data analysis , tree (set theory) , class (philosophy) , machine learning , data science , artificial intelligence , mathematics , mathematical analysis , programming language , operating system
Large volumes of data are now available online from many applications. Predicting future events is difficult in the case of big data. Applications that require such predictions include predicting the confirmation of waiting-list seats in railway reservations, predicting diseases from a person's health conditions, and predicting students' examination grades. In the medical, railway, airline, and APSRTC sectors, predictive analysis is useful for taking preventive measures and for future planning. Predictive analytics is a process that falls under data analysis. With R, predictions on large data sets can be made quickly. This paper predicts the survival of passengers from a few factors, using the Titanic data set for the analysis. Survival is predicted from the factors gender, class, and age. Decision tree and random forest algorithms are used for prediction and for comparing the test data with the trained data set.
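
To make the decision-tree step concrete, the following is a minimal sketch and not the authors' implementation. It is written in Python rather than R, it covers only a single decision tree (not the random forest), and it uses a handful of hypothetical toy rows, not the real Titanic data. The names `ROWS`, `build`, and `predict` are illustrative assumptions. The tree chooses each split by minimizing weighted Gini impurity over the three factors named above.

```python
from collections import Counter

# Hypothetical toy rows (NOT the real Titanic data):
# (gender, passenger class, age, survived)
ROWS = [
    ("female", 1, 29, 1),
    ("female", 2, 35, 1),
    ("female", 3, 8, 1),
    ("female", 3, 40, 0),
    ("male", 1, 45, 0),
    ("male", 1, 6, 1),
    ("male", 2, 30, 0),
    ("male", 3, 22, 0),
    ("male", 3, 4, 1),
    ("male", 2, 50, 0),
]
FEATURES = {"gender": 0, "pclass": 1, "age": 2}  # column index of each factor
LABEL = 3                                        # column index of the outcome


def gini(rows):
    """Gini impurity of the survival labels in rows."""
    n = len(rows)
    if n == 0:
        return 0.0
    counts = Counter(r[LABEL] for r in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def matches(row, idx, value):
    """Numeric features split on <=, categorical ones on equality."""
    v = row[idx]
    return v <= value if isinstance(value, (int, float)) else v == value


def best_split(rows):
    """Return (weighted_gini, idx, value) of the best split, or None."""
    best = None
    for idx in FEATURES.values():
        for value in sorted({r[idx] for r in rows}):
            left = [r for r in rows if matches(r, idx, value)]
            right = [r for r in rows if not matches(r, idx, value)]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, idx, value)
    return best


def build(rows, depth=0, max_depth=3):
    """Grow a tree; leaves hold the majority class of their rows."""
    split = best_split(rows) if depth < max_depth and gini(rows) > 0 else None
    if split is None:
        return Counter(r[LABEL] for r in rows).most_common(1)[0][0]
    _, idx, value = split
    left = [r for r in rows if matches(r, idx, value)]
    right = [r for r in rows if not matches(r, idx, value)]
    return (idx, value,
            build(left, depth + 1, max_depth),
            build(right, depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its class (0 or 1)."""
    while isinstance(tree, tuple):
        idx, value, left, right = tree
        tree = left if matches(row, idx, value) else right
    return tree


tree = build(ROWS)
accuracy = sum(predict(tree, r) == r[LABEL] for r in ROWS) / len(ROWS)
print(f"training accuracy: {accuracy:.2f}")
```

Because the accuracy here is measured on the same rows used to grow the tree, it says nothing about generalization. Evaluating on held-out test rows, as the abstract describes, is the meaningful comparison.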