Open Access
A Schema Extraction of Document-Oriented Database for Data Warehouse
Author(s) -
A. Nurul Istiqamah,
Kemas Rahmat Saleh Wiharja
Publication year - 2021
Publication title -
International Journal on Information and Communication Technology
Language(s) - English
Resource type - Journals
ISSN - 2356-5462
DOI - 10.21108/ijoict.v7i2.584
Subject(s) - data warehouse, computer science, schema (genetic algorithms), unstructured data, star schema, database, cloud computing, information retrieval, database schema, data extraction, popularity, dimensional modeling, conceptual schema, data science, data mining, big data, database design, psychology, social psychology, developmental psychology, medline, political science, gender schema theory, law, operating system
The data warehouse is a well-known solution for analyzing business data from heterogeneous sources. However, a data warehouse can analyze only structured data, while the popularity of social media and the ease of creating data on the web have produced a flood of unstructured data. We therefore need an approach that can turn unstructured data into structured data that a data warehouse can process. To this end, we propose a schema extraction approach that uses Google Cloud Platform to create a schema from unstructured data. In our experiment, the approach successfully produces a schema from unstructured data. To the best of our knowledge, we are the first to use Google Cloud Platform for schema extraction. We also show that our approach helps database developers understand unstructured data better.
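
As a rough illustration of what extracting a schema from document-oriented data can involve, the sketch below infers field paths, observed types, and whether each field is present in every document from a small collection of JSON-like records. It is not the authors' method and does not use Google Cloud Platform; the function names `type_name` and `infer_schema`, the type labels, and the sample documents are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Any


def type_name(value: Any) -> str:
    """Map a Python value to a coarse, hypothetical schema type label."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool is a subclass of int, so test it first
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "record"
    return "unknown"


def infer_schema(documents: list) -> dict:
    """Return, per dotted field path, the observed types and a 'required' flag.

    A field is marked required only if it appears in every document.
    """
    types = defaultdict(set)   # path -> set of type labels seen
    counts = defaultdict(int)  # path -> number of documents containing it

    def walk(node: dict, prefix: str, seen: set) -> None:
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            types[path].add(type_name(value))
            seen.add(path)
            if isinstance(value, dict):  # descend into nested records
                walk(value, path, seen)

    for doc in documents:
        seen = set()
        walk(doc, "", seen)
        for path in seen:
            counts[path] += 1

    total = len(documents)
    return {
        path: {"types": sorted(types[path]), "required": counts[path] == total}
        for path in sorted(types)
    }


# Illustrative documents (invented for this sketch, not taken from the paper).
docs = [
    {"user": {"name": "ana", "age": 30}, "text": "hello"},
    {"user": {"name": "budi"}, "text": "hi", "likes": 3},
]
schema = infer_schema(docs)
for path, info in schema.items():
    print(path, info)
```

A flattened description like this could then be reviewed by a database developer when deciding which fields become fact or dimension columns in a warehouse; that mapping step is left out of the sketch.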