Transforming statistical linked data for use in OLAP systems
Author(s) - Benedikt Kämpgen, Andreas Harth
Publication year - 2011
Publication title - CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2063518.2063523
Subject(s) - online analytical processing, computer science, data warehouse, data mining, statistical model, visualization, data set, data modeling, pipeline (software), set (abstract data type), data visualization, database, data cube, statistical analysis, data science, information retrieval, machine learning, statistics, mathematics, artificial intelligence, programming language
The amount of Linked Data available on the Web is increasing, and data providers have started to publish statistical datasets that comprise numerical data. Such statistical datasets differ significantly from the network-style data that currently predominates on the Web. We explore the possibility of integrating statistical data from multiple Linked Data sources. We provide a mapping from statistical Linked Data to the Multidimensional Model used in data warehouses. We use an extract-transform-load (ETL) pipeline to convert statistical Linked Data into a format suitable for loading into an open-source OLAP system, and thus demonstrate how standard OLAP infrastructure can be used for elaborate querying and visualisation of integrated statistical Linked Data. We discuss lessons learned from three experiments and identify areas that require future work to ultimately arrive at a well-interlinked set of statistical data from multiple sources that can be processed with standard OLAP systems.
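The mapping and ETL steps summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `ex:` identifiers, the sample values, and the function names `extract`, `transform`, and `roll_up` are all hypothetical, and a plain in-memory roll-up stands in for the OLAP system. It assumes each statistical observation is a set of triples carrying one value per dimension plus one numeric measure.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; every identifier and
# value here is invented for illustration only.
TRIPLES = [
    ("ex:obs1", "ex:refArea", "ex:regionA"),
    ("ex:obs1", "ex:refPeriod", "2010"),
    ("ex:obs1", "ex:value", "100.0"),
    ("ex:obs2", "ex:refArea", "ex:regionB"),
    ("ex:obs2", "ex:refPeriod", "2010"),
    ("ex:obs2", "ex:value", "50.0"),
]
DIMENSIONS = ("ex:refArea", "ex:refPeriod")  # assumed dimension properties
MEASURE = "ex:value"                         # assumed measure property


def extract(triples):
    """Group triples by observation subject into property dictionaries."""
    observations = defaultdict(dict)
    for subject, predicate, obj in triples:
        observations[subject][predicate] = obj
    return observations


def transform(observations, dimensions, measure):
    """Build dimension tables with surrogate keys and a fact table."""
    dim_tables = {d: {} for d in dimensions}
    facts = []
    for subject in sorted(observations):
        props = observations[subject]
        row = {}
        for d in dimensions:
            members = dim_tables[d]
            # Assign the next surrogate key the first time a member is seen.
            row[d] = members.setdefault(props[d], len(members) + 1)
        row[measure] = float(props[measure])
        facts.append(row)
    return dim_tables, facts


def roll_up(facts, dimension, measure):
    """Sum the measure per surrogate key of one dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[row[dimension]] += row[measure]
    return dict(totals)


dim_tables, facts = transform(extract(TRIPLES), DIMENSIONS, MEASURE)
totals_by_period = roll_up(facts, "ex:refPeriod", MEASURE)
```

In this sketch the fact table references dimension members through integer surrogate keys, which is the usual star-schema layout; a real pipeline would write these tables to a relational store that the OLAP engine reads.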