
A Dockerized big data architecture for sports analytics
Author(s) -
Yavuz Melih Özgüven,
Utku Gönener,
Süleyman Eken
Publication year - 2022
Publication title -
computer science and information systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.244
H-Index - 24
eISSN - 2406-1018
pISSN - 1820-0214
DOI - 10.2298/csis220118010o
Subject(s) - big data , computer science , spark (programming language) , analytics , data science , architecture , data analysis , data mining , art , visual arts , programming language
The big data revolution has had an impact on sports analytics as well. Many large corporations have begun to see the financial benefits of integrating sports analytics with big data. When we rely on central processing systems to aggregate and analyze large amounts of sport data from many sources, we compromise the accuracy and timeliness of the data. As a response to these issues, distributed systems come to the rescue, and the MapReduce paradigm holds promise for large scale data analytics. We describe a big data architecture based on Docker containers with Apache Spark in this paper. We evaluate the architecture on four data-intensive case studies in sport analytics including structured analysis, streaming, machine learning approaches, and graph-based analysis.