Performance Evaluation of Map Reduce vs. Spark framework on Amazon Machine Image for TeraSort Algorithm

Open Access

Author(s) - Gangadhara Rao Kommu

Publication year - 2021

Publication title - International Journal for Research in Applied Science and Engineering Technology

Language(s) - English

Resource type - Journals

ISSN - 2321-9653

DOI - 10.22214/ijraset.2021.35540

Subject(s) - speedup, computer science, spark (programming language), sorting, java, implementation, generator (circuit theory), parallel computing, algorithm, sorting algorithm, data mining, operating system, power (physics), physics, quantum mechanics, programming language

TeraSort is one of Hadoop’s widely used benchmarks. The Hadoop distribution ships both the input generator and the sorting implementation: TeraGen generates the input and TeraSort sorts it. We compare the TeraSort algorithm on different distributed platforms under different resource configurations. Efficiency is measured by compute time, data read, data write, and speedup. We conducted experiments using Hadoop MapReduce and Spark (Java). We empirically evaluate the performance of the TeraSort algorithm on Amazon EC2 machine images and demonstrate a speedup of 2.4× to 3.95×, compared with TeraSort, for typical settings of interest.
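
As a rough illustration of the ideas the abstract names, the following Python sketch mimics the generate-then-sort structure of the benchmark on a single machine and shows how a speedup ratio is computed. It is not the paper's code, and it is neither Hadoop's nor Spark's implementation: the function names, the sample size, and the timings in the usage note are all hypothetical. The only facts assumed from the benchmark itself are that TeraGen rows are 100 bytes with a 10-byte key, and that TeraSort samples keys to pick partition boundaries so that sorted partitions can simply be concatenated.

```python
import bisect
import random

KEY_BYTES = 10    # TeraGen rows carry a 10-byte key ...
VALUE_BYTES = 90  # ... followed by a 90-byte payload (100 bytes per row)


def generate_records(n, seed=0):
    """Toy stand-in for TeraGen: return n random (key, value) byte rows."""
    rng = random.Random(seed)

    def rand_bytes(k):
        return bytes(rng.getrandbits(8) for _ in range(k))

    return [(rand_bytes(KEY_BYTES), rand_bytes(VALUE_BYTES)) for _ in range(n)]


def choose_split_points(records, num_partitions, sample_size=100, seed=1):
    """Sample keys and pick num_partitions - 1 boundaries from the sorted sample."""
    rng = random.Random(seed)
    picked = rng.sample(records, min(sample_size, len(records)))
    sample = sorted(key for key, _ in picked)
    step = len(sample) / num_partitions
    return [sample[int(i * step)] for i in range(1, num_partitions)]


def range_partition_sort(records, num_partitions):
    """Route each row to a key range, sort each range, then concatenate."""
    splits = choose_split_points(records, num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for rec in records:
        # bisect_right gives the index of the key range this row falls into.
        partitions[bisect.bisect_right(splits, rec[0])].append(rec)
    output = []
    for part in partitions:  # in a cluster, each range would be sorted by a separate task
        output.extend(sorted(part, key=lambda r: r[0]))
    return output


def speedup(baseline_seconds, candidate_seconds):
    """Speedup of the candidate run relative to the baseline run."""
    return baseline_seconds / candidate_seconds
```

For example, with hypothetical timings of 120 s for a baseline run and 40 s for a candidate run, `speedup(120.0, 40.0)` gives 3.0. Because every key in one range is smaller than every key in the next, concatenating the sorted ranges yields a globally sorted output without a final merge step.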
