#NeotysPAC 2020 - Luca Cavazzana

Nov 2, 2020

Min $: Optimizing Spark Costs on AWS with AI-Powered Tuning

Big data performance tuning on cloud infrastructure involves complex trade-offs. We use AI techniques to identify the optimal configurations.

In this session, we’ll showcase how we used AI-powered techniques to cut the AWS Elastic MapReduce (EMR) costs of running batch jobs on an Apache Spark big data implementation. The target is a business intelligence application for the video-on-demand industry. The intervention resulted in cost savings of over 40%.

Performance tuning for big data frameworks can be challenging. The sheer number of parameters across different layers (e.g., the Spark framework, the JVM, YARN) and their interdependencies make predicting and optimizing performance immensely complex.
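To make that parameter surface concrete, here is a minimal, hypothetical PySpark sketch (values chosen purely for illustration, not taken from the study) showing how knobs at the Spark, JVM, and YARN layers all appear in a single job configuration and interact with one another.

```python
# Hypothetical example of tuning knobs spanning several layers of the stack.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuning-surface-example")
    # Spark layer: parallelism and shuffle behaviour
    .config("spark.sql.shuffle.partitions", "400")
    .config("spark.default.parallelism", "400")
    # JVM layer: executor heap size and garbage-collector choice
    .config("spark.executor.memory", "8g")
    .config("spark.executor.extraJavaOptions", "-XX:+UseG1GC")
    # YARN layer: container sizing and off-heap overhead
    .config("spark.executor.cores", "4")
    .config("spark.executor.memoryOverhead", "1g")
    .config("spark.dynamicAllocation.enabled", "false")
    .getOrCreate()
)
```

Changing any one of these values shifts the pressure on the others: for example, a larger executor heap reduces the number of containers YARN can place per node, which in turn changes the effective parallelism.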

Running big data applications on the cloud adds further complexity, with even more options to find the optimal cluster configuration, such as instance family, size and number.
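As an illustration of those cloud-side options, the hypothetical boto3 sketch below shows how instance family, size, and count are specified per instance group on EMR. The instance types, counts, and region are placeholders, not the cluster used in the study.

```python
# Hypothetical EMR instance-group definition: family, size, and count are
# all separate tuning dimensions on top of the Spark parameters.
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # example region

instance_groups = [
    {
        "Name": "master",
        "InstanceRole": "MASTER",
        "InstanceType": "m5.xlarge",   # instance family + size
        "InstanceCount": 1,
    },
    {
        "Name": "core",
        "InstanceRole": "CORE",
        "InstanceType": "r5.2xlarge",  # memory-optimized family
        "InstanceCount": 6,            # cluster size: yet another knob
    },
]

# These groups would be passed to emr.run_job_flow(...) together with the
# Spark configuration; every combination changes both runtime and hourly cost.
```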

As a result, teams have to rely on vendor guidelines and generic rules of thumb, which may leave much of an expensive cluster's potential untapped.

Our approach uses automation and AI techniques to iteratively identify optimal stack configurations, regardless of the stack's complexity. In this study, we tuned both the Apache Spark parameters and the EC2 cluster size, finding an optimal trade-off between resource allocation and execution time that minimizes the overall cost.
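For intuition only, the sketch below shows the shape of such an iterative cost-minimization loop over a joint Spark/cluster search space: propose a candidate, measure the run, compute cost as cluster price times runtime, and keep the cheapest configuration. It uses random sampling and a stubbed job runner as placeholders; it is not the AI-driven search actually used in the study, and all prices and parameter values are illustrative.

```python
# Conceptual sketch of an iterative cost-minimization loop (illustrative only).
import random

HOURLY_PRICE = {"m5.xlarge": 0.192, "r5.2xlarge": 0.504}  # example on-demand prices (USD/h)

SEARCH_SPACE = {
    "instance_type": ["m5.xlarge", "r5.2xlarge"],
    "instance_count": [4, 6, 8, 12],
    "spark.sql.shuffle.partitions": [200, 400, 800],
    "spark.executor.memory": ["4g", "8g", "16g"],
}

def sample_config():
    """Draw one random candidate configuration from the search space."""
    return {key: random.choice(values) for key, values in SEARCH_SPACE.items()}

def run_job(config):
    """Stub standing in for a real EMR run: returns a simulated runtime in hours.
    In practice this would submit the Spark job with `config` and measure it."""
    base_hours = 4.0  # purely illustrative baseline
    return base_hours / config["instance_count"] * random.uniform(0.9, 1.1)

def cost(config, runtime_hours):
    """Total cluster cost: per-instance price * number of instances * runtime."""
    return HOURLY_PRICE[config["instance_type"]] * config["instance_count"] * runtime_hours

best = None
for _ in range(20):                          # budget of tuning iterations
    candidate = sample_config()
    runtime_hours = run_job(candidate)       # run (or simulate) the batch job
    candidate_cost = cost(candidate, runtime_hours)
    if best is None or candidate_cost < best[1]:
        best = (candidate, candidate_cost)   # keep the cheapest configuration so far

print("cheapest configuration found:", best)
```

The key point the loop captures is the trade-off in the cost function: a bigger cluster finishes sooner but bills more instances per hour, so the minimum-cost configuration is rarely either the smallest or the largest cluster.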