Big Data Performance Management Solution Top Considerations

The growing adoption of Hadoop and Spark has increased demand for Big Data and Performance Management solutions that operate at scale.

Learn why enterprise clients use Pepperdata products and services:

However, enterprise organizations quickly realize that scaling from pilot projects to large-scale production clusters involves a steep learning curve. Despite progress, DevOps teams still struggle with multi-tenancy, cluster performance, and workflow monitoring. This webinar discusses the top considerations when choosing a big data performance management solution.

In this webinar, field engineer Alex Pierce discusses the key things to consider when choosing a big data performance management solution. Learn how to:

– Maximize your infrastructure investment
– Achieve up to a 50 percent increase in throughput and run more jobs on existing infrastructure
– Ensure cluster stability and efficiency
– Avoid overspending on unnecessary hardware
– Spend less time in backlog queues

Learn how to automatically tune and optimize your cluster resources and recapture wasted capacity. Alex will walk through use case examples to demonstrate the types of results you can expect to achieve in your own big data environment.

More on the episode:
So, let's go ahead and start it off. First of all, we're going to look a little bit at macro trends and big data performance. Clearly, we're seeing a very large uptick in cloud computing utilization, machine learning, including the addition of GPUs to some Spark workloads.

There's a lot more interest in data governance than we have seen in the past, which is very nice to see. And of course, everybody wants to go fast. That's just the way things are moving in this space. So, our first poll question: What are your strategies for increasing big data performance? This will be up on your end for a little bit, so please answer if you can.

I will be reading each question as we get it. So, when you're determining what is the best performance management solution for your big data environment, what do you need to consider? What are the challenges that are leading us to do this? One of the big ones: lack of automation.

A lot of manual work goes into running and maintaining a big data solution. Homegrown solutions: well, they seem like a good idea at the time, but you have to worry about an employee leaving, supportability, and scalability. Cloud support: a lot of companies are starting to move from legacy on-premises systems into the cloud.

Does your performance management solution move with you? Siloed solutions: you really want to understand what's going on in the big picture, not just have a solution for your Spark, a solution for your SQL, a solution for your Hadoop. You really need something that covers all the vendors you're dealing with in your environment. And multi-platform monitoring is complex. I mean, there's a lot going on there.

You need tooling that supports many different things. And how much training is required to install and support your solution? So, these are the challenges we're going to talk a little bit about, and what you need to consider for each of them. So, we're going to start with: where are you on the path to big data performance management? Are you early? Do you have a couple of single use cases you're growing? You may have multiple use cases.

And now, you're starting to ask about scaling and finding out how things are breaking. And the favorite question: Can I tune my way out of this? And then, with mature companies, we’re talking about multiple business units, principal SLAs, some sort of data hub or data lake solution. Now, we need to think about “Okay. What is my growth rate? How do I pay for this? Why can't my vendor-provided provisioning tools also support monitoring at the scale I need?”

So, part of this is, you’ve got to ask the right questions. What does your tool need to support? Does it need to support multi-tenancy? That's one of the biggest things in big data. You are going to have multiple users running multiple types of workloads, and multiple workloads. Performance optimization in general: you can't do this alone. You need someone to help you make it work better. Workflow monitoring...
