Capacity Optimizer with Managed Autoscaling

Capacity Optimizer is an automated solution that continuously improves the performance of analytics clusters without manual intervention or application tuning.

Learn more about Pepperdata Capacity Optimizer: https://www.pepperdata.com/product/capacity-optimizer/

#CapacityOptimizer #ManagedAutoscaling #Pepperdata

The Capacity Optimizer is an automated solution that continuously improves the performance of analytics clusters without manual intervention or application tuning. Essentially what you get is patented machine learning that is automatically tuning your analytics platform in a way that allows more work to get done over time with the existing hardware.

The tile on the right of the screen is showing the number of tasks that are being run because Capacity Optimizer is applying intelligent resource management to the platform. Typically what we see at the enterprise level, with very busy clusters or even with growing clusters, is that you get a 30 to 50% increase in the throughput of a platform.

That means the nodes are able to do 30 to 50% more work than they could without Capacity Optimizer. With this automatic tuning, we're managing the number of cores and the amount of memory available to the scheduler in real time.

We're doing this so that we can free up resources that would be wasted by inefficient applications. If, for example, an application asks for 100 cores but only uses 20 cores, the scheduler can't leverage the other 80 cores without some sort of intervention. You either have to retune the app, or you have to change the base allocations on the nodes.
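To make the 100-versus-20 example concrete, here is a minimal sketch of the arithmetic. The `AppUsage` type, the `idle_cores` helper, and the application name are hypothetical and are not part of any Pepperdata API; they only illustrate how requested and used cores differ.

```python
from dataclasses import dataclass


@dataclass
class AppUsage:
    """Hypothetical record of what an application reserved vs. what it uses."""
    name: str
    requested_cores: int
    used_cores: int


def idle_cores(app: AppUsage) -> int:
    """Cores that are reserved for the app but not actually in use."""
    return max(app.requested_cores - app.used_cores, 0)


# The example from the talk: 100 cores requested, 20 actually used.
apps = [AppUsage("example_app", requested_cores=100, used_cores=20)]

total_idle = sum(idle_cores(a) for a in apps)
print(total_idle)  # 80
```

Without intervention, the scheduler still counts those 80 cores as taken, because it only sees the reservation.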

So what we're doing with Capacity Optimizer is automatically viewing every running application and informing the resource manager and schedulers what can be used despite what's been scheduled. Essentially, we're saying: here's a view of the actual hardware, and you can look around the reservation system, because that's not really telling you what's being used.

So, without Capacity Optimizer, the nodes in the example on screen would be wasting roughly 40% of the RAM at peak, given the inefficiency of the applications in the platform. This is a platform from a very large-scale environment running hundreds of nodes. And this is not atypical when you look at the way things work in these environments: you've got thousands of applications, and no one's got time to tune all of them.

So, the base allocation at the bottom for these nodes is about 60 GB. With Capacity Optimizer, we're able to automatically adjust that up and down so that the resource manager and the scheduler can get to the actual hardware, and move that reservation system as needed to mitigate those wasteful applications. With this feature, we're looking at a number of things to maintain safety in the environment and not overrun actual resources.
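The idea of moving the allocation up and down can be sketched roughly as follows. This is not Pepperdata's algorithm: the function name, the `physical_gb` and `headroom_gb` parameters, and every number other than the 60 GB base are assumptions chosen for illustration.

```python
def advertised_memory_gb(
    base_gb: float,
    reserved_gb: float,
    used_gb: float,
    physical_gb: float,
    headroom_gb: float,
) -> float:
    """Hypothetical: memory a node could advertise to the scheduler.

    Starts from the base allocation and adds back memory that is reserved
    but unused, never exceeding what is physically free after headroom.
    """
    reserved_but_unused = max(reserved_gb - used_gb, 0.0)
    physically_free = max(physical_gb - used_gb - headroom_gb, 0.0)
    extra = min(reserved_but_unused, physically_free)
    return base_gb + extra


# Assumed node: 60 GB base allocation fully reserved, 36 GB actually used
# (about 40% of the reservation idle), 64 GB physical RAM, 4 GB headroom.
print(advertised_memory_gb(60.0, 60.0, 36.0, 64.0, 4.0))  # 84.0

# When usage climbs, the advertised value falls back toward the base.
print(advertised_memory_gb(60.0, 60.0, 58.0, 64.0, 4.0))  # 62.0
```

Capping the increase by physically free memory is one simple way to keep the adjustment from overrunning the actual hardware.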

We have memory thresholds, CPU thresholds, and I/O thresholds, so that we don't see that a node has, say, available memory and available CPU and add more nodes when the I/O subsystem is saturated. We're also doing automatic swap detection, as well as measuring how much swap is taking place, because some swap is okay.

When swapping gets extreme, we trigger a pullback, and we turn off the feature when need be so that we're not overrunning the resources, even if there's something running outside of what you would typically run in your analytics workloads. We're able to see everything running on the platform and maintain safety with the thresholds we have in place.
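The safety checks described here can be pictured as a small decision function. The sketch below is an assumption-laden illustration, not the product's logic: the `Action` names, the metric fields, and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    GROW = "grow"            # safe to expose more capacity
    HOLD = "hold"            # a threshold is hit; do not add more
    PULL_BACK = "pull_back"  # swap is heavy; reduce exposed capacity
    DISABLE = "disable"      # swap is extreme; turn the feature off


@dataclass
class NodeMetrics:
    mem_util: float       # fraction of RAM in use, 0.0 to 1.0
    cpu_util: float       # fraction of CPU in use, 0.0 to 1.0
    io_util: float        # fraction of I/O capacity in use, 0.0 to 1.0
    swap_mb_per_s: float  # measured swap activity


# Hypothetical limits; real values would be tuned per environment.
MEM_LIMIT, CPU_LIMIT, IO_LIMIT = 0.90, 0.90, 0.85
SWAP_PULL_BACK, SWAP_DISABLE = 5.0, 50.0


def decide(m: NodeMetrics) -> Action:
    """Pick an action from node metrics; swap checks take priority."""
    if m.swap_mb_per_s >= SWAP_DISABLE:
        return Action.DISABLE
    if m.swap_mb_per_s >= SWAP_PULL_BACK:
        return Action.PULL_BACK
    # Free memory and CPU are not enough if the I/O subsystem is saturated.
    if m.mem_util >= MEM_LIMIT or m.cpu_util >= CPU_LIMIT or m.io_util >= IO_LIMIT:
        return Action.HOLD
    return Action.GROW


# Memory and CPU look free, but I/O is saturated: hold instead of growing.
print(decide(NodeMetrics(0.50, 0.40, 0.95, 0.0)))  # Action.HOLD
```

A small amount of swap stays below the pull-back limit, which matches the point that some swap is okay.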

The ROI that you get from Capacity Optimizer can be measured in real dollars. This is an example from a cluster that is running over a thousand nodes. This $1.8 million in annual savings is how much money this shop would have to spend to get the memory that would otherwise have been wasted...

Check out our blog: https://www.pepperdata.com/blog/
