New OverOps Reliability Dashboards Deepen Code-Level Visibility Across the Entire Software Delivery Life Cycle

Feb 5, 2019

SAN FRANCISCO – February 5, 2019 – OverOps, the software reliability platform, today debuted new Reliability Scoring capabilities that give enterprises deeper visibility across both pre-production and production environments, and enable them to automatically identify and prioritize anomalies before a release so that bad code is not promoted.

Using machine learning-based scoring, OverOps now provides an easily consumable view of environment reliability at every point in the software delivery life cycle, with the ability to click straight into the True Root Cause of a given issue. By analyzing the number of new errors, regressions and slowdowns introduced in each phase of pre-production and prioritizing those with critical impact, OverOps can accurately assess the reliability of a deployment compared to previous releases, integrate with CI/CD tooling and create quality gates that automatically block unstable releases.

"Most organizations are facing two primary dilemmas in their software delivery: 'how do I know if a release is ready to move forward, and once it has, how do I know how well it's doing?' Even with common testing and monitoring tools in place, there's still a large degree of uncertainty once code is released into the wild," said Tal Weiss, CTO and co-founder at OverOps. "OverOps now arms our customers with concrete data in an easily digestible format to validate the quality of any code or infrastructure change to an environment. Expanding on our flagship root cause analysis capabilities, we now not only help you to find and fix production errors fast, but we actually help stop them from making it there in the first place!"

As the pace of software delivery quickens, the risk of poor application quality increases. OverOps helps development, QA and operations teams balance this speed-stability paradox by leveraging its unique real estate inside an application to detect anomalies that would otherwise go unnoticed by existing monitoring tools. Utilizing micro-agents that operate between application code and the hardware, OverOps can capture data that was never previously available, then deduplicate and classify it and gate critical issues before they move into staging and production. The new Reliability Dashboards help organizations visualize this data in Grafana out-of-the-box, or in any tool of their choice through open REST APIs, so they can see at a glance where the issues are and drill deeper into the cause with one click. Executives can see instantly if the quality of code is improving across teams and identify areas of weakness that require additional attention or resources.

Highlights of the new OverOps Reliability Dashboards include:

Reliability Scorecards and Release Certification

The OverOps Reliability Scorecard allows DevOps teams to observe the reliability of their environment at the highest level and triage critical issues that need attention. Each deployment, application and infrastructure tier is assigned its own dynamic score derived from the detection, classification and prioritization of all anomalies – including newly introduced errors, increasing errors and performance slowdowns.

Using these scores, organizations can certify releases to be moved through their delivery pipeline, or stop them in their tracks to proactively fix any issues. Through the new Jenkins integration, QA teams can see all new anomalies introduced by any release in test or stage, and automatically assign each one a severity based on its potential impact on the code. OverOps will certify each release based on how many issues it introduced, and can automatically stop a risky release from being promoted, sending it back to the engineers with True Root Cause.
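The scoring-and-gating idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OverOps' actual scoring model: the penalty weights, the critical multiplier, the 0-100 scale, the promotion threshold and every name in the code are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical penalty weights per anomaly type; not OverOps' real model.
WEIGHTS = {"new_error": 5.0, "increasing_error": 3.0, "slowdown": 2.0}
# Hypothetical multiplier applied to anomalies flagged as critical.
CRITICAL_MULTIPLIER = 2.0


@dataclass
class Anomaly:
    kind: str              # "new_error", "increasing_error" or "slowdown"
    critical: bool = False


def reliability_score(anomalies):
    """Start at 100 and subtract a weighted penalty per anomaly, floored at 0."""
    penalty = 0.0
    for anomaly in anomalies:
        weight = WEIGHTS[anomaly.kind]
        penalty += weight * (CRITICAL_MULTIPLIER if anomaly.critical else 1.0)
    return max(0.0, 100.0 - penalty)


def gate(anomalies, threshold=85.0):
    """Return True to promote the release, False to block it."""
    return reliability_score(anomalies) >= threshold


# Example release with four anomalies, one of them critical.
release = [
    Anomaly("new_error", critical=True),
    Anomaly("new_error"),
    Anomaly("increasing_error"),
    Anomaly("slowdown"),
]
print(reliability_score(release), "promote" if gate(release) else "block")
```

In a CI/CD pipeline, a step like this could fail the build whenever gate() returns False, which is the general shape of a quality gate that blocks an unstable release.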

True Root Cause Drill-Down

From the Reliability Scorecard dashboard, users can drill into the details of low-scoring deployments, applications or infrastructure tiers, such as AWS or an RDBMS. The Reliability Analysis dashboard shows the corresponding anomalies and allows users to click straight into the True Root Cause dashboard using OverOps' ARC AI technology, where they can view the code and variable state at the moment of an error across the entire call stack, as well as the environment state and DEBUG-level statements. With this complete context, QA, DevOps and SRE teams can easily route issues back to the right developer, arming them with all the context needed to fix the error – programmatic or operational.

Reliability Trends Over Time

The OverOps Reliability Trends dashboard provides a simple and effective way of comparing releases, or two instances of an application running on different nodes, to identify patterns. Building on this capability, the dashboard provides executives with an easy way to see how well their applications and deployments are doing over time with respect to error volume, unique error count, newly introduced or increasing errors, and slowdowns. At a glance, VPs and CXOs can see which applications are falling behind, as well as which teams require more attention. By understanding application quality release over release, executives can make informed decisions about resources and protect application revenue and customer experience.

General Availability

OverOps Reliability Dashboards are immediately available. For information on pricing, visit https://www.overops.com/pricing.

Additional Resources

  • Read our blog post about the four quality gates every SRE team must check before promoting code.

  • Sign up to attend a webinar detailing how OverOps helps at every point in the software delivery life cycle.

  • Watch a live demo to see the OverOps Platform in action.

About OverOps

OverOps captures code-level insight about application quality in real time to help DevOps teams deliver reliable software. Operating in any environment, OverOps employs both static and dynamic code analysis to collect unique data about every error and exception – both caught and uncaught – as well as performance slowdowns. This deep visibility into an application's functional quality not only helps developers more effectively identify the true root cause of an issue, but also empowers ITOps to detect anomalies and improve overall reliability. As more organizations aim to innovate faster and deliver a seamless digital experience for their customers, OverOps helps avoid costly downtime that can lead to lost revenue and brand degradation. Backed by Lightspeed Venture Partners and Menlo Ventures, OverOps enterprise customers include Comcast, TripAdvisor and Intuit. The company has offices in San Francisco and Tel Aviv.