Latest posts

Synced for Success: OnPage & Slack for Incident Response

As the post-pandemic world finds its footing again, a resilient spirit drives the revival, propelling businesses to embrace a new era of technological innovation. Notably, IT teams are swiftly adopting the digital transformation of their processes, particularly in incident response. From virtual collaboration tools and remote IT support to automated incident management, teams have found innovative ways to ensure seamless business continuity while delivering IT services with minimal downtime.

Creating a Location Based Business Service

Location-based business services allow customers to manage their site’s infrastructure devices easily and monitor the Health, Availability and Risk of each service in a single pane of glass. Automating the creation and updating of the schema for location-based business services can save time and cost. This video explains how this new solution can help you create or update your location-based business service.

Elastic APM - Automatic .NET Instrumentation with OpenTelemetry

Check out this YouTube video on Elastic Application Performance Monitoring (APM) and its integration with OpenTelemetry for .NET! In this informative and practical tutorial, we delve into the world of APM and demonstrate how to effectively instrument your .NET applications using OpenTelemetry with Elastic APM.

SOLD IN 6 SECONDS: Formulas to Fast & Friction-Free E-commerce

Today, we rely on the Internet for everything, from purchasing a $3 coffee to buying a shiny new laptop. There are many moving parts in the modern web, each of which can create a chaotic experience. With back-to-school season around the corner, Sold in 6 Seconds is a discussion about what makes a smooth and successful e-commerce experience. The curriculum? Watch and learn the formula to ensure your conversion rates pass the test.

Managing Prometheus cardinality in Grafana Cloud: Adaptive Metrics FAQ

One of the most talked about topics in observability today is centered around the question of how to get more value out of the ever-increasing amount of data collected by agents, collectors, scrapers, and the like. Back in May, we announced Adaptive Metrics, a new feature in Grafana Cloud that allows you to reduce the cardinality of Prometheus metrics and the overall volume and costs of your metrics.

Scaling Up to Keep Costs Down: Automation for Web Application Incident Management

Any organization that’s keeping up with today’s sharp rise in business demands (or better yet, getting ahead of the game) is doing so by getting innovative and jumping at the chance to do things differently. They’re not relying on the old ways or trying to use their existing toolbox. Instead, organizations are looking to the newest technologies and means of adding efficiency to as many day-to-day functions as possible.

Reliability Best Practices: How Gremlin Uses Gremlin

Ensuring software availability is essential for any SaaS company—including Gremlin. To do that, our teams need to identify the reliability risks hiding in our systems. That’s why our development, platform, and SRE teams use Gremlin regularly to perform Chaos Engineering experiments, run reliability tests, and track the reliability of our systems against our standards. Along the way they’ve picked up a thing or two about how to find and fix reliability risks with Gremlin.

Behind the Scenes: Mattermost OpenOps AI Mindmeld | July 27, 2023

Tune in for a behind-the-scenes discussion on the advancement of Mattermost's AI tools and how they're being integrated into the team's current projects. The main topics covered include using AI to create tweets, the potential of using the tool to auto-generate text that resembles a user's tone, how to improve public awareness and involvement in OpenOps, and more.