Operations | Monitoring | ITSM | DevOps | Cloud

Latest posts

Migrating From Your Tool to Squadcast

In our recent blog, we talked about why having separate tools for on-call and alerting sucks, and how Squadcast offers a lifeline with its all-in-one Incident Management and Reliability Automation platform, which brings the functionality of multiple tools under one roof. This blog is all about how you can easily transition from your current incident management and alerting tool to a more reliable, enterprise-grade platform with Squadcast.

Real-world Observability AI: An Interactive Chat with Logz.io IQ Assistant

There’s so much hype around the use of AI in observability — but how does that translate into making tangible progress with your day-to-day tasks? At Logz.io we’ve introduced an AI-based chatbot assistant to the Open 360 platform that automatically delves into your stack, fine-tunes your workflows and enables conversation directly with your systems and data.

Website Availability Monitoring

Website availability monitoring is checking your website regularly to make sure it is accessible and working for users at all times. This involves testing your site's uptime, which is the time your website is up and available, as well as its performance, such as loading speed and responsiveness. By monitoring your website from different locations around the world, you can get a view of how your site is performing for users in different regions.
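
As a rough illustration of the idea, the sketch below probes a URL, records whether it responded and how long the response took, and turns a series of checks into an uptime percentage. The names `CheckResult`, `http_status`, `check`, and `uptime_percent` are hypothetical and not taken from any particular monitoring product. Treating 2xx and 3xx responses as "up" is also an assumption. A real service would run such checks on a schedule from several regions.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    url: str
    up: bool
    status: Optional[int]  # None when no usable response was received
    seconds: float         # time spent on the probe


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request (raises on failure)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def check(url: str, fetch: Callable[[str], int] = http_status) -> CheckResult:
    """Probe one URL once, recording availability and response time."""
    start = time.perf_counter()
    try:
        status = fetch(url)
    except Exception:  # DNS failure, timeout, refused connection, HTTP error
        status = None
    elapsed = time.perf_counter() - start
    # Assumption: any 2xx or 3xx status counts as "available".
    up = status is not None and 200 <= status < 400
    return CheckResult(url, up, status, elapsed)


def uptime_percent(results: list[CheckResult]) -> float:
    """Share of checks that succeeded, as a percentage."""
    if not results:
        return 0.0
    return 100.0 * sum(r.up for r in results) / len(results)
```

Calling `check("https://example.com")` performs one live probe. Passing a different `fetch` function makes it possible to exercise the logic without network access.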

APIs: The Silent Heroes of Data Center Management

Data centers typically operate a diverse array of systems including environmental controls, power management, IT service management (ITSM) platforms, and enterprise resource planning (ERP) systems. DCIM software with well-documented, open APIs ensures these systems can communicate and function cohesively. Interoperability fosters.

How the Financial Services sector is moving to the cloud, and what it means for monitoring

Redgate recently published the 2024 State of the Database Landscape report, which explores how the challenges for data professionals now encompass a lot more than managing and monitoring their database estates for high availability and optimum performance. Database DevOps, multiple database platforms, the cloud, AI, and making data available for development and testing have now also become part of the daily conversation.

How to use OpenTelemetry resource attributes and Grafana Cloud Application Observability to accelerate root cause analysis

Let’s imagine a scenario: you use OpenTelemetry, and your observability backend runs on several hosts. You collect data on application latency, and notice a recent increase that you want to investigate. But how will you know which host caused the degradation? This is exactly where OpenTelemetry resources come in. In the context of OpenTelemetry, a resource represents the entity producing the telemetry data, such as a container, host, process, service, or operating system.
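
To make the scenario concrete, here is a minimal sketch in plain Python that does not use the OpenTelemetry SDK. It assumes each latency record carries the resource attributes of the entity that produced it, and groups latency by one attribute to find the slowest host. The keys `host.name` and `service.name` follow OpenTelemetry's semantic conventions. The record layout, the sample values, and the helper names `mean_latency_by` and `slowest` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical telemetry records: each carries the resource attributes
# of the entity (here, a service on a host) that produced it.
spans = [
    {"resource": {"service.name": "checkout", "host.name": "host-a"}, "latency_ms": 120.0},
    {"resource": {"service.name": "checkout", "host.name": "host-a"}, "latency_ms": 130.0},
    {"resource": {"service.name": "checkout", "host.name": "host-b"}, "latency_ms": 480.0},
    {"resource": {"service.name": "checkout", "host.name": "host-b"}, "latency_ms": 520.0},
    {"resource": {"service.name": "checkout", "host.name": "host-c"}, "latency_ms": 110.0},
]


def mean_latency_by(records, attribute):
    """Average latency per distinct value of one resource attribute."""
    grouped = defaultdict(list)
    for record in records:
        key = record["resource"].get(attribute, "unknown")
        grouped[key].append(record["latency_ms"])
    return {key: mean(values) for key, values in grouped.items()}


def slowest(records, attribute):
    """Attribute value with the highest average latency."""
    averages = mean_latency_by(records, attribute)
    return max(averages, key=averages.get)


print(mean_latency_by(spans, "host.name"))
print(slowest(spans, "host.name"))
```

Without the resource attribute, all five records would collapse into a single average for the service, and the host responsible for the increase would be invisible.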

Jaeger vs New Relic - Choosing Your Ideal Tool

If your application is as busy as a highway with multiple lanes, intersections, and exits, imagine trying to track the journey of a single car from start to finish. Sounds tricky, right? Well, that's what happens when you're dealing with modern, complex software systems. Enter distributed tracing, your trusty GPS for navigating the intricate web of microservices and dependencies within your applications.

Communicate scheduled maintenance with StatusIQ

Failure to communicate scheduled maintenance often results in unexpected downtime, which frustrates users and disrupts their workflows. It not only confuses users but also burdens IT support teams with a surge of customer queries. In this blog, learn effective strategies and best practices for clearly communicating scheduled maintenance activities to stakeholders.