Does Observability Throw You for a Loop? Part Two: Close with Controllability

In part one, we introduced the duality of observability and controllability. As a reminder, observability is the ability to infer the internal state of a “machine” from externally exposed signals. Controllability is the ability to control inputs to direct the internal state toward the desired outcome. So observability is a loop problem, and we need to stop treating it as the end state of our challenge in delivering performant, quality experiences to our users and customers.

Adapting to The New Normal in IT Operations

The waves of change are certainly upon us, and businesses are being forced to adapt at a record pace. Current world events have caused a jarring shift in all aspects of our lives, accelerating major changes in how we live and work. An unprecedented number of people are now working from home, and those of us working in IT Operations are no exception. Many companies are implementing a Distributed IT Operations Center (D-NOC) approach to address this new reality.

Monitor Apache Flink with Datadog

Apache Flink is an open source framework, written in Java and Scala, for stateful processing of real-time and batch data streams. Flink offers robust libraries and layered APIs for building scalable, event-driven applications for data analytics, data processing, and more. You can run Flink as a standalone cluster or use infrastructure management technologies such as Mesos and Kubernetes.

How Logz.io Engineers monitor their multi-tenant SaaS offering with Logz.io

Logz.io is a Cloud Observability Platform that helps engineering teams quickly identify and resolve production issues using the best open source tools available for metrics and log monitoring: ELK and Grafana. In this webinar, one of the engineers who built the product, Roi Ravhon, showed how the Logz.io engineering team uses Logz.io to deliver more reliable, performant, and secure services to our customers.

Incident Response in the time of Remote Work

The unexpected and sudden shift to remote working introduces a new set of problems within the incident response space. While each organization needs to take its own unique circumstances into account, this post outlines best practices and steps that can be taken to keep operations both productive and proactive.

Modern shadow IT demands visibility, not control

“Shadow IT” can be a divisive subject depending on how long you’ve been in the IT field. There is a legacy attitude within many IT teams that shadow IT must be controlled – but it can bring significant benefits to an organization. Modern IT teams understand these benefits, and focus on balancing shadow IT’s value and risk. Moving past that legacy attitude and developing a modern IT mentality in your organization can be difficult.