The latest news and information on observability for complex systems and related technologies.
Power has a way of flowing towards people managers over time, no matter how many times you repeat “management is not a promotion, it’s a career change.” It’s natural, like water flowing downhill. Managers are privy to performance reviews and other personal information that they need to do their jobs, and they tend to be more practiced communicators.
Businesses today need to react instantly to changes or alerts that impact the digital experience. Full-stack visibility can help.
Cloud native and microservice architectures bring many advantages in terms of performance, scalability, and reliability, but they can also bring complexity. Having requests move between services can make debugging much more challenging, and many of the older rules for monitoring applications no longer work well. This is made even more difficult by the fact that cloud services are inherently ephemeral, with containers constantly being spun up and spun down.
How do you pass context from events that concern Security teams to Development teams who can make changes and address those events? Often this involves a series of meetings and discussions that can take days or weeks to filter down from security event to developer awareness. Compounding the problem, developers generally do not have access to the Splunk Core, Cloud or Enterprise indexes used by security teams, and indeed may use only Splunk Observability for their metrics, traces and even logs.
If you were pulled into a meeting right now and asked to give your thoughts on how to achieve better outcomes with monitoring and observability, what would you recommend? Would you default to suggesting that your team improve Mean Time To Detect (MTTD)? Sure, you might make some improvements in that area, but it turns out that most of the opportunities lie in what comes after your system detects an issue. Let’s examine how to measure improvements in monitoring and observability.
I care a lot about instrumentation and telemetry and OpenTelemetry, so I was thinking of joining the observability engineering team at my company… but it seems like they spend all their time managing Prometheus and Grafana. I guess I was expecting something very different?
Organizations in every industry are becoming increasingly dependent upon data to drive more efficient business processes and a better user experience. As the data collection and preparation processes that support these initiatives grow more complex, the likelihood of failures, performance bottlenecks, and quality issues within data workflows also increases.