Operations | Monitoring | ITSM | DevOps | Cloud

From Manual Requests to Self-Serve: Building an Access-Controlled App that Adapts Automatically

Platform teams often end up as the bottleneck for “small” operational asks: add a new button, wire up a workflow, expose one more cloud capability. Each change requires engineering time, reviews, and releases. In this technical deep dive, engineers from the Department of Government Services (Victoria) share the architecture and open-source CDK library behind their “Infrastructure Control Panel”: a modular operational enablement app that lets non-technical users interact safely with cloud resources through strong access controls.

We Know Before it Breaks: Observability-Driven Development

When stakeholders push for faster growth (new markets, new features, a newly modernized stack), your engineering model has to change too. At FitnessPassport, the shift from offshore waterfall delivery to an in-house team meant rebuilding not just services, but confidence: legacy systems with weak logging and little visibility made it hard to know whether changes were working and impossible to spot issues before users did. In this talk, Director of Engineering Rob Mitchell will share how FitnessPassport adopted Datadog and used structured logs, metrics, and traces to tighten feedback loops.

End-to-End Reliability for all your Workloads

Delivering great products to your customers requires a mix of evolution and consistency. To really land with users, your product has to be ready to adapt and scale, prioritizing across a mix of customer and business needs. Join experts in reliability, systems engineering, and DevOps as they share real-world examples, true stories of pitfalls, and astounding impact from the experiments they have run. Learn how experienced practitioners handle failure, adapt to scale, and bridge gaps between teams to improve software performance and customer outcomes.

Performance Testing vs Load Testing: Simple Difference

Learn the clear difference between performance testing and load testing in this quick video. Performance testing checks how well your software works under different conditions, measuring qualities like speed, stability, and scalability. Load testing focuses only on how the system handles expected user traffic. If you want to build reliable applications, knowing the difference between the two helps you test smarter. Perfect for developers, testers, and QA teams.

When we say "Observability AI Reckoning," what are we actually talking about?

We’ve spent the last decade collecting more telemetry. Now AI is analyzing it. Here’s the catch: AI needs the full dependency chain to reason correctly. If it sees spans but not storage contention, services but not Kubernetes scheduling, or frontend metrics but not downstream providers, it will confidently optimize the wrong thing. AI doesn’t lower the need for observability. It raises the standard.

Identify Weaknesses like a Ninja

Most IT teams find themselves playing catch-up when dealing with vulnerabilities within their IT environments. In this stream, Director of Community Jonathan Crowe, joined by Director of Product Management Greg Thomas and Sr. Product Marketing Manager Mark Bermingham, will discuss how to be proactive and build a vulnerability management process that reduces risks and costs. What you’ll learn.