
Context is King #5 - Building Safe AI Agents

As AI agents gain more autonomy, safety can't be an afterthought. In this talk from Context is King in London, Jonatan von Martens (AI Safety Engineer at ElevenLabs) shares what it actually takes to build agents that behave reliably in production. Context is King is a meetup series co-organized by Flow AI and Aiven for engineers shipping AI agents in production. No pitches — just real implementation stories.

Context is King #5 - A Semantic Layer for the Agentic Era

Agents are only as good as the queries they can run. In this talk from Context is King in London, Egor Kraev (Co-Founder & CTO of Motley) breaks down how a well-designed semantic layer becomes the connective tissue between natural language intent and reliable data retrieval.

Context is King #5 - Ontologies as Executable Context for AI Agents

Can a knowledge graph do more than store facts? Can it actually run your agent? In this talk from Context is King in London, Teodoro Baldazzi (Principal AI Engineer at Prometheux) makes the case for ontologies as executable context: structured knowledge that doesn't just inform AI agents, but actively shapes how they reason and act.

Automated Alerting: Stop Losing Money to Delayed Notifications and Inefficient Alerting Workflows

When incidents are not addressed – or not addressed quickly enough – businesses incur significant costs. Mean Time to Resolution (MTTR) increases. In the worst cases, the financial impact extends beyond your organization to customers and partners. Automated alerting reduces response times and notifies the right people when action is needed.

Stop Missing After Hours Calls with SIGNL4 Call Routing

Many teams invest time building an on-call rotation, but inbound calls often ignore that structure completely. A support number forwards to a single phone. One engineer ends up taking every call. Sometimes the call goes unanswered and the voicemail lands in a shared mailbox that nobody checks until the next morning. Even worse, the team might have several engineers on duty, but the phone system has no awareness of who is actually responsible at that moment.

Your Monitoring Stack Wasn't Designed. It Was Procured.

The 2am war room hasn’t gone anywhere. Ten years after Gartner coined the term AIOps, the platforms are bought, the licenses are renewed, the dashboards are live — and serious incidents still get resolved by engineers paging across multiple consoles, trying to work out where the fire actually is. MTTR has barely moved. Alert fatigue hasn’t eased. The outcomes the category promised, in most enterprises, have not arrived. Matt Lowe’s recent article on AIOps names the shortfall well.

How to monitor and optimize GPU utilization in the cloud

GPU utilization is one of the most expensive metrics in cloud infrastructure to get wrong. A GPU running at 30% utilization costs the same as one running at 90%, but it's doing a third of the useful work. For workloads measured in tens of thousands of GPU-hours, the difference between average utilization in the 30s and average utilization in the 70s is hundreds of thousands of dollars across the life of the workload.
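The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, not a billing model: it assumes utilization maps linearly to useful work, and the price per GPU-hour and workload size are hypothetical placeholders rather than figures from the article.

```python
def billed_gpu_hours(useful_gpu_hours: float, utilization: float) -> float:
    """GPU-hours you pay for to obtain `useful_gpu_hours` of work.

    Assumes utilization (a fraction in (0, 1]) maps linearly to useful work,
    which is a simplification.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return useful_gpu_hours / utilization


def workload_cost(useful_gpu_hours: float, utilization: float,
                  price_per_gpu_hour: float) -> float:
    """Total spend for a workload at a given average utilization."""
    return billed_gpu_hours(useful_gpu_hours, utilization) * price_per_gpu_hour


if __name__ == "__main__":
    PRICE = 3.00           # hypothetical USD per GPU-hour
    USEFUL_HOURS = 20_000  # hypothetical amount of fully-utilized GPU work

    low = workload_cost(USEFUL_HOURS, 0.35, PRICE)
    high = workload_cost(USEFUL_HOURS, 0.75, PRICE)
    print(f"cost at 35% utilization: ${low:,.0f}")
    print(f"cost at 75% utilization: ${high:,.0f}")
    print(f"difference:              ${low - high:,.0f}")
```

Because the hourly price is the same at any utilization, the cost of a fixed amount of useful work scales with 1/utilization. That is why the same job at 30% costs three times what it costs at 90%.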

How to Troubleshoot High CPU Usage on Network Devices

Most network teams only find out their firewall is overloaded after users start complaining about a slow VPN, dropped calls, and random packet loss at 2 pm every day. The usual suspects get blamed first: the ISP, the switch, the application server. The firewall gets a pass because the dashboard says 40% CPU and everything looks fine. Here is the problem with that picture. Standard SNMP monitoring typically polls every 5 minutes. A CPU spike that peaks at 95% and recovers within 90 seconds can fall entirely between two polls and never show up.
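The sampling gap can be illustrated with a small simulation. This is a sketch under stated assumptions: the per-second trace is synthetic, the spike timing is made up, and each poll is assumed to read an instantaneous CPU value. Devices that report averaged CPU counters would dilute the spike rather than miss it outright.

```python
def build_trace(duration_s, baseline, spike_start_s, spike_len_s, spike_level):
    """Synthetic per-second CPU trace: flat baseline with one rectangular spike."""
    return [
        spike_level if spike_start_s <= t < spike_start_s + spike_len_s else baseline
        for t in range(duration_s)
    ]


def poll(trace, interval_s, offset_s=0):
    """Values an instantaneous poller would see, starting at `offset_s`."""
    return trace[offset_s::interval_s]


def catch_fraction(trace, interval_s, threshold):
    """Fraction of possible poll offsets for which at least one sample >= threshold."""
    hits = sum(
        1
        for offset in range(interval_s)
        if max(poll(trace, interval_s, offset)) >= threshold
    )
    return hits / interval_s


if __name__ == "__main__":
    # Hypothetical scenario: 40% baseline, 90-second spike to 95% starting at t=400s.
    trace = build_trace(900, baseline=40, spike_start_s=400, spike_len_s=90, spike_level=95)

    print("polls at 0, 300, 600s:", poll(trace, 300))
    print("chance a 5-minute poll lands inside the spike:",
          catch_fraction(trace, 300, threshold=90))
```

With a 300-second interval and a 90-second spike, a poll lands inside the spike for only 90 of the 300 possible offsets. In this model the dashboard keeps showing the 40% baseline most of the time.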

Why Your Agentic Workflow Succeeds and Still Gets It Wrong

Agentic workflows are reshaping how engineering teams operate, fetching context, synthesizing decisions, and shipping results across systems without human intervention. But the same design that makes them powerful adds risk in production. Agents do not crash when they hit bad data; they synthesize around it, substituting a stale value, an empty page, or a missing field for the result they were supposed to capture.