Gemini 3 breaks OpenAI's long-standing lead in SRE tasks

Dec 4, 2025

A major shift just hit SRE-focused AI. Gemini 3 Pro edged out OpenAI’s models, outperforming them on every SRE task we tested.

In this Rootly AI Labs episode, Sylvain Kalache and Laurence Liang break down:

  • Why Gemini 3 Pro came out on top for compute, storage, and networking actions
  • How the SRE-skills-bench benchmark works and what it actually measures
  • Real examples of SRE-type tasks
  • What these results mean for SREs, platform engineers, and on-call teams in the near future