#049 - The AI Translator: Using LLMs & MCP for K8s Operations & Self-Healing Infra with Alexei Le...

In this episode, Itiel Shwartz kicks off a series on MLOps, LLMs, and GenAI in Kubernetes, starting with Alexei Ledenev, who has over two decades in software development and deep experience in cloud architecture and distributed systems. Alexei shares his journey from CoreOS Fleet to his current role on the Platform Team at Doit.

#048 - Shaping the Future of Software Development with Idan Gazit (GitHub Next)

Meet Idan Gazit from GitHub Next, a team responsible for projects like GitHub Copilot. Gazit, despite jokingly claiming to be "the least knowledgeable about Kubernetes," shares his diverse career journey, spanning from early web development with Perl and Django to his time at Heroku and eventually GitHub. He discusses his team's role in prototyping future software development solutions, emphasizing the importance of identifying and nurturing risky, impactful ideas for developers, even if it means "killing projects" that don't gain traction.

Product Klip: Komodor's Advanced Cost Optimization Capabilities

This Product Klip covers Komodor's cost optimization features, highlighting how the platform helps users reduce Kubernetes spending while maintaining operational stability. Before activation, Komodor simulates potential savings; for activated clusters, it shows CPU and memory usage before and after Komodor's bin packing, along with the resulting dollar savings. Komodor enhances existing autoscalers rather than replacing them, unlocking up to 40% in additional savings.

#047 - Securing the Software Supply Chain and Kubernetes with Dustin Kirkland (Chainguard)

Meet Dustin Kirkland, VP of Engineering at Chainguard. Dustin shares his fascinating 26-year journey in the tech industry, from IBM and two stints at Canonical to roles at Google (working on GKE), Apex, and Goldman Sachs, eventually leading him back to engineering at Chainguard.

Product Klip: Istio Developer Dashboard

Troubleshooting issues in a complex service mesh environment, such as traffic failures or authorization problems, often requires the expertise of an SRE or DevOps professional. Komodor simplifies this process by giving developers the visibility they need to diagnose service mesh issues on their own. It helps them identify blocked connections and understand the root cause without having to review logs or configuration files.

#046 - Simulating, Scheduling, and Saving: Optimizing Kubernetes with David Morrison (Applied Res...

In this episode, Itiel has an insightful conversation with Dr. David Morrison, a research scientist and founder specializing in Kubernetes scheduling and autoscaling. David shares his journey from operations research to leading distributed systems efforts at tech giants like Yelp and Airbnb. Learn about the transition from Apache Mesos to Kubernetes at Yelp, including the role of their open-source API layer, PaaSTA.

Kubernetes Costs: More Than Meets The Eye

As organizations expand their Kubernetes deployments and scale production workloads, effective cost management becomes an essential priority. The rapid innovation demanded from development teams often intersects with a shortage of advanced Kubernetes expertise, leading to resource inefficiencies and unnecessary expenses. This challenge is further amplified by the growing prevalence of AI/ML workloads and the intricate demands of GPU utilization.

#045 - Beyond Cluster Creation: Mastering Multi-Cluster Kubernetes with Gianluca Mardente (Cisco)

Join Itiel as he chats with Gianluca Mardente, a Principal Engineer at Cisco Systems. Gianluca shares his path to tech and Kubernetes, including his work history and the inspiration behind his open-source project, Sveltos. They dive into the significant challenges of managing a large fleet of Kubernetes clusters – ensuring consistency, handling upgrades, and coordinating resources across different clusters.

#044 - Scaling Platforms and Pioneering AI Agents with Hasith Kalpage (Outshift by Cisco)

Join us as Hasith Kalpage, Head of Platform Engineering at Outshift by Cisco, shares his fascinating journey. Hear about his experience leading the massive WebEx transformation to cloud-native using Kubernetes, including the intense push during the COVID-19 response, where they went from zero to over 50 production clusters in just three months.