Sponsored Post

The Product Manager's Nightmare: Seeing Features Too Late

Sarah stared at her laptop screen in disbelief. The feature her team had been building for three weeks was finally deployed to staging, and it looked nothing like what she had envisioned. The user interface was cramped, the workflow felt clunky, and the color scheme clashed with their brand guidelines. "Can we change the button placement?" she asked during the demo. "That'll require refactoring the entire component structure," replied the lead developer. "It's probably a two-day task now." What should have been a simple adjustment had become a major undertaking.

Part 2: Building a Production-Grade Traffic Capture, Transform and Replay System

When developers try to build realistic mocks and automated tests from production network traffic, the real challenge isn’t just in the capturing; it’s in the data manipulation. Raw traffic is a chaotic sea of patterns, dynamic tokens, environment-specific secrets, and tangled dependencies that seem impossible to untangle by hand. Over my two decades of building these systems, I learned that solving this problem requires more than brute-force parsing or ad hoc scripts.

dotConnect for QuickBooks Online | Secure C# ADO.NET connection with ORM support

Connect your C# .NET applications directly to QuickBooks Online with dotConnect for QuickBooks — a robust, high-performance data provider designed for modern financial integration. Whether you’re building financial dashboards, automating reports, or syncing accounting data, dotConnect gives you seamless, secure access to QuickBooks Online with full Visual Studio integration and rich ORM capabilities.

Webinar Recap: 3 Cost Allocation Mistakes FinOps Teams Can Avoid

In a webinar hosted by CloudZero on Oct. 30, 2025, Larry Advey, Director of Cloud Platform and FinOps and a respected voice in the FinOps community, joined Umesh Rao to deliver a practical session on cloud cost allocation. The session, titled Three Allocation Mistakes Most FinOps Teams Make, unpacked hard-earned lessons and offered a guided tour of CloudZero’s new Dimension Studio.

AWS Fargate Alternatives: Comparing Serverless Container Options

Imagine you have an API service composed of multiple microservices. Traffic fluctuates — sometimes light, sometimes spiking. Without Fargate, you’d have to manage EC2 instances, autoscaling, patching, and more. With Fargate, you define each microservice as a task, setting the CPU/memory, container image, network rules, and AWS schedules, and then run them as needed. The result: faster deployment, lower ops overhead, and smooth scaling.

Automating your synthetic test infrastructure with Datadog Synthetic Monitoring and Terraform

Testing ecosystems contain massive amounts of data, including outlined test scenarios, prerequisite configurations, and the tests themselves. As a result, these ecosystems are prone to data sprawl. This makes it difficult to prevent configuration drift and quickly spin up new tests, especially at the frequency needed to support a fast-growing application. Teams can handle these challenges by treating their tests as part of their application infrastructure.

Puppet Edge Across Lifecycle: Day 0, Day 1, and Day 2

Puppet Edge extends your Puppet automation to now include all your network devices, providing a centralized platform to manage your entire infrastructure, enabling teams to work together efficiently. Automate tasks, manage configurations, and ensure compliance across all your devices from one place. This video discusses how to use Puppet Edge for Day 0 planning and provisioning, Day 1 device configuration, and Day 2 ongoing operations, streamlining your workflows, and reducing manual errors.

Streamline feature management with Harness MCP and Claude Code

Harness now supports the Model Context Protocol (MCP) for Feature Management and Experimentation (FME), enabling developers to interact with feature flags directly from AI-powered IDEs like Claude Code and Windsurf. The FME MCP tools make it easier to explore, understand, and manage feature flags through natural language, streamlining delivery and release workflows without leaving your development environment.

Validating chaos experiments with GCP Cloud Monitoring probes

GCP Cloud Monitoring probes let you transform your existing GCP metrics into automated pass/fail validation for chaos experiments, eliminating subjective observation in favor of objective measurement. With flexible authentication options (workload identity or service account keys) and PromQL query support, you can validate infrastructure performance against defined thresholds during controlled failure scenarios.

The Right Way to Deliver Infrastructure: Every Deploy Comes with Guardrails

In fast-moving organizations, developers are expected to ship quickly. Infrastructure shouldn’t be a blocker, but it can’t become a liability either. One unchecked terraform apply, a missing tag, or a misconfigured instance can turn into a surprise bill, a failed audit, or even a production outage. The most reliable way to manage infrastructure at speed is to make governance part of the delivery process.

The sovereignty of the builder: Lessons from Civo Navigate London 2025

Digital sovereignty isn’t won in policy papers. It’s earned in production. That was the challenge issued by Civo CEO Mark Boost and Board Director Kelsey Hightower at Civo Navigate London 2025. They argued that the cloud's real failure lies not with the providers, but with the customers who refused to change. Catch up on the full fireside chat below. The power shift is underway, moving from large vendors back to the practitioner.

AWS Outage Shows Why UK Businesses Can't Afford Single-Cloud Dependency

The AWS outage has reminded many businesses of the risk of relying heavily on centralised cloud infrastructure, especially when so many essential services are concentrated in a single region. But at the wider, industry level, this is also a warning about the widespread lack of contingency planning for cloud failures. Reactive response must give way to strategically planned disaster recovery protocols that engender a resilient cloud market.

Decoding cloud credits: Are "free" credits locking you in?

"Free" cloud credits sound like a gift, but they often come with hidden costs and an agenda: lock-in. The illusion of a cost-saving measure can quickly become a vendor-specific trap, forcing costly migrations or leaving your business overpaying for cloud services. This issue, which the UK Competition and Markets Authority (CMA) estimates contributes to £430M of annual over-payments in the UK alone, is what we call the "cloud credit trap."

Don't Fear the Ticket Reaper: How IT Service Desk Automation Slays Everyday Monsters

Here there be cybersecurity monsters. But there don’t have to be. The scariest creatures aren’t in haunted houses or slasher films. They’re in your IT environment, lurking in queues, clogging inboxes, haunting your service desk. We’re talking, of course, about tickets. Every IT leader knows the horror movie I’m referring to: endless tickets piling up and making even the simplest requests a real slog.

Manage your Pipelines usage with the new billing panel

Until recently, the Pipelines Billing Panel has only displayed the total pipeline minutes used across your workspace. You didn’t have visibility into how usage was distributed across your repositories. We’ve now enhanced the billing panel to show you build minute usage by repository for the current and previous billing periods so you can identify and manage high-usage repositories.

The Load Testing Start Guide! #speedscale #stresstest #loadtesting #mocking #startup

Are you ready to get serious about load and stress testing, but don't know where to start? This guide highlights the trap most serious engineers fall into: trying to build a custom DIY testing environment. The traditional path means signing your team up for maintaining load drivers, test case frameworks, ephemeral environments, and endless custom mocks, a massive drain on time and resources. There's a better, cheaper, and faster solution: Traffic Replay.

Introducing Braintrust

How do you balance speed and quality when your customers are counting on you? What does it actually take to build a culture of reliability? And how should engineering leaders think about AI when it's changing everything about how we ship software? On Braintrust, Ganesh Datta, CTO and Co-founder of Cortex, sits down with CTOs, VPs of Engineering, and technical leaders who've been in the trenches—the ones who've made the hard calls, shipped through the chaos, and built teams that actually work.

Getting started with Site24x7 alert management

Struggling with alert overload or missed notifications? Learn how Site24x7 helps you manage alerts effectively, from setting thresholds and tracking key metrics to routing notifications, automating actions, and leveraging AI-powered Zia thresholds. Follow a real-world DevOps scenario to see how your team can respond faster, smarter, and more efficiently.

Tracking Aborted Queries and Memory Grants in Redgate Monitor

Redgate Monitor now surfaces two common SQL Server query issues that usually take manual work to uncover: cancelled or aborted queries and high memory-grant queries. You can now see both in the Query Executions view for each SQL Server instance, directly alongside server activity and alerts, so you can diagnose the cause much faster. Recently, Redgate Monitor introduced the Query Executions feature for SQL Server instances, using Extended Events to capture execution details for individual queries.

Enforce type safety with TypeScript checks before deployments

TypeScript introduces the benefits of static typing to JavaScript, allowing developers to identify bugs at an earlier stage. However, relying solely on developers to run type checks locally isn’t enough. If tsc is never run, invalid code can slip through and reach production. This tutorial will show you how to set up CircleCI to automatically run the TypeScript type checks on each push.

Rise of the Neocloud: Top 10 Providers and Adoption Trends

With the tech world once again in the midst of a major shift, a new group of contenders has entered the market, offering capabilities the traditional hyperscalers can’t match. Often dubbed "neoclouds", these new providers range from developer-focused platforms to GPU-centric infrastructure clouds. Can these new clouds compete with 100 billion dollar incumbents in a space that's already changing at breakneck speed?

Migrating from Librato to Hosted Graphite on Heroku - Full Tutorial

Librato on Heroku is being sunsetted, so what's next? In this tutorial, we walk through: why Hosted Graphite by MetricFire is the best upgrade from Librato on Heroku; a step-by-step migration that moves your Heroku dyno, router, Postgres, Redis, and custom metrics into Hosted Graphite; and a side-by-side comparison of metrics ingestion, dashboards, alerts, and integrations.

Transform your DevSecOps with Harness AI and Google Cloud

Teams have always been under pressure to deliver software faster. But here's what we've learned from working with thousands of engineering teams: writing the code has never been the real bottleneck. It's everything that happens after - the testing, security scans, deployments, and optimizations that determine whether your innovations actually reach customers quickly and reliably. Even in the era of AI, the speed boost is uneven, creating the AI Velocity Paradox.

Civo AI: Turnkey Privacy, Design & Solving Real Problems

Josh Mesout, Civo’s Chief Innovation Officer, takes the stage to ask the question every UK leader is facing: how do we keep pace with the AI explosion while retaining full data sovereignty? Josh walks through relaxAI – a privacy‑first, turn‑key AI solution that ships as a plug‑and‑play box, runs on the latest NVIDIA B200 hardware, and keeps every byte inside the UK. He also demonstrates the dramatic cost advantage, up to 90% cheaper than the big‑name models, and shows how a single query can spin into a research‑grade analyst report in seconds.

Webinar: Platform Engineering 3.0 - Practical Guidance for the AI Era

Platform Engineering is evolving fast. From the early days of CI/CD automation to today’s self-service platforms, we’re entering a new era: AI-enhanced infrastructure management. In this 40-minute thought leadership session, you’ll gain practical, actionable guidance on how to prepare your team for Platform Engineering 3.0, where AI-driven insights, automated governance, and IDE-native workflows accelerate delivery without sacrificing control.

Decoding cloud credits: Are "free" credits locking you in?

Are "free" cloud credits holding your business back? Join us for an online webinar as we explore the hidden risks associated with cloud credits and the impact they have on businesses. Our speakers, Simon Hansford, Chief Commercial Officer at Civo, and James Marks, Founder of Canopy, will share their knowledge and experience on this critical topic. This webinar is based on our recent whitepaper, which examines the "cloud credit trap" and its far-reaching implications for organizations.

Why we brought hardware-optimized GenAI inference to Ubuntu

On October 23rd, we announced the beta availability of silicon-optimized AI models in Ubuntu. Developers can locally install DeepSeek R1 and Qwen 2.5 VL with a single command, benefiting from maximized hardware performance and automated dependency management. Application developers can access the local API of a quantized generative AI (GenAI) model with runtime optimizations for efficient performance on their CPU, GPU, or NPU.

Android COSU: Unlocking Efficiency for Business Device Management

Struggling to lock down your Android fleet? This video breaks down Android COSU (Corporate-Owned, Single-Use) mode! Learn what COSU is, why industries like retail, healthcare, and logistics use dedicated devices (kiosks, tracking terminals, etc.), and how Kiosk Mode makes it all happen. We dive into the key features powered by UEM/MDM solutions—from App Management and Device Restrictions to essential Security Features. If you need to manage a fleet of single-purpose Android devices reliably, this overview is for you!

Cortex and Sonar Announce Partnership to Drive Code Quality and Security at Scale

At Cortex, our mission is to empower engineering organizations to ship reliable, secure, and efficient software, faster. Today, we’re thrilled to announce a formalized partnership with Sonar, the leader in code quality and security. For years, Sonar has been one of the most popular integrations in Cortex. Teams rely on Sonar’s deep code insights to identify vulnerabilities, ensure coverage, and raise the bar for clean, secure code.

Running Ansible Playbooks from Puppet Edge

When thinking about imperative infrastructure commands and Day 0 tasks for provisioning infrastructure, Ansible is an oft-mentioned tool that has been popular among practitioners for its easy YAML syntax and agentless architecture. You might have used Ansible to get your infrastructure started or for other “one-and-done” infrastructure automation scenarios.

FinOps for Hybrid IT: Extending Visibility Beyond the Cloud

Controlling IT spend used to mean managing cloud invoices. Today, it’s far more complex. Modern enterprises run workloads across multiple platforms — cloud, virtualized, and on-premises — each with its own cost structures and dependencies. That’s why FinOps for hybrid IT has become essential. Extending FinOps principles beyond cloud services enables organizations to see how every part of the infrastructure contributes to cost, efficiency, and business value.

Jira Service Management (JSM) Review for Alerting (2025)

Atlassian is shutting down OpsGenie. New sales stopped on June 4, 2025, and the platform will be completely offline by April 5, 2027. As an OpsGenie user, you now face a critical decision: Migrate to Jira Service Management (JSM), Atlassian’s recommended path, or choose a different solution. And if you’re not sure JSM is the right fit for your team’s alerting needs, this review will help you decide. I signed up for JSM and put it through real-world testing.

Sidecar or Agent for OpenTelemetry: How to Decide

Getting telemetry out of a distributed system isn’t the hard part. Getting it out cleanly, without noise, drop-offs, or odd performance side-effects — that’s where things get interesting. Before you worry about processors or storage costs, you need a clear plan for where the OTel Collector should run. Most teams narrow this down to two options: a sidecar that sits next to each service, or a node-level agent that handles data for everything running on the node. Both patterns are solid.

Setting Up the GitKraken MCP Server with GitLens

GitKraken MCP (Model Context Protocol) brings repository intelligence directly into VS Code, Cursor, Windsurf, and other AI-powered IDEs so your agent stops guessing and starts understanding your actual workflow. Instead of manually explaining your branch structure or digging through issues in your browser, MCP connects your AI agent to the actual state of your repositories. It understands your branches, your issue tracker, your pull requests, and your commit history—then helps you start work, resolve conflicts, and review code without leaving your editor.

Sovereignty over silence: Why Microsoft's data opacity is the real lock-in

The refusal by Microsoft to detail data flows to Police Scotland confirms the real price of hyperscale: control is an illusion. This incident isn't the problem. It’s the proof. It proves the need for a new standard in cloud computing, one that prioritizes true digital sovereignty and architectural transparency. Sovereignty, after all, is all about the customer being able to exercise control over the IT resources they use.

RancherLive: Know before you Go - KubeCon Atlanta Edition

Join us for a special Know Before You Go online session all about getting ready for KubeCon + CloudNativeCon in Atlanta! Hosted by Orlin Vasilev, this live event will feature special guest Nick Eberts — a proud Atlantan, musician, father, and Project Manager at Google. Nick will share insider tips on making the most of your KubeCon experience — from navigating the conference to exploring the best of Atlanta’s music, food, and culture. Whether it’s your first time at KubeCon or your first visit to the city, this session will help you feel right at home.

Building a Pregnancy App You Can Actually Trust!

We never talk about pregnancy in the workplace. Maybe it's time to change that. Tech is still male-dominated, which creates a ripple effect: poor maternity policies, overworked expecting developers, and privacy-invasive apps that fail the people who need them most. But what happens when a developer decides to solve this problem themselves? Rizel built a pregnancy app she could actually trust. Not because existing solutions didn't exist, but because they weren't built with real privacy, real needs, and real developer insight in mind.

Open Source Cloud Orchestration Tools Compared

Before 2011, cloud infrastructure was still new. AWS had launched EC2 and S3 in 2006. But to deploy applications, engineers had to manually spin up servers, configure storage, and set up networking — all by hand or with custom scripts. There were early configuration management tools, such as Chef and Puppet, but those didn’t offer full cloud orchestration. Then in 2011, AWS launched AWS CloudFormation as the first major orchestration tool.

CEO Diaries: Not All AI Talent Is Alike

If Meta’s (now halted) nine-figure AI talent poaching scheme was any indication, the AI talent market is pretty frothy. The number of AI-related job postings has roughly tripled since 2019, and the average salary has more than doubled (Bain). The race is on for companies to find the fastest, most sustainable routes to AI-driven business value; all companies, but especially software companies, are hotly pursuing racers. But despite what Zuckerberg & Co.

Azure Cost Optimization: Best Practices for Cloud Solution Providers

In this episode, we explore practical Azure cost management strategies tailored for Cloud Solution Providers (CSPs). The conversation dives into cost visibility, optimization techniques, and billing transparency, helping CSPs improve margins and deliver more value to their customers. Featuring experts from West Coast, a leading CSP, including James Reed (Azure Sales Manager) and Mitchell G. (Azure Sales Specialist), along with Mike Stevenson, the discussion highlights real-world insights from the partner ecosystem.

Introducing JFrog Fly: The World's First Agentic Artifact Repository

AI has created a paradigm shift in software development. AI-native development teams – from small startups to enterprises like Goldman Sachs and Google – are adopting agentic development tools like Cursor and Copilot to increase the speed of code generation to a pace we’ve never seen before. But with all this new code comes a big challenge: how do you manage all these potential new releases and get the right ones deployed?

Navigating the geopolitical maze of digital sovereignty at Civo Navigate London 2025

Trust in Big Tech is eroding. Geopolitical tensions are rising. The only predictable thing about the cloud today is that it’s time to re-evaluate everything. At Civo Navigate London 2025, we pulled together a panel of industry experts to cut through the noise and finally define what digital sovereignty means for the UK.

OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces

You're sampling 1% of traces in production. A payment request fails at 3 AM. Logs show an error in order-service, but the full picture isn't there because different services made different sampling decisions. order-service kept the trace; payment-service didn't. So you end up checking logs and timestamps across a few services to piece things together. This happens because the usual probability sampling approach makes a separate choice at each service boundary.
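The fragmentation described above can be illustrated with a minimal sketch. It assumes a simplified scheme in which every service derives its keep/drop decision from the trace ID alone, so all services agree. The hashing approach and the names `keep_trace` and `decisions_for` are hypothetical and are not the OpenTelemetry implementation.

```python
import hashlib

SAMPLE_RATE = 0.01  # 1% of traces, as in the scenario above


def keep_trace(trace_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Decide from the trace ID alone, so every service reaches the same answer."""
    # Hypothetical scheme: hash the trace ID to a 64-bit integer.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    # Keep the trace when the hashed value falls below rate * 2**64.
    return value < int(rate * 2**64)


def decisions_for(trace_id: str, services: list[str]) -> dict[str, bool]:
    """Simulate each service making its own decision for one trace."""
    # No shared state or random draw: each service only looks at the trace ID.
    return {service: keep_trace(trace_id) for service in services}


if __name__ == "__main__":
    services = ["order-service", "payment-service"]
    print(decisions_for("4bf92f3577b34da6a3ce929d0e0e4736", services))
```

Because the decision is a pure function of the trace ID, `order-service` and `payment-service` always keep or drop the same trace. An independent random draw in each service would not guarantee that.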

Optimizing What Matters Most: The State of Mission-Critical Work | Research Insights from Mattermost

How do top-performing organizations ensure their most essential workflows stay effective, efficient, and secure in a quickly evolving environment? Find out in this new research report from Mattermost and the Ponemon Institute: https://mattermost.com/mission-critical-work/

Comparing Ways to Connect to AWS

Not sure how to connect to this leading cloud provider? Compare options to pick the best one for your business. Curious about Amazon Web Services (AWS) and the best ways to connect? AWS is a hybrid cloud provider with customized, scalable, cloud-based packages. Whether you’re part of a multinational corporation or a small startup, you can choose among various AWS services to meet your needs.

One Box, One Region: Private Cloud Ready on Demand

We’ve built a private‑cloud that feels exactly like a public‑cloud region. As Civo CTO Dinesh Majrekar puts it, it all starts with a conversation. Tell us what you need, we order the hardware, it ships, you plug it in, and the cloud is live in no time. No sprawling projects, no endless paperwork, just a really‑really simple flow that turns a handful of commodity servers into a sovereign Civo region.

Activating free trial in dbForge add-ins for SSMS

In this video, you’ll learn how to activate a dbForge add-in for Microsoft SQL Server Management Studio. We’ll use dbForge SQL Complete as an example, but the activation steps apply to all dbForge add-ins. We’ll show where to find your activation key, how to activate online or offline, and even via the command line. Activation is quick, secure, and free — so you can start your trial right away!

What is Content Addressable Storage?

Imagine a world where every change in your systems, from a config tweak to a deployment, carries its own cryptographic proof. No forms. No meetings. Just mathematical truth. In this video, Mike Long (CEO & Co-Founder, Kosli) explains how cryptographic fingerprints like SHA-256 are used to create unique identities for files, code, and configurations, and how Kosli uses this approach to continuously track changes across servers, Kubernetes clusters, and cloud environments.
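As a rough illustration of the idea, here is a minimal sketch of a content-addressable store in which the SHA-256 fingerprint of the content is its address. The `ContentStore` class is a hypothetical in-memory toy and does not represent how Kosli implements this.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is the SHA-256 fingerprint of the content itself.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]

    def verify(self, address: str) -> bool:
        # Re-hash the stored bytes; a mismatch means the content was altered.
        return hashlib.sha256(self._blobs[address]).hexdigest() == address


if __name__ == "__main__":
    store = ContentStore()
    first = store.put(b"replicas: 3\n")
    same = store.put(b"replicas: 3\n")
    changed = store.put(b"replicas: 4\n")
    print(first == same)     # identical content, identical address
    print(first == changed)  # any change produces a different address
```

Identical content always maps to the same address, and any change, however small, produces a different one. That property is what lets a fingerprint act as an identity for a file or configuration.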

SLA, SLO, and SLI: Understanding the Foundations of Service Reliability

Last week, I ordered a pizza on a food delivery app, and they promised delivery in 30 minutes. Similarly, all digital services (apps, websites, cloud platforms, etc.) make promises about speed, uptime, and reliability. The difference is how they track and measure those promises. That’s where SLA, SLO, and SLI come in. These three metrics define what “reliable” actually means. They turn a vague claim like “99.9% uptime” into something you can measure, track, and act on.
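To make the relationship concrete, here is a minimal sketch that treats the SLI as a measured success ratio and the SLO as the target it is compared against. The function names and the request counts in the usage lines are illustrative assumptions, not taken from the article.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of requests that succeeded."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return good_requests / total_requests


def meets_slo(sli: float, slo_target: float) -> bool:
    """SLO: the internal target the measured SLI is compared against."""
    return sli >= slo_target


def error_budget_remaining(good_requests: int, total_requests: int,
                           slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failures = total_requests * (1 - slo_target)
    actual_failures = total_requests - good_requests
    return 1 - actual_failures / allowed_failures


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 500 failures, 99.9% target.
    sli = availability_sli(999_500, 1_000_000)
    print(sli, meets_slo(sli, 0.999))
    print(error_budget_remaining(999_500, 1_000_000, 0.999))
```

In this framing the SLA is the external promise, often set looser than the SLO, so that a shrinking error budget gives the team a warning before the contractual commitment is at risk.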

Monitoring Chaos Experiments with New Relic Probe in Harness

New Relic probes in Harness Chaos Engineering let you automatically validate system performance against defined SLOs during chaos experiments, transforming subjective testing into objective, metrics-driven resilience validation. By querying New Relic metrics in real-time and comparing results against your success criteria, you can programmatically verify that your systems maintain acceptable performance levels even under failure conditions.

Redgate Flyway Enterprise's code analysis: Enforce compliance, reduce risk, deploy with confidence

With increasing security threats and stringent compliance requirements, database code quality isn’t just a best practice; it’s a business imperative. Yet many organizations struggle to enforce their database development standards consistently across teams, leading to security vulnerabilities, potential data loss, and lengthy review cycles that slow down software delivery.

The CFO's Guide To Accurate Cost Allocation

Every finance team knows this pain. The cloud bill continues to grow, but the numbers don’t quite add up. Engineering swears their usage reports are accurate. But when you ask, “Which product or customer drove last month’s 18% cost increase?” things go quiet. That silence usually means one thing. Inaccurate cost allocation. Native cloud cost allocation tools from AWS, Azure, and Google Cloud can help. However, they often stop at the average this or average that layer.

How FOCUS Is Shaping The Next Era Of Cloud Cost Optimization

SaaS, AI, and technology spending today looks like a more intense version of how it was a decade ago when everyone first migrated to the cloud. The mentality was, and at some companies still is, to build, build, build and worry about controlling expenditures and optimizing costs later. We’re seeing astronomical amounts of money being raised by brand-new AI businesses that barely even existed a couple of years ago. More than $145 billion has been raised for U.S.

Integrate CircleCI with Railway for automated deployments

The speed and reliability of deploying backend and full-stack applications are usually a concern for development teams. Fortunately, Railway is a developer-friendly platform that allows you to deploy apps with limited configuration. It is also quick, easy to use, and has reasonable defaults. Now, imagine pairing that with CircleCI, one of the strongest continuous integration platforms available.

Black Friday is 30 days away. Your engineering infrastructure might not be ready

If you're anything like your peers, you probably blinked in April and found yourself a month away from Black Friday when you opened your eyes. Much like a shopper desperately scrambling to pull together gift lists for their loved ones, many engineering teams find themselves rushing to ensure their systems can handle the biggest shopping day of the year.

Why Your Next ITSM Agent Won't Be Human (and Why That's a Good Thing)

For a very long time now, IT leaders have relied on the “throw more bodies at it” strategy: when ticket volumes rise, headcount follows. That model no longer works. Hybrid work, SaaS sprawl, and cloud complexity have made human-only scaling unsustainable. The enterprises winning today aren’t scaling with headcount. They’re scaling with autonomous ITSM agents: AI-driven specialists that resolve tickets instantly, escalate only when needed, and keep operations running 24/7.
Sponsored Post

Avantra + Ansible: Better Together for Enterprise SAP Automation

Enterprises trust Ansible for fast, reliable infrastructure automation, including Terraform for automated cloud provisioning. Many organizations using Ansible leverage Ansible SAP playbooks for SAP infrastructure automation. Avantra extends the scope of SAP operations using Ansible, adding observability, ITSM and ALM solution integration, and orchestration across the SAP estate. Avantra and Ansible together provide a closed-loop solution where monitoring, automation and proof of outcome live in one place across on-premise, hyperscaler and private cloud ERP implementations.

Settle Your QA Debt Before the Bugs Start Breaking Kneecaps

In Part One, we discussed how QA debt builds silently over time — causing slower releases, late-night firefights, and unpredictable test cycles. The next step is understanding how much debt you have and where it hides. This post goes deeper into measuring QA debt — what to track, how to collect data, and how to use those insights to create a sustainable plan for improvement.

Disaster Recovery: Everything You Need to Know

With increasing cyberattacks and cloud outages, maintaining system resilience is critical. A robust Disaster Recovery (DR) strategy enables teams to prepare for unexpected events. It makes sure they can recover critical systems and data with minimal disruption. This blog will cover what disaster recovery is, why it matters, and the key components of an effective Disaster Recovery Plan. We’ll also walk through the steps for creating your own strategy.

How Git Worktrees Fix Context Switching in Your Workflow

Git worktrees let developers work on multiple branches simultaneously without stashing or losing context. Learn how to use Git worktrees to handle parallel development, urgent hotfixes, and code reviews without the single-threaded pain of constant branch switching. In this tutorial, GitKraken Senior Product Director Justin Roberts shows you how to set up Git worktrees using both the command line and GitKraken Desktop. You'll see real-world examples including handling critical bugs, reviewing pull requests without disrupting your current work, and managing multiple feature branches at once.

Validate CDC data in your CI/CD pipeline using CircleCI

Change Data Capture (CDC) is a technique used to identify and capture changes, such as inserts, updates, and deletes, in a source database so they can be replicated to another system in real-time. This approach is crucial in modern data pipelines, especially for powering data lakes, analytics platforms, and event-driven applications that depend on up-to-date information. Setting up a CDC pipeline is only the first step.
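The replication step can be sketched as follows: a stream of change events is replayed against a replica, and the result is then compared with the expected state. The event shape (`op`, `id`, `row`) and the function names are hypothetical and do not correspond to any specific CDC tool's format.

```python
from typing import Any

Row = dict[str, Any]


def apply_change(replica: dict[int, Row], event: dict[str, Any]) -> None:
    """Apply one captured change event to the replica in place."""
    op, key = event["op"], event["id"]
    if op in ("insert", "update"):
        # Inserts and updates both carry the full new row in this sketch.
        replica[key] = event["row"]
    elif op == "delete":
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(events: list[dict[str, Any]]) -> dict[int, Row]:
    """Rebuild replica state by applying events in capture order."""
    replica: dict[int, Row] = {}
    for event in events:
        apply_change(replica, event)
    return replica


if __name__ == "__main__":
    events = [
        {"op": "insert", "id": 1, "row": {"name": "widget", "qty": 5}},
        {"op": "insert", "id": 2, "row": {"name": "gadget", "qty": 2}},
        {"op": "update", "id": 1, "row": {"name": "widget", "qty": 7}},
        {"op": "delete", "id": 2},
    ]
    print(replay(events))
```

A validation step in a CI pipeline could assert that the replayed state matches the source table after a known sequence of changes, which is one way to catch dropped or misordered events.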

Data Sovereignty + AI: How Civo's New Cloud Gives You Full Control

We spot the tipping point: AI is reshaping every industry and data has become the new oil. From law firms to pharma labs, SaaS stacks to even a secret chicken recipe, today’s most valuable info is trapped in a handful of hyperscalers with almost no governance. Our UK‑based survey of 1,000 IT leaders shows the pressure is real. Most are ready to abandon big‑tech for true control and sovereignty. Civo’s answer is a new kind of cloud: built from the ground up, with 100% data sovereignty, and designed for more value, simplicity, and flexibility.

BygoneSSL and the certificate that wouldn't die

Turns out the scariest thing about SSL certificates isn’t when they expire. It’s when they don’t. I wrote about the CA/Browser fight that led to the 47-day certificate mandate. CAs crying about lost revenue, browsers flexing their root program authority, enterprises stuck in the middle. But nobody talks about the security research that started it all: BygoneSSL at DEFCON 2018. Two researchers mining Certificate Transparency logs found something surprising.

Automate CockroachDB Schema Changes with Harness Database DevOps

Harness Database DevOps now supports CockroachDB, bringing CI/CD automation to distributed SQL databases. Teams can manage schema changes through Git-driven workflows for consistency, traceability, and rollback safety. This integration simplifies multi-environment deployments, reduces human error, and accelerates database delivery.

Chargeback Vs. Showback: Choosing The Right Cost Allocation Model For FinOps

One of the most challenging aspects of running a SaaS business is dealing with highly variable costs. Every service you provide needs resources, and the demand for those resources fluctuates. Just when you think you’ve got this quarter’s budget nailed down, an engineer makes a restructuring decision or a new feature launch exceeds expectations (or falls flat). Suddenly, your bill looks totally different than what you were expecting.

Shadow AI on Trial: The Phantom Threat to Compliance

Every law firm I meet can explain its information security policy in minutes. Far fewer can tell me which AI tools their staff actually used last week, and what data those tools touched. That gap is where Shadow AI sits: the unsanctioned, unmonitored use of generative AI that slips in unnoticed. It promises speed, but it quietly creates exposure: confidentiality breaches, weak auditability, and a risk to governance when the regulator (or a client’s GC) asks hard questions.

AI Chief of Staff for Engineering Leaders

Meet Cortex’s AI Chief of Staff, your new strategic partner for engineering leadership. This isn’t just another dashboard. It’s an intelligent advisor that connects your engineering data, scorecards, and DORA metrics to give you real-time, actionable insights. What you'll learn in this video: Cortex’s AI Chief of Staff brings clarity and confidence to decision-making, helping leaders transform meetings into strategic, data-driven conversations that drive continuous improvement.

How Cortex brings visibility, governance, and self-service to Kubernetes operations

Platform teams handle cluster provisioning manually because it requires oversight. A product team needs a cluster, so they submit a request. Someone reviews the configuration, provisions it through the cloud provider, waits for it to spin up, and sends back access details. Two days later, the cluster is ready. The delay is intentional. Without review, teams provision oversized clusters that blow budgets, misconfigure networking, or deploy in the wrong regions.

The hidden costs of "free" cloud credits: A wake-up call for businesses

To read the full findings from this research, see our whitepaper "Decoding Cloud Credits". The allure of "free" cloud credits can be tempting, but beneath the surface lies a complex web of risks and consequences that can ultimately lock businesses into costly and restrictive cloud ecosystems. Our latest whitepaper, "Decoding Cloud Credits", explores the true costs of these promotional offers and the implications for businesses.

Integration & Data Ingestion: Strengthening AIOps Observability

Large enterprises face the challenge of managing high-volume, very diverse data streams that span both legacy and modern, digital systems and applications. To gain timely, accurate insight across this kind of complexity, IT teams need observability platforms that can do more than just monitor - they must also unify, contextualize and enrich data so teams can act effectively to protect the availability of the services their customers rely on.

EKS Pricing And Cost Optimization (2025 Guide)

AWS did not intend to build Amazon EKS; it simply had to. Kubernetes adoption beamed light years ahead of AWS’s own managed container orchestration service. This forced AWS to develop a managed service to accommodate customers who wanted to use upstream Kubernetes but did not want to manage it themselves. As soon as AWS got around to it, it knocked the Kubernetes-based container management service out of the park. Not only is Amazon EKS simpler than Kubernetes, but EKS pricing may also be worth it.

What we learnt from our panel discussion on AI in the UK

At Civo Navigate London 2025, we hosted a panel discussion with Josh Mesout, James Faure, Abdul Hummaida, Jonas Vermeulen, and Daniel Miodovnik to discuss the latest trends and challenges in AI adoption. Through this conversation, the panelists covered topics ranging from the current state of AI adoption to the challenges of scaling AI and the future of work.

OpenTelemetry Spans Explained: Deconstructing Distributed Tracing

In a microservices architecture, a single user request can pass through multiple services before completing. When performance drops or an error occurs, tracing that journey is the only way to locate the source. Distributed tracing provides that visibility. At its core are OpenTelemetry Spans — units of work that capture what each service does during a request.
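
To make the idea of a span concrete, here is a simplified model of how spans in one request relate to each other. It is only a sketch and does not use the OpenTelemetry SDK: the `Span` class and `child_of` helper are hypothetical, and real spans also carry timestamps, attributes, and status. The ID sizes (a 16-byte trace ID and an 8-byte span ID, shown here as hex strings) follow the OpenTelemetry specification.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One unit of work; every span in a request shares the same trace_id."""
    name: str
    trace_id: str
    parent_id: Optional[str] = None
    # 8 random bytes rendered as 16 hex characters.
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])


def child_of(parent: Span, name: str) -> Span:
    """Create a span that continues the parent's trace."""
    return Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)


# A hypothetical request that passes through three services.
root = Span(name="GET /checkout", trace_id=uuid.uuid4().hex)
auth = child_of(root, "auth-service: verify token")
db = child_of(auth, "db: load user")
```

Following the `parent_id` links from `db` back to `root` reconstructs the path the request took, which is what a tracing backend does when it draws a trace.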

The Power of JFrog Artifactory as Your Model Registry

In my previous blog, we demonstrated how the FrogML SDK streamlines the process of integrating custom-built or publicly sourced models from your IDE into JFrog Artifactory. Now that your models are securely stored, versioned, and managed, the natural next question arises: “Ok, so you have some models in JFrog Artifactory, now what?” This is where the real power of the JFrog Platform comes into play.

What is AIOps and What Happens When IT Runs Itself?

The scale of modern IT is outpacing human capacity. Microservices, multi-cloud deployments, and the Internet of Things (IoT) have created a complex IT ecosystem, generating an exponential volume of operational data. Traditional operations teams, regardless of their skill, struggle to keep up. Because of this, forward-looking leaders are adopting AIOps in IT, not as an upgrade, but as a foundational shift.

Stop Debugging Blindly! How Traffic Capture Can Help Your Code #speedscale #trafficcapture #ai

Is AI "slop" or new code pushing tons of bugs into production? You can't test everything forever. Learn how traffic capture is the most efficient way to understand how your code is actually running in the real world. By grabbing data from sidecars, packet captures, or logs, you get the context you need to prevent bugs and improve performance.

New Feature Friday: Ask Your Engineering Metrics Anything | Cortex Engineering Intelligence for MCP

What if you could analyze your engineering metrics — just by asking a question? Now you can. Introducing Engineering Intelligence for MCP, a new Cortex feature that lets you query your engineering data through AI systems like Claude or Cursor. No dashboards. No BI setup. Just ask: “What’s driving our cycle time?” “Show me Q3 deployment trends.” “Where are our bottlenecks?” Cortex connects your real engineering data to your favorite AI client, so you can get instant, conversational insights on delivery, quality, and team performance — all without leaving your IDE.

DevOps & Observability for Digital Catalogs: faster releases, fewer outages

Digital catalogs have become a core sales engine, not just a glossy PDF on a server. They power discovery, merchandising, and conversion across web and mobile experiences. When a catalog powers real revenue, the way you build and run it starts to look a lot like modern software delivery. That's where DevOps and observability enter the picture: practices that shorten release cycles, reduce risk, and keep customer experiences fast and available even on your biggest traffic days.

What Is Incident Response Lifecycle?

The Incident Response Lifecycle is a step-by-step process that helps engineering teams detect, respond to, and recover from unexpected system disruptions or outages. It includes a series of six practical stages: Detection, Analysis, Impact Mitigation, Incident Resolution, Service Restoration, and Post-Incident Analysis. By following this lifecycle, teams can minimize downtime, reduce business impact, and continuously strengthen system reliability.

Why your Kubernetes clusters and GPUs should live under one roof

The world remains abuzz with AI hype, but the reality is that most modern applications aren’t purely AI workloads. The average company will have web services, APIs, databases, and background jobs running alongside its machine learning inference or training components. An architecture question everyone faces: should your Kubernetes cluster and GPU compute live in the same data center, or can you split them across providers?

How to manage ilert call flows via Terraform

Call flows let you design voice workflows with nodes like “Audio message,” “Support hours,” “Voicemail,” “Route call,” and much more. The ilert Terraform provider now includes an ilert_call_flow resource so you can version and promote these flows across environments. This blog post offers an overview of managing call flows in Terraform, detailing the benefits and key scenarios.

A quick recap of IDPCON 2025

Two weeks ago, we hosted IDPCON 2025, and the response has been overwhelming. Over 250 engineering leaders from 20+ countries joined us for 12 sessions featuring speakers from Canva, Skyscanner, Blackstone, and more. Attendees participated in discussions at 20+ roundtables, sharing strategies and challenges around engineering excellence and internal developer portals.

Cultural ROI In FinOps: People Drive Pivots

When I ask clients to picture cloud cost optimization, they think dashboards, policies, maybe a clever right-sizing purchase. What they don’t picture? Meetings. Misunderstandings. Mistrust. To avoid FinOps failures, we need a new starting line; one that gets to the root of spend misalignment.

From Code To Clicks: A Visual Way To Build Dimensions In CloudZero

In early October, we launched Dimension Studio, a new visual editor for engineers and others that brings point-and-click simplicity to the same powerful, precise allocation engine CloudZero is known for. Before that, CloudZero users built cloud cost allocations with our YAML-based CostFormation engine, a code-driven way to describe how cloud and AI costs roll up to products, customers, or teams.

Announcing HAProxy ALOHA 17.5

HAProxy ALOHA 17.5 is now available. This release delivers powerful new capabilities that improve security and performance — while future-proofing HAProxy ALOHA to enable richer features and advanced functionality. With this release, we’re introducing HTTPS health checks to Global Server Load Balancing (GSLB), new partitioning for larger firmware updates, enhanced web application firewall (WAF) functionality, and our new Threat Detection Engine (TDE).

Top 11 Ruby APM Tools for 2025: A Performance-Driven Selection

Observability has become a core part of running Ruby applications at scale. Knowing how your app performs — from request latency to background job execution — helps catch slowdowns early and improve reliability. This blog walks through some of the most useful APM tools for Ruby in 2025. Each section highlights what the tool does well, where it fits best, and what kind of visibility it brings to your application's performance.

Single-Cloud Dependency Is a Disaster Waiting to Happen

The impact of the AWS outage has reminded many businesses of the risk of relying heavily on centralised cloud infrastructure, especially when so many essential services are concentrated in a single region. But at the wider industry level, this is also a warning around the widespread lack of contingency planning for cloud failures. Reactive response must give way to strategically planned disaster recovery protocols that engender a resilient cloud market.

Scale Chaos Engineering with Automation and AI

Chaos Engineering and Fault Injection testing have been proven to prevent outages, increase availability, and help companies avoid costly downtime. But without the right processes or tools, they require specialized knowledge, a deep understanding of systems, and manual effort for every test. To fully realize the benefits of Chaos Engineering, testing needs to be adopted across all engineering teams without causing a lift or investment that takes away from roadmap progress.

C15 Roadmap & Release 22

We’re excited to launch Release 22, our most advanced update yet. It delivers smarter controls, deeper customization, and long-term reliability. Key improvements include enhanced handling of TTY messages with Wireshark support, flexible call history recording, new STIR/SHAKEN override options for better traceability, and real-time call limit tracking with an upgraded interface. Plus, starting March 25, 2026, SIP code 603+ will notify callers when calls are blocked due to analytics, in line with FCC regulations.

Enhanced Flexibility and Security Monitoring - New in DataStream

This update delivers significant advances in operational flexibility and security monitoring capabilities. It addresses the evolving needs of security teams across diverse deployment environments, from air-gapped networks to those prioritizing automation and simplicity, while expanding integration options and improving visibility into data flows.

Fix flaky tests in your sleep with Chunk by CircleCI

A test fails. You rerun it and it passes. You shrug and move on. This is how most teams deal with flaky tests. The “rerun until green” approach works in the moment, and rerunning from failed tests is a useful way to confirm whether a failure is real. But reruns don’t fix the underlying issue. Over time, they burn CI resources and can hide real instability in your code. On the other hand, fixing flaky tests can mean hours of work.

What Is Business Continuity?

A single outage can stop operations, affect customers, and damage trust. In a world of pandemics, cyberattacks, weather events, and supply chain delays, your team cannot simply hope that nothing breaks. Business continuity planning helps your team stay ready, recover faster, and keep downtime low. In this blog, we’ll explain what business continuity means, how to create a solid business continuity plan, and which approaches help teams stay operational during a disruption.

Simple Talk Podcast - Coffee Chat with Lee Brownhill

Steve sits down with Lee Brownhill, who by day helps clients optimize their SQL workloads in Azure and AWS at Cloud Rede, but is also a Redgate Ambassador, blogger and aspiring speaker. Lee talks about his interest in giving back to the SQL Server community through writing and speaking, having taken inspiration from others online and in-person at events, and naturally the conversation also touches upon AI, the cloud, and more.

Data Centre Colocation: What UK Businesses Need to Know About Costs

As more UK companies go digital, many are missing critical cost factors when choosing colocation data centres, with location, power bills and regulatory compliance proving far more expensive than many anticipate. With insights from Pulsant, a digital edge infrastructure provider, we take a look at the true cost of colocation.

Kubernetes For AI: The CTO's Guide

Kubernetes began as a tool to help teams keep thousands of microservices running without falling apart. It gave them a way to schedule workloads, recover from failures, and scale services without constant firefighting. Now, AI has brought back the same chaos, only magnified. Training jobs sprawl across GPUs. Inference traffic spikes without warning. Pipelines stretch across clusters, clouds, and compliance boundaries. Left unchecked, it can break both your workload and cloud budget.

The AI Productivity Paradox-and How We're Solving It

There’s a striking disconnect happening in software development right now. According to the 2025 Stack Overflow Developer Survey, 84% of developers are using or planning to use AI tools in their workflows. Over half of professional developers are using AI daily. The adoption is real, it’s fast, and it’s accelerating.

Resolve's Agents of IT podcast - Ep. 4 - Sean and Ari's Hot Takes

Welcome to Agents of IT, the show where we decode the future of enterprise automation and explore what it really takes to achieve Zero Ticket IT. In this episode, Sean and Ari share unfiltered takes on what’s broken in IT operations and how agentic automation is changing everything. From service desk overload to AI-driven resolution, we’re breaking down how IT can finally escape firefighting mode and focus on innovation.

Building Smarter: How AI is Changing Development

The tech industry is on the cusp of a revolution, driven by the rapid adoption of AI and Gen AI. At Civo Navigate London 2025, Josh Mesout (Chief Innovation Officer at Civo) explored the ways in which enterprises are leveraging these technologies to drive productivity and innovation. The conversation highlighted the challenges of scaling and running AI, including infrastructure bottlenecks and data access issues, but also showcased the potential for AI to transform industries and business models.

Progress Without Control in the Age of AI and Compliance

There’s growing unease in the database world about delivering at speed, raising the question: just how do we keep up with the pace of change without losing control of the things that matter most? AI is rapidly transforming the mechanics of how code is written, reviewed, and optimized, which, in turn, increases the risk of destabilization.

Service disruption on October 20, 2025

When the internet goes down, our primary job is to help everyone get back up, as fast as possible. Of the almost half a million incidents we've helped our customers solve, there are some which stand out for both their scale and impact. One of these happened on Monday, October 20, when AWS had a widely covered major outage in their us-east-1 region, from 07:11 to 10:53 UTC. We’re hosted in multiple regions of Google Cloud and so the majority of our product was unaffected by the outage.

DORA is right: AI is an amplifier, for better or worse

The 2025 DORA report just surveyed nearly 5,000 technology professionals and delivered a verdict that should reshape how you think about AI investment: AI doesn’t create organizational excellence; it amplifies what already exists. For teams with solid foundations, AI is a force multiplier. For teams with broken processes and dysfunctional systems, AI magnifies the chaos.

Civo Navigate London 2025: Sovereignty & AI Highlights

We’re back from Civo Navigate London! Our seventh in‑person conference where sovereignty meets AI. Attendees walked away with fresh workshops, bold talks on data sovereignty, and hard‑won insights on how the cloud must evolve. Real‑world AI demos, cutting‑edge AI talks and authentic collaborations proved the cloud’s future is now. Conversations that turned into collaborations, governments asking the hard questions and developers swapping playbooks. The cloud isn’t a monolithic wall; it’s a thriving community that builds, learns, and defies the status‑quo.

Managing Freelance DevOps Work in 2025: Smarter Ways to Stay Online and Paid

Staying online is only half the battle for a freelance DevOps engineer. Keeping bills paid when clients delay payments or projects vanish overnight is the other. Unlike salaried tech roles, freelancers operate in a world where uptime isn't just for servers - it's survival strategy. When systems break, your name is on the line. But when the payment doesn't hit the account for weeks, nobody's there to reboot your budget.
Sponsored Post

47 Day Certificates Make Premium SSL Worthless

Your enterprise just paid $500 for an SSL certificate. You know what it does that a free one doesn't? Nothing. Absolutely nothing. And when the 47-day certificate mandate hits, you'll pay that $500 to touch that cert eight times a year, per certificate. For the same encryption, same trust, same green padlock that Let's Encrypt gives away for free.
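
The "eight times a year" figure follows from simple arithmetic, sketched below. The constant and variable names are illustrative, and the calculation assumes each certificate is renewed only when its 47-day validity runs out, so teams that renew early would touch each certificate even more often.

```python
import math

MAX_VALIDITY_DAYS = 47  # maximum certificate lifetime under the mandate
DAYS_PER_YEAR = 365

# 365 / 47 is about 7.77, so covering a full year takes 8 certificates.
renewals_per_year = math.ceil(DAYS_PER_YEAR / MAX_VALIDITY_DAYS)

# Hypothetical fleet size, to show how the manual work scales.
certificates = 50
touches_per_year = certificates * renewals_per_year
```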

Mitmproxy vs Proxymock: Replaying Traffic for Realistic API Testing

Replaying traffic is a core tool in your toolbox when you need to reproduce a tricky bug or validate how your app behaves. Traffic replay is especially valuable for testing complex software applications that rely on APIs and microservices, where integration and functionality must be thoroughly validated.

Part 1: Building a Production-Grade Traffic Capture and Replay System

A few years ago I was on call during the Super Bowl. At the time I was working for an observability vendor and one of our customers had an outage caused by a surge in user traffic. But our monitoring system didn’t have enough data to know what went wrong and I sat on a call for 2 hours painfully listening to them spinning up more servers and trying to catch up with the user load.

The Future of AI: How Civo is Democratizing Access to Advanced Infrastructure

The world of cloud computing is undergoing a significant transformation, driven by the rapid adoption of Artificial Intelligence (AI). As AI continues to evolve and improve, it's becoming increasingly clear that access to advanced AI infrastructure is crucial for businesses to remain competitive. During Civo Navigate London 2025, Josh Mesout spoke about the importance of AI for the future of cloud computing and how Civo is working to democratize access to advanced AI infrastructure.

4 Everyday IT Headaches You Can Eliminate with Enterprise IT Automation

Every IT operator anywhere on the team ladder dreads this feeling: another day, another flood of service desk tickets. Like cockroaches, they come in waves and they’re repetitive. Worse still, they distract your teams from higher-value work. Ironically for the amount of disruption they can cause, most of these tickets are not complex incidents or novel challenges. They’re the same everyday IT headaches your enterprise has been dealing with for years.

Build Vs. Buy? Why Creating Your Own Cost Management Platform Is Futile

The siren song of building a custom, internal cloud cost management platform is enticing. Many brilliant engineering teams are convinced they can come up with a bespoke solution that perfectly fits their needs. They look at their company’s unique infrastructure and decide they can DIY cost management without having to rely on an external vendor. Believe me, I get the temptation.

GitKraken Insights | Engineering Intelligence in Minutes

Most software intelligence tools take months to implement, cost a fortune, and end up collecting dust. GitKraken Insights is different. It helps engineering leaders measure what matters: AI impact, code quality, delivery performance, and developer experience, all in one place. It’s the latest evolution of the GitKraken DevEx platform, trusted by over 40 million developers. Insights connects data from across your GitKraken tools to give you a complete picture of engineering health and value. We're talking DORA metrics, pull request metrics, and AI impact.

Why Every Developer Needs Their Own AI Knowledge Base (It's Easier Than You Think)

Ever feel like you're drowning in documentation scattered across Confluence, Slack, Jira, and Git commits? Kyle Fransham, Senior VP of R&D at Superna, shares why every developer should run their own local LLM and shows you exactly how to do it. In this GitKon talk, Kyle reveals how to turn your personal "master document of knowledge" into a queryable AI assistant running directly on your laptop. No cloud dependencies, no organization bottlenecks...just your own development copilot that understands your unique workflows, tips, and tribal knowledge.

Sustainable Cloud Computing in the UK: Challenges, Opportunities, and the Future

The tech industry's environmental impact is a growing concern, but can collaboration and innovation drive sustainability? At Civo Navigate London 2025, Regent Lee, Dinesh Majrekar, Liam McTague, and Simon Morris explored the challenges and opportunities of reducing emissions in the tech industry.

Set up a live code editor in Next.js with CircleCI

Interactive playgrounds have changed the way developers learn and experiment with code. Instead of having to copy and paste code into a separate Read–Eval–Print Loop (REPL) or local environment, users can write, edit, and run code directly within the tutorial or application interface. Adding this type of editor to a Next.js app makes it more engaging and helps users understand better by eliminating the need to switch between different tools.

AI-Powered Translation Tools: A Hidden Asset for Scaling DevOps Globally

DevOps or development (Dev) and IT operations (Ops) teams are no longer confined to single geographic locations or language groups. With over 80% of organizations now practicing DevOps (a figure projected to reach 94% in the near future), the challenge of scaling operations globally has never been more critical. Yet, one persistent bottleneck continues to slow down even the most sophisticated DevOps workflows: language barriers.

Optimizing Enterprise Operations Through Professional Data Center Decommissioning

You ever walk into a server room that feels like a time capsule? Half the machines are blinking away like they're still running Windows Server 2003. Then there's a label on a rack that nobody remembers putting there, and someone swears the old backup tape drive "might still be needed." Yeah. That. It's wild how many large organizations run on infrastructure that's way past its prime. And I get it - no one wants to mess with systems that are still technically "working."

Debugging Without a Net: The Pain of Reproducing Production Issues

Every engineer has been there — a late-night page, a broken feature in production, and no clear way to reproduce it. The logs are vague. The metrics look normal. Your local environment works fine. Yet something somewhere is failing for real users. So begins the detective work — debugging a live system with almost no tools, no perfect test data, and no clone of production.

When AWS Goes Down: What It Means For Your Cloud Costs

A global outage at Amazon Web Services (AWS) did more than knock popular apps offline. It laid bare the cost risks embedded in many cloud architectures. As services fail, the hidden costs of high availability, from redundancy planning to recovery operations, often multiply. For cloud cost leaders, this isn’t an issue of uptime; it’s a visibility and budget-shock issue. It’s a key reminder that architecting for resilience involves difficult trade-offs.

Regain Control and Visibility of All IT Assets Across Your Organization

When you don’t have reliable processes for managing IT assets, you can quickly lose control. Asset inventories lose their accuracy, data across tools like CMDBs and spreadsheets stops matching reality, and no one can say with confidence what equipment is in use, where it’s located, how it’s connected, and whether it’s still needed. For data center professionals, a lack of asset visibility creates real risks.

Why GPUs accelerate AI learning: The power of parallel math

What makes GPUs so crucial for AI workloads? Is it just about raw processing power, or is there more to it? As we explore the world of AI infrastructure, understanding the role of GPUs is essential. Let's dive into the math behind AI. At its core, AI is all about mathematics, and matrix multiplication is a critical component. Whether you're training a model to recognize images or predict outcomes, the data is converted into massive arrays or matrices of numbers.
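
For readers who want to see the operation itself, here is a plain-Python sketch of matrix multiplication. The `matmul` function is written for this example only and is far slower than any real numerical library. The point it shows is that every output cell is computed independently of the others, which is why the work spreads so well across the many cores of a GPU.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    # Each cell (i, j) depends only on row i of a and column j of b,
    # so all rows * cols cells could be computed at the same time.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul(a, b)  # [[19, 22], [43, 50]]
```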

Secrets We Forgot... Until Automation Saved Us

We All Have That One Secret… That API key that has been sitting in production for ages. The personal access token that was supposed to be rotated 2 months ago. The service key that is about to expire… wait, when does it expire again? Most developers have experienced working with secrets. We create secrets, use them, and promise ourselves that we will rotate them. But somehow, the secret that was supposed to be rotated after 90 days is still standing strong after 6 months. Sound familiar?

Announcing HAProxy Enterprise 3.2

HAProxy Enterprise 3.2 is a pivotal release that reinforces the product’s identity as both the world’s fastest software load balancer and a sophisticated edge security layer. This release brings next-generation security intelligence, extends its industry-leading performance, and expands the native routing and integration capabilities in HAProxy Enterprise.

How to test the reliability of a Point of Sale (POS) system

Point of Sale (POS) systems are the backbone of any retail store. A single outage can cost retail companies thousands of dollars each minute in lost sales, and even more if the outage happens during peak hours. If the outage goes on too long, it can cause even more costly damage as customers abandon carts and turn to competitors. In an industry where customer loyalty is worth its weight in gold, that brand damage can end up even more costly than the initial lost sales.

Ari Stowe, Resolve COO and Carla Ely of Grokstream speak at Innovate Americas Dallas

At Innovate Americas Dallas, Ari Stowe, Resolve COO, and Carla Ely, Grokstream Channel Development Manager, joined industry leaders to discuss how AIOps and agentic automation are reshaping IT operations for a Zero Ticket future. In this dynamic session, they explored how AI-driven event correlation, predictive remediation, and autonomous workflows are transforming how enterprises detect, diagnose, and resolve issues before they ever become tickets.

Data Sovereignty in the Age of AI: A Conversation with Kelsey Hightower and Mark Boost

Join Kelsey Hightower and Mark Boost at Civo Navigate London as they discuss sovereignty in the context of AI and cloud computing. The conversation highlights the need for a more nuanced approach to cloud computing, one that balances the benefits of public cloud with the need for control and sovereignty. The discussion emphasizes the importance of open protocols and the role of the community in driving innovation, and notes that the adoption of AI workloads is driving a shift towards more decentralized and sovereign cloud architectures.

dotConnect Universal - Unified Database Access for .NET Apps

Unlock true C# universal database connection capabilities with dotConnect Universal, a feature-rich ADO.NET provider that enables seamless data access across SQL Server, Oracle, MySQL, PostgreSQL, SQLite, Db2, and more. Try it free for 30 days and see why thousands of developers trust dotConnect Universal in mission-critical environments.

AI-Native Software Delivery: Proven Practices to Produce High-Quality Software Faster

AI for Development Isn't New. AI for Delivery Is! AI coding assistants have transformed how teams create software. But innovation only delivers business value when code moves quickly and safely from commit to production and into customers' hands. In AI-Native Software Delivery, Harness Field CTO Nick Durkin and DevOps veterans Eric Minick and Chinmay Gaikwad present a practical guide to applying AI across the entire software delivery lifecycle.

The State of Software Engineering Excellence

Organizations everywhere are racing to modernize DevOps and elevate the developer experience, but how close are they to actually delivering? We surveyed over 650 engineering leaders to find out. The result is The State of Software Engineering Excellence 2025, a report that uncovers the hidden challenges, gaps, and opportunities shaping today's software teams.

Protecting Production from a VP

The rise of AI coding assistants means new "developers" in many organizations. Your safety nets (and on ramps) need to be built with citizen developers in mind. Innovation is moving faster than ever. That’s great. With AI code generation accelerating professional developers and the rise of "citizen developers" thanks to tools like vibe-coding, we’re seeing more code written today than ever before.

Kubernetes Security Guide: Risks, Strategies, And Tools

In 2018, attackers gained access to Tesla’s AWS cloud environment through an unprotected Kubernetes console (admin console). Because it lacked proper authentication, the hackers could see and control cluster resources. Once inside, they deployed new pods running cryptocurrency mining software, using Tesla’s compute power for profit. During the breach, the attackers also uncovered credentials stored in the cluster.

Onboarding Microsoft Sentinel data lake with DataStream

Modern security operations teams face an overwhelming challenge: a rapidly growing volume of logs, alerts, and telemetry from cloud services, on-premises infrastructure, and third-party security tools. Traditional SIEM platforms often struggle to scale cost-effectively and provide the agility needed for advanced analytics and threat hunting.

7 Ways Your Incident Management Just Got a Boost (New Feature Rundown)

All the things you may have missed that will make your incident management smarter, faster, and simply easier. We ship updates every week because we want you to get the most out of FireHydrant. But we also know it's hard to stay up to date and read every week's changelog (even though we know reading changelogs is the highlight of your week).

Experimenting With Different Scripts

It all began when I spun up an AWS t4g.small burstable instance for a side project. Nothing unusual, just another day in the cloud. But the moment I connected through SSH, something caught my eye. The system greeted me with a temperature reading of -273.5°C. Wait… what? That’s roughly 0 Kelvin, the point where atomic motion completely stops. In other words, absolute zero, a state that’s theoretically impossible for anything to operate in.

What is autonomous validation? The future of CI/CD in the AI era

Over the past decade, CI/CD has redefined how modern software is built and shipped. CircleCI has been a leader in that transformation, working alongside the world’s best engineering teams to build a reliable foundation for continuous delivery at scale. Today, those foundations are under new pressure as AI reshapes every aspect of the delivery cycle. Developers are producing more change with less certainty about what those changes touch.

Zero Ticket Video Series with Resolve: Automating License Assign with RITA

See RITA, Resolve’s intelligent IT automation agent, in action as it handles one of IT’s most repetitive tasks: assigning and removing licenses. In this quick demo, RITA interprets a plain-language request, validates user details, and executes the license assignment automatically, with no human intervention needed. Then watch how it removes a license just as easily, showcasing true end-to-end automation that saves time, eliminates manual errors, and frees IT teams to focus on higher-value work.

Simplify Infrastructure Management with Puppet AI Infra Assistant #puppet #aiops #itautomation

Effortlessly access vital infrastructure data with Puppet Infra Assistant. Ask questions in natural language and get instant, contextual answers - no Puppet expertise required. Streamline operations, eliminate bottlenecks, and empower your team with self-service insights.

IaC is Great, But Have You Met IaCM?

This blog highlights the critical role of Infrastructure as Code Management (IaCM) in enhancing IaC practices, ensuring security, compliance, and efficiency in managing complex infrastructure at scale. Managing infrastructure efficiently and reliably is more critical than ever. Infrastructure as Code (IaC) has emerged as a key practice, enabling teams to define, deploy, and manage infrastructure using code.

AI at Scale: Developer Productivity & Engineering Excellence w/ Adeeb Valiulla | ShipTalk S4E4

How do you measure engineering excellence in the age of AI? In this ShipTalk episode, Dewan Ahmed talks with Adeeb Valiulla, leader of Developer Productivity & Efficiency at Harness, about redefining productivity beyond dashboards and vanity metrics.

Troubleshoot Faster with the New Log Search and Filtering in Qovery Observe

Following the launch of Qovery Observe, we’re progressively adding new capabilities to help you better monitor, debug, and understand your applications. Today, we’re excited to announce a major improvement to the Logs experience: you can now search and filter directly within your application logs.

Your "Technical Debt" is a LIE! Meet QA Debt.

The REAL reason your system WILL FAIL. We all talk about technical debt, but QA Debt is the silent killer costing companies millions. It's the accumulation of skipped regression checks, outdated test suites, and ignored production data. The result? Unpredictable, catastrophic outages that can sink your business (and your career!). Learn how to identify and pay down your QA Debt before it's too late. It's not about testing more; it's about testing SMARTER.

Tech Talk #10 Building a VictoriaMetrics PaaS: Setting up Metrics, Logs, and Traces

From Blueprint to Reality: This episode is designed to be a practical, step-by-step guide. We will show you how to leverage the VictoriaMetrics Kubernetes Stack, our "easier button", to simplify the deployment process and get your components running quickly.

AI in the UK: A Panel Discussion on the Future of Artificial Intelligence

Join our AI panel discussion with Josh Mesout, James Faure, Abdul Hummaida, Jonas Vermeulen, and Daniel Miodovnik, as they come together to explore the rapidly evolving landscape of artificial intelligence. The conversation delves into the real-world applications of AI, its future trajectory, and the critical considerations surrounding security, privacy, and responsible AI practices.

SOC 2 Type 2: Netdata's Security Controls Validated Over Time

We’re excited to share that Netdata has successfully achieved SOC 2 Type 2 attestation. Following a five-month audit conducted by Sensiba LLP, we can now confirm that our security controls work consistently in practice. The audit covered the period from April 1 to August 31, 2025, and tested whether our controls operated effectively throughout that entire timeframe.

Automating Network Devices with NETCONF and YANG in Puppet Edge

Infrastructure teams manage not only servers and cloud resources, but also complex network environments. Routers, switches, and firewalls…often from multiple vendors with many models and versions. The devices require a consistent configuration and strict compliance enforcement to meet enterprise requirements.

Wait Class Monitoring for Oracle in Redgate Monitor

The latest release of Redgate Monitor visualizes Oracle wait classes and events across single-instance, multitenant (CDB/PDB), and Data Guard environments. It gives DBAs a clear, high-level view of where database time is spent and makes diagnosing and resolving performance issues in Oracle much simpler.

How Legal IT Can Escape the Graveyard of Recurring Tickets

It’s 3:30 p.m. A partner’s laptop refuses to authenticate to the VDI. The urgent filing is in two hours. The ticket title reads like a headstone you’ve seen a hundred times: “Can’t connect, tried rebooting, please help.” Another “undead” incident claws its way out of the queue. By home time, the backlog becomes a graveyard of recurring tickets, and your team, although brilliant and capable, is exhausted and applying the same fixes again and again.

Unlocking the Power of Sovereign Cloud: Insights from Civo Navigate London

As organizations increasingly rely on cloud computing to drive their businesses forward, a critical question arises: what happens when the cloud is no longer a trusted partner? This was a key theme explored at Civo Navigate London 2025, where Civo's leadership team shared their insights on the future of cloud computing and the growing importance of sovereign cloud solutions.

Can You Afford To Build Your Own Cloud Cost Optimization Platform?

So you’ve run into cloud cost challenges — incomplete visibility, overspending, thinning margins — and you’re wondering whether you have what it takes to build a cloud cost optimization platform in-house. It’s a responsible question (one that the FinOps Foundation even advises you to ask). Be warned: Building your own cloud cost optimization platform seems cheaper than adding another SaaS subscription. It’s not.

Introducing Braintrust: The exclusive network of engineering leaders shaping the future

In an era of unprecedented change, the role of an engineering leader has never been more demanding. Yesterday’s priority was scaling teams, while today’s is twofold: deploying AI responsibly and measuring the true impact of it all. The challenges are immense, and for the first time in a long time, the playbook is being written in real-time. That’s why today, we are thrilled to announce the launch of Braintrust, a new community from Cortex.

Top 9 APM Tools for Node.js Performance Monitoring

When a Node.js app slows down, you don’t get a clear picture right away. One service stalls, another spikes in CPU, and somewhere in between, requests start piling up. You can’t fix what you can’t see. Application Performance Monitoring (APM) tools close that gap. They capture request traces, latency, and errors across your stack — showing you what’s running slow and why.

Top 50 MySQL Interview Questions and Answers for Every Skill Level

Are you currently preparing for a technical role involving databases? MySQL interview questions and answers are a must-know. MySQL is one of the most popular relational database management systems, powering everything from small web applications to enterprise-level solutions. Whether you are looking to work as a database developer, data analyst, DBA, or in a related position, a solid understanding of MySQL is crucial.

MySQL Mocking with Speedscale's Proxymock: A Complete Guide

Testing database-driven applications is notoriously painful. If your app depends on MySQL, you’ve probably spent hours setting up local databases, running migrations, loading data, and then cleaning everything up just to rerun your tests. This repetitive cycle slows development, breaks pipelines, and introduces inconsistency between local and production environments.

Supercharge Developer Productivity with the New Harness Code Experience

Smarter PR Reviews: Inline comments, keyboard shortcuts, and faster diffs reduce context switching. Optimized for Scale: Instant file tree and change listing performance even in large monorepos. Seamless Navigation: Effortlessly move between branches, commits, and repos without losing context. Unified Design System: A consistent, intuitive UI across the entire Harness platform. At Harness, we know developer velocity depends on everyday workflow.

Designing for Failure: Choosing the Right Level of Redundancy, Resilience, and Control

Outages don't care how many zones you have. Power failures, software updates, and backbone disruptions all have one thing in common: they do not respect architecture diagrams. Redundancy only works if it is designed at the correct layer. Every team believes they are covered, and yet, when something breaks, the failure reveals that what looked like protection was only an illusion.

Understand how AI is affecting your engineering team with Cortex's AI Impact Dashboard

Rolling out a powerful AI tool like GitHub Copilot is a big win for any engineering leader. But because it’s such a significant investment, leadership will inevitably ask if it was worth the cost. Until now, answering this was nearly impossible. While GitHub provides adoption stats, connecting that data to real-world performance metrics like cycle time or code quality has been a manual, frustrating process. We built the Cortex AI Impact Dashboard to provide a clear answer.

Your metrics, your way: Announcing custom views in Engineering Intelligence

Every engineering organization measures success differently. A dashboard that’s perfect for one team might be meaningless for another. While out-of-the-box views for DORA are a great starting point, leaders need the ability to define and share the specific combination of metrics that matter most to their business. Without this, you're either forcing your teams to conform to generic reports or wasting time rebuilding the same views every week.

Announcing the AI chief of staff for engineering leaders

You see MTTR creeping up, but you don’t know why. You could ask your teams, but that means meetings, pulling people off projects, and waiting days for answers. What if you could just…ask? We’re excited to introduce the new strategic AI chief of staff for engineering leadership, powered by the Cortex MCP. By connecting your Engineering Intelligence data with your scorecards and standards, the MCP allows you to have a strategic conversation about your organization’s performance.

Implement Distributed Tracing with Spring Boot 3

A slow checkout request. A background job stuck waiting on another service. A log message that looks fine until performance drops. In a microservices setup, these are the moments that test your observability. You know something's wrong, but tracing the request across dozens of services feels impossible. Distributed tracing changes that. It connects every span in the request's journey, showing exactly where time is spent and where things start to break down.

Built for More: Unlocking a Sovereign and AI-Driven Future

Join us as Civo's leadership team, including Mark Boost, Deshish Majrekar, Kelsey Hightower, and Josh Mesout, come together to share their vision for a new kind of cloud that prioritizes data sovereignty, simplicity, and flexibility. This session explores the challenges of traditional cloud infrastructure and how Civo is addressing them with its cloud native platform. Civo's Flex Core private cloud solution is also showcased, highlighting its applications in various industries, including cybersecurity and advanced materials design.

Console Connect strengthens Google Cloud connectivity with new global locations

Console Connect has expanded its cloud ecosystem with four additional Google Cloud locations, bringing our total to 69 locations worldwide. This growth across three continents gives customers even more options to directly and securely connect to Google Cloud Platform (GCP) from strategic data centre hubs in key international markets, helping enterprises simplify and scale their direct access to GCP wherever they operate. New GCP locations include.

The Anti-Zombie, Battle-Tested Guide To AI FinOps: 10 Insights

When CloudZero’s CTO Erik Peterson joined the FinOps Weekly podcast in October 2025, he didn’t hold back. Instead of going on about the usual best practices of AI cost optimization, he posed challenges to how we approach AI spending. From “zombie AI experiments” eating your budget to why you should stop apologizing for using AI, these 10 insights from the podcast are worth considering in how we approach AI FinOps.

What Are Kubernetes Nodes? Everything You Need To Know

A key advantage of Kubernetes for container management is its high scalability. Kubernetes nodes are directly involved in this, and they can significantly impact your efficiency, cost-effectiveness, and service availability. This guide provides an in-depth look at Kubernetes nodes, including types of nodes and operational best practices.

Implementing image recognition with React and continuous deployment

Integrating artificial intelligence (AI) into web applications can significantly enhance user experience. AI offers features like image recognition to process and analyze user-uploaded images. Combining this with a robust continuous integration and continuous deployment (CI/CD) pipeline using CircleCI ensures seamless updates and reliable delivery. In this article, you will learn how to build a React app that uses TensorFlow.js for client-side image recognition and set up automated testing with CircleCI.

Building LLM agents to validate LangGraph tool use and structured API responses

Transitioning LLM agents from intriguing prototypes to reliable, production-grade solutions introduces a unique and significant challenge: the inherent stochasticity of LLMs. Unlike conventional software, where inputs predictably yield precise outputs, an LLM’s response can exhibit variability even when presented with identical prompts. To ensure the dependability of your LLM agent, you will need a rigorous validation strategy.
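
One piece of such a validation strategy can be sketched as a deterministic check on the structure of a response, which still holds when the wording varies between runs. This is a minimal sketch in plain Python, not code from the article or from LangGraph; the field names (`tool`, `arguments`), the `EXPECTED_FIELDS` table, and the `validate_tool_call` helper are all hypothetical.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
EXPECTED_FIELDS = {"tool": str, "arguments": dict}


def validate_tool_call(raw: str, allowed_tools: set[str]) -> list[str]:
    """Return a list of problems found in a JSON tool call; empty means valid."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["top-level value is not an object"]

    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    # Only check the tool name once we know it is a string.
    if isinstance(payload.get("tool"), str) and payload["tool"] not in allowed_tools:
        problems.append(f"unknown tool: {payload['tool']}")
    return problems
```

Because the check returns every problem it finds rather than stopping at the first, the same function can feed both a pass/fail assertion in CI and a more detailed failure report.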

Industry Reports Agree: DevOps is the Key to Unlocking AI's Potential

Recent industry research shows that AI is accelerating code creation, but having mixed results downstream. They also show that better platforms and pipelines yield better outcomes for teams adopting AI for coding. Every engineering leader I talk to is asking the same questions about AI coding assistants: How much faster can we ship? How much more productive can my developers be? On the surface, the answers look pretty good.

Eng Intelligence | Cortex

Unlock the power of Engineering Intelligence with Cortex. In this video, we explore how engineering teams can leverage data, insights, and best practices to improve developer productivity, accelerate software delivery, and scale platform engineering. Cortex delivers visibility across your entire ecosystem, helping teams adopt best practices, reduce bottlenecks, and align engineering with business outcomes.

Introducing Magellan: The AI data engine that builds your IDP

Building a catalog used to be a project. It meant months of tracking down owners, untangling dependencies, and manually piecing together a picture of your architecture. It was a tedious, thankless process that delayed the value of your Internal Developer Portal (IDP) before you even got started. Now, it’s a coffee break. We’re excited to introduce Magellan, our new AI-powered data engine designed to build your catalog and get your IDP live in minutes.

A new era for your developer portal: The Cortex MCP is now generally available

Here's a scenario every on-call engineer knows too well: a critical incident fires for a service you’ve never seen before. Your first ten minutes are a frantic scramble across wikis and Slack channels just to answer the most basic questions: Who owns this? What does it do? Where are the runbooks? By the time you’re oriented, the incident has escalated.

Your Password Reset Workflow Is Wasting Everyone's Time

Let’s not mince words; there’s a special place in hell for the password reset ticket. It’s the most boring, most avoidable, and arguably the most expensive waste of time on your service desk. And yet, in 2025, most enterprises still treat password resets like it’s 2005. They route them through manual queues, bury IT teams, and frustrate users who just want to log back in. Even when the password reset is finally resolved, nobody comes away from the experience feeling like a winner.

Choosing the Right APM for Go: 11 Tools Worth Your Time

If you’re building high-performance systems, Golang has probably earned a spot in your stack. Its speed, lightweight concurrency, and quick compile times make it ideal for scalable APIs, microservices, and distributed systems. But those same qualities that make Go powerful can make performance monitoring tricky. Goroutines run fast and in parallel, which means a simple CPU or memory graph doesn’t always tell you what’s slowing things down.

What Is SolarWinds, And Should You Use It?

Downtime is brutally expensive and damaging. Enterprises can lose about $9,000 every minute systems are down, while smaller businesses lose hundreds of dollars per minute. A single outage can often cost over $100,000, and nearly a third of companies lose customers due to downtime. That’s why many organizations turn to platforms like SolarWinds to maintain reliable systems and minimize the risk of costly disruptions.

Could AI Turn Back The Clock On IT Departments?

I recently wrote about the impending SaaS crisis, driven by companies’ newfound ability to use AI to build software they used to have to buy. I predicted this phenomenon would make it even harder for SaaS vendors to drive growth, and that elite SaaS margins would fall from the mid-70s to the mid-60s as companies leaned more into their data and AI.

10+ Continuous Testing Tools To Help You Ship Quality Software

In July 2024, CrowdStrike — one of the world’s top cybersecurity companies — shipped a minor configuration update to its Windows security product. Within minutes, airlines, banks, hospitals, and retailers worldwide began crashing. The update wasn’t new code. It was a routine content file that slipped through with a bug in its safety checks. When Windows machines loaded it, the agent hit an out-of-bounds memory error and crashed. Devices blue-screened and got stuck in reboot loops.

How engineering leaders can adopt and lay the foundation for AI with confidence

AI is transforming how software is written and operated. Every day, engineering teams are discovering new ways to accelerate development, reduce toil, and push the boundaries of innovation. But this acceleration makes it easy to forget a fundamental truth: speed without guardrails creates risk, especially when implementing the AI-powered tools that dominate today's news cycles.

The future of IDPs in an AI-first world

Over the last few months, I’ve had countless conversations with my peers about one topic: the rise of AI coding assistants. I know this isn’t exactly breaking news, and I’m sure you’ve had these conversations as well. But there’s a reason the common coffee chat today is 10 percent small talk and 90 percent about the AI-first world that we live in. Tools like GitHub Copilot, Cursor, and Devin are fundamentally changing how we write software.

Expanding Your Infrastructure Automation Across the Lifecycle Using Puppet Edge

Infrastructure automation is evolving… and so is Puppet! While Puppet has long been known for its strength in Day 2 operations through agent-based desired state configuration, Puppet also extends across Day 0 and Day 1 tasks. With Puppet Edge, you can target network devices alongside your existing infrastructure, enabling your teams to manage more scenarios, more devices, and more workflows. All from a single platform.

Onboarding with Cortex

Onboarding to Cortex has never been easier. In this video, discover how Cortex and Magellan, our AI-powered data engine, make it possible to go from a blank slate to a fully functional internal developer portal in just minutes, not months. With Cortex, you can skip manual setup and start exploring your connected engineering ecosystem immediately. From day one, you’ll have visibility into services, ownership, and metrics, empowering your teams to drive standards and deliver value faster.

A 5-step enterprise guide to IaCM

As enterprises scale, Infrastructure as Code (IaC) alone isn’t enough to manage the growing complexity of modern infrastructure. Infrastructure as Code Management (IaCM) elevates IaC into a strategic, governed, and automated framework, providing centralized control, compliance, and collaboration at scale. Platforms like Harness IaCM make this transformation practical, turning infrastructure chaos into consistent, secure, and efficient operations.

Cloud vs Colocation: strategic infrastructure choices for long-term value

Organisations are no longer limited to running everything in-house. The real question is where different workloads should sit to deliver the best long-term results. For many, that means weighing up cloud vs colocation. Both offer advantages over traditional infrastructure but serve different aims. Cloud makes it simple to scale and launch new services, while colocation provides predictable costs, direct control, and stronger assurance for compliance.

The DIY Warning | Unlocking SD-WAN's True Role in Your Digital Transformation

Thinking about a DIY SD-WAN? Before you do, consider the hidden costs: 24/7 operational overhead, evolving security threats, and complex compliance regulations. In this clip, our experts break down why many enterprises are choosing managed SD-WAN services to avoid these pitfalls and focus on innovation instead of firefighting. Is your network truly secure and scalable?

10 Critical Factors to Consider When Choosing a Colocation Provider

Colocation remains one of the key ways for businesses in Europe and the United States to host their corporate IT infrastructure. Companies place their equipment in a provider's data center to gain industrial-grade reliability, round-the-clock support, and access to high-speed networks - all while maintaining full control over configuration and security.

QA Debt: The Silent Risk That Can Take Down Your Business

In engineering, we talk a lot about technical debt — the shortcuts and compromises made in code that pile up over time. But there’s another kind of debt that’s just as dangerous and far more invisible: QA debt. QA debt is what happens when testing isn’t given the same attention as features, architecture, or performance. It’s the accumulation of missed edge cases, outdated test suites, incomplete automation, or skipped regression checks.

The Performance & Cloud Pain Point | Unlocking SD-WAN's True Role in Your Digital Transformation

You've invested in high-quality internet, but your cloud apps are still lagging. The problem might not be your connection, but the path your data takes to get there. Discover how SD-WAN intelligently routes traffic to prioritise critical applications, ensuring a seamless user experience for your distributed workforce. Curious about how SD-WAN can address cloud performance challenges?

How OpenTelemetry Auto-Instrumentation Works

Most developers use auto-instrumentation as it’s meant to be used — run the Java agent, add NODE_OPTIONS, and telemetry starts flowing. When it stops, though, figuring out why can be tricky. Maybe the agent didn’t load, maybe there’s a framework version mismatch, or something else entirely. Understanding how auto-instrumentation works makes it easier to spot and fix these issues.
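
In dynamic languages, one common mechanism behind this kind of instrumentation is wrapping (monkey-patching) a library function at startup so that each call records a span, while application code stays unchanged. The sketch below illustrates the idea with the Python standard library only. It is not OpenTelemetry's actual implementation, and `instrument`, `SPANS`, and `fake_db` are hypothetical names.

```python
import time
import types

SPANS = []  # hypothetical in-memory span sink


def instrument(module, func_name):
    """Replace module.func_name with a wrapper that records a span per call."""
    original = getattr(module, func_name)

    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Record the span even if the wrapped call raises.
            SPANS.append({"name": func_name,
                          "duration_s": time.perf_counter() - start})

    setattr(module, func_name, wrapper)


# Stand-in for a third-party library the application calls.
fake_db = types.SimpleNamespace(query=lambda sql: f"rows for {sql}")

instrument(fake_db, "query")           # done once at startup by the "agent"
result = fake_db.query("SELECT 1")     # application code is unchanged
```

This also hints at the failure modes the paragraph mentions: if the patch runs after the application has already taken a reference to the original function, or the library renames the function in a new version, no spans are recorded and nothing else visibly breaks.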

15 PHP APM Tools Worth Using in 2025

PHP powers a large swath of the web, from blogs to storefronts to APIs. But with microservices, third-party dependencies, and scaling complexity, performance can slip in subtle ways. Your app might mostly work, but small delays, occasional spikes, and hidden bottlenecks build up. An APM tool helps you see inside the black box: which functions are slow, which DB queries are hogging time, which external calls are failing or stalling.

Navigating AI transformation ft. Meg Adams, Senior Director of Engineering at The New York Times

In this episode of The Confident Commit, Rob Zuber sits down with Meg Adams, Senior Director of Engineering at The New York Times, for a deep dive into leading engineering teams through the AI revolution while staying true to organizational mission. Meg shares how the Times approaches AI adoption with a "measured but focused" strategy, emphasizing experimentation and opinion-formation over mandates, and why she believes AI serves as a force multiplier for what already exists in your organization and workflows.

The new AI-driven SDLC

For decades, the software development life cycle (SDLC) has been the framework teams use to understand how software moves from idea to production. It breaks complex work into familiar phases: planning, design, development, testing, deployment, and maintenance. This structure gave organizations a shared way to coordinate teams, track progress, and build with confidence.

The Silent Leak: How One Line of Go Drained Memory Across Thousands of Goroutines

This technical deep-dive reveals how Harness engineers discovered and fixed a critical Go memory leak where reassigning context variables in worker loops created invisible chains that prevented garbage collection across thousands of goroutines, ultimately consuming gigabytes of memory in their CI/CD delegate service.
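
The mechanism can be modeled outside Go. The sketch below is a Python model of the pattern, not the Harness code: each derived context keeps a reference to its parent, as Go's derived contexts do, so reassigning the loop's context variable to a child of itself builds a chain that grows with every iteration. `Context`, `with_value`, `chain_depth`, and the two worker functions are hypothetical names.

```python
class Context:
    """Minimal stand-in for a derived context that references its parent."""

    def __init__(self, parent=None, key=None, value=None):
        self.parent = parent
        self.key = key
        self.value = value


def with_value(parent, key, value):
    # Like a derived context, the child keeps its parent reachable.
    return Context(parent, key, value)


def chain_depth(ctx):
    """Count how many ancestors are reachable from ctx."""
    depth = 0
    while ctx.parent is not None:
        ctx = ctx.parent
        depth += 1
    return depth


def leaky_worker(base, jobs):
    ctx = base
    for job in jobs:
        # Bug pattern: ctx is derived from the previous iteration's ctx,
        # so every earlier context stays reachable through the chain.
        ctx = with_value(ctx, "job", job)
    return ctx


def fixed_worker(base, jobs):
    job_ctx = base
    for job in jobs:
        # Fix pattern: derive a fresh per-job context from the base each time.
        job_ctx = with_value(base, "job", job)
    return job_ctx
```

With 1,000 jobs, the context returned by `leaky_worker` has 1,000 ancestors that cannot be collected, while the one returned by `fixed_worker` has a single ancestor, the base context.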

Bridging the Gap Between Finance & Engineering: The Harness Playbook

Most cloud waste isn’t technical, it’s organizational. Harness brings finance and engineering together with FinOps practices that connect spend to outcomes, not blame. The result: 30%+ savings and alignment that scales. In too many cloud organizations, finance and engineering operate like two planets in orbit. Finance speaks in forecasts, budgets, billing codes. Engineering speaks in uptime, latency, error rates. The result?

How to Scale Prometheus APM for Modern Applications

When developers monitor application performance, they pick one of two paths: traditional APM tools with distributed tracing and code profilers, or metrics-driven monitoring with Prometheus. The second approach — Prometheus APM — tracks the signals that matter most: request rates, error rates, latency, and resource utilization. No agents to install, no per-host pricing, just exporters and PromQL. For most teams, Prometheus APM is where monitoring starts.
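
To make those signals concrete, here is a rough sketch of the arithmetic behind a request rate and an error ratio computed from two samples of monotonically increasing counters. It is plain Python rather than PromQL, it ignores the counter resets and extrapolation that Prometheus's `rate()` handles, and the `Sample` type and sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp_s: float   # scrape time in seconds
    requests: int        # total requests counter
    errors: int          # total failed requests counter


def per_second_rate(earlier: float, later: float, elapsed_s: float) -> float:
    """Average per-second increase of a counter between two samples."""
    return (later - earlier) / elapsed_s


def summarize(a: Sample, b: Sample) -> dict:
    elapsed = b.timestamp_s - a.timestamp_s
    request_rate = per_second_rate(a.requests, b.requests, elapsed)
    error_rate = per_second_rate(a.errors, b.errors, elapsed)
    return {
        "requests_per_s": request_rate,
        "errors_per_s": error_rate,
        # Share of requests in the window that failed.
        "error_ratio": error_rate / request_rate if request_rate else 0.0,
    }


# Made-up samples taken 60 seconds apart.
summary = summarize(Sample(0, 1_000, 10), Sample(60, 1_600, 40))
```

For these samples the window contains 600 requests and 30 errors, so the summary works out to 10 requests per second, 0.5 errors per second, and an error ratio of 5%.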

Console Connect Ecosystem Update October 2025

In this ecosystem update, we share details of the latest additions to the Console Connect platform. This includes our expansion into Thailand with 13 new enabled data centre locations plus new cloud on-ramps available globally, giving you greater flexibility and choice to connect worldwide. Thailand is one of Asia’s fastest-growing digital hubs.

Zero Ticket IT with Resolve: Automating Email Distribution List Management with RITA #agenticai

Say goodbye to service desk tickets for simple access requests. In this Zero Ticket IT demo, watch how RITA, Resolve’s AI-powered virtual agent, automates the process of adding and removing users from email distribution lists in seconds.

Cloud Cost Allocation Requires Both Ease And Power

From the time I joined CloudZero almost two years ago, one thing was unmistakably clear: the team here had built the most powerful cost allocation engine in the FinOps industry. It was also clear that this power came with something of a tradeoff. While engineers and technical FinOps practitioners loved what our YAML-based CostFormation engine could do, the experience wasn’t always simple for non-technical people getting their hands on the product for the first time.

The Challenges Of Allocation: Why Your Cloud Cost Dashboards Fail When You Need Them Most

Your CFO just asked why cloud costs jumped 40% last quarter. You pull up your dashboards, confident in your tagging strategy, until you realize 30% of your spend is labeled ‘unallocated’. This is the moment every FinOps team dreads. It reveals a hidden complexity that even mature organizations struggle with: the challenge of accurately allocating cloud costs in a way that’s both granular enough to be actionable and trusted enough to drive decisions.

How image generation models are creating new infrastructure demands for DevOps teams

The rapid adoption of generative AI has moved far beyond research labs and creative studios. Image generation models, in particular, have become critical components in content production pipelines, marketing platforms, design workflows, and enterprise applications. What began as a novel way to create digital art has evolved into a class of workloads that behave very differently from traditional web services.

Is It Time to Ditch Your VPS? Here's What Nobody Tells You

Let's be honest - if you're reading this, you've probably been riding the VPS train for a while. It's been good to you. Affordable, flexible, not too demanding. Like that reliable hatchback you drove through college. But now your project's grown. Traffic's up, your app's more complex, and you're starting to feel those little hiccups - slow response times, random crashes, maybe even a few panicked server reboots at 3 a.m. Sound familiar?

GitKraken Desktop 11.5: We Fixed What Mattered Most

GitKraken Desktop 11.5 delivers massive performance improvements where they count most, opening repos up to 5x faster, stash refreshes 100x faster, and branch/tag loading 100x faster. No workflow changes required. Just measurably faster Git operations that give you back your time and flow. Ready to see it in action? Check out the YouTube tutorial below. We need to talk about something that’s been frustrating many of you: performance.

Resolve's Agents of IT Podcast - Avro Chatterjee at Altus Group: Technology is Always Lightspeed

In this episode of Agents of IT, we’re joined by Avro Chatterjee, Director of Site Reliability Engineering and Tools at Altus Group, to unpack what it really takes to transform IT from a service provider into a strategic business partner.

Automating IT Configuration Monitoring with Puppet Enterprise

Discover how Puppet Enterprise simplifies configuration monitoring and ensures compliance with industry standards like NIS2, DORA, and ISO 27001. This video provides a quick overview of how to automate compliance checks, detect drift, and maintain a secure IT infrastructure effortlessly.

#050 - Data Protection and Kubernetes Resilience with Michael Cade & Julia Furst Morgado (Veeam)

In this episode, Itiel hosts Veeam experts Julia and Michael, who share their distinct paths into cloud-native technology. Julia discusses her transition from a background in law and marketing to becoming a CNCF ambassador and AWS container hero. Michael, a veteran who has been with Veeam for over 10 years, details his traditional sysadmin background (virtualization, storage) and the evolution of this role into platform engineering.

The Best Tools for Synthetic & Infrastructure Monitoring: A Comparative Guide

Both user and server-side monitoring are important to make your apps better. Tools that offer monitoring of just one side leave gaps in your diagnosis, causing negative experiences and reliability issues. Here are the top 10 tools you should consider based on their benefits and coverage.

Automating Expo app build delivery to QA with CircleCI and EAS webhooks

Manually sharing mobile app builds with Quality Assurance (QA) engineers can be a tedious and error-prone process. Developers often find themselves exporting .apk or .ipa files, uploading them to Google Drive or Dropbox, and then pinging the QA team on Slack to announce the upload, all while juggling deadlines and code reviews. This manual process not only slows down feedback cycles but also leaves room for human error, miscommunication, or outdated builds being tested.

Enhancing JFrog Internal Operations with Near Zero Downtime Migration

Data migrations have long been a significant source of anxiety for businesses and IT teams alike. The thought of moving critical databases often conjures images of prolonged downtime, service interruptions, and the ever-present risk of data loss. Indeed, statistics show that “90% of businesses experience unexpected downtime during database migrations, leading to significant revenue loss and customer dissatisfaction”.

Cloud parity, AI, and sovereignty: what we announced at Civo Navigate London 2025

If you could run public and private cloud like they were the same place, what would that unlock for your teams? That was the core message of our Navigate London keynote, and it matters just as much to readers in Singapore or Seattle as it does to those who joined us in person. For years, the industry has made you choose. Public cloud on one track. Private cloud on another. Different APIs, different skills, different bills, and a rewrite every time you move.

Why Cloud Managed Data Center Services Are Having A Moment

The obituary for the data center was written too soon. While the cloud dominates today’s IT headlines, traditional infrastructure hasn’t disappeared. It is evolving. Enterprises still rely on data centers for control, compliance, and reliability. However, they increasingly need the agility, scalability, and cost visibility that the cloud promises. Cloud managed data center services are bridging this gap.

Server Configuration Mistakes That Sabotage Automated Trading Performance

Most traders blame their strategies when EAs underperform in live markets compared to backtests. Yet in my experience analyzing hundreds of failed trading setups, roughly 60% of performance issues stem from server configuration problems rather than algorithmic flaws. During the recent volatility spikes around central bank announcements, I watched sophisticated grid trading systems collapse not because of poor logic, but because their hosting environments couldn't handle the computational load when it mattered most.

Real Estate App Development for Ops & Product Teams: From MVP to Scale

In the competitive world of real estate technology, developing an app that can scale from a Minimum Viable Product (MVP) to a fully-fledged solution is crucial. For operations and product teams, this journey involves strategic planning and execution to ensure the app meets evolving market demands and user expectations.

Testing AI Code in CI/CD Made Simple for Developers

Generative AI can produce code faster than humans, and developers feel more productive with it integrated into their IDEs. That productivity is only real if CI/CD tests are solid and automated. When AI-generated code is not appropriately tested, you may encounter a production issue that you haven’t seen before. According to the State of Software Delivery 2025 report, 67% of developers spend more time debugging and resolving security vulnerabilities in code generated by AI.

AWS CloudFormation Pricing Breakdown (And How To Save)

Nearly every industry today uses AWS for different services. Developers, cloud architects, DevOps engineers, and IT teams all use it to provision servers, databases, and storage. However, doing this service by service and then wiring them together can get messy. That’s where AWS CloudFormation comes in to save time, enforce consistency, and lower the risk of misconfigurations. But beyond simplifying infrastructure management, one big question remains: at what cost?

Introducing Dimension Studio: Easier, Faster Cost Allocation In CloudZero

Today, we’re making CloudZero even better with the launch of Dimension Studio, a major evolution in how CloudZero customers create and manage Dimensions — customizable “lenses” that allocate cloud and AI spend to relevant categories like products, features, teams, or customers, without relying on resource tags. At CloudZero, our mission has always been to help organizations make sense of their cloud and AI spend.

What is API-First Networking?

When you build your network with APIs at its core, you give your business a competitive edge. Here’s how to do it. Application Programming Interfaces (APIs) have become ubiquitous in modern networks for good reason. As companies use more service providers, endpoints, and software platforms, APIs help them get the most possible utility from their data with the least possible effort.

Chaos Engineering works, but it has to scale

Over the years, Chaos Engineering has proven its effectiveness time and time again, uncovering risks and saving companies millions they would have lost in painful, brand-impacting outages. But as Chaos Engineering adoption increased, we found organizations running into the same stumbling blocks when they tried to scale. Individual teams would get great results with Chaos Engineering, then stall as they tried to get more teams involved.

Redis Performance Monitoring: Combine Logs and Metrics for Complete Visibility

Redis earns its place in modern stacks because it’s an in-memory data store with microsecond latency and rich data structures, making it perfect for things like caching, sessions, and rate limiting. Since it often sits on the request path, small issues (connection churn, blocked commands, memory pressure) can quickly ripple into user-visible incidents.

Building and deploying a Python MCP server with FastMCP and CircleCI

Extending Large Language Models (LLMs) with custom tools has become increasingly valuable in today’s AI landscape. Model Context Protocol (MCP) servers provide a standardized way to connect external tools and resources to LLMs. This can enhance their capabilities beyond basic text generation. While thousands of pre-built MCP servers exist, creating your own allows you to address specific workflows. You can implement use cases that off-the-shelf solutions cannot handle.

GitKraken Desktop 11.5 Release: Performance Upgrade

Back to basics. Back to speed. GitKraken Desktop 11.5 is about fixing what was slowing you down. Large repos now open in seconds, stashes refresh instantly, and repos with thousands of refs load without breaking a sweat. Highlights: This release is about speed, reliability, and bringing GitKraken back to its core: a Git client that just works. And works fast. Welcome to 11.5.

GitKraken CLI Tricks You Need to Know!

Managing multiple repositories shouldn't mean running the same Git commands dozens of times. GitKraken CLI brings multi-repo actions, unified Git workflows, and standardization to command line developers who need bulk operations across their entire workspace. In this GitKon presentation, Louis Silvio (Software Engineer & Cloud Architect at GitKraken) demonstrates how the GitKraken CLI solves the context-switching chaos that slows down modern development teams.

Private Cloud: The Future of Cloud Sovereignty

For a long time, public cloud has been the default answer to scaling infrastructure, but it's not the only path forward. As more teams weigh the risks of vendor lock-in, data residency, and dependence on US-based providers, the conversation around private cloud has taken on new urgency. However, building on private infrastructure doesn't have to mean sacrificing flexibility.

How the Right Enterprise Process Automation Software Empowers Zero Ticket at Scale

When you hear "enterprise process automation software," your mind might jump to IT workflows or yelling “agent” into the void of an AI phone tree. But the reality is that this software is not just for IT, and neither is the Zero Ticket vision it enables. If your business has ever struggled with clunky internal handoffs, repetitive requests, or delays caused by manual approvals and triage, you’re dealing with the same inefficiencies that Zero Ticket IT is built to eliminate.

Downtime on the Docket: The Death Sentence for Productivity in Legal Firms

When minutes matter, IT leaders need more than quick fixes; they need foresight. That’s where Teneo’s Managed DEX (Digital Experience Monitoring) comes in. Managed DEX is designed to detect what legal teams can’t afford to miss. It monitors for “ghost traffic” (those eerie, unexplained signals of abnormal network activity that often signal compromise or instability) and other anomalous device behaviors that can precede full-blown outages or cyber incidents.

4 Best Hostinger Alternatives Worth Switching To

Web hosting providers serve different needs based on website requirements, budget constraints, and technical specifications. Hostinger has gained attention for its budget-friendly pricing, though some website owners seek alternatives for various operational reasons. The following four hosting providers offer distinct features and pricing structures that may align better with specific hosting needs.

The 47-Day Certificate Ultimatum: How Browsers Broke the CA Cartel

For twenty years, Certificate Authorities ran the perfect protection racket. The CAs had a beautiful monopoly. Browsers needed them to keep users safe. Websites needed them to look legitimate. Everyone paid up, nobody asked too many questions. Then the cryptography of most certificates (SHA-1) got shattered, and the browsers realized they’d been played.

Announcing Cloud 66 Deploy v3

We’re excited to announce the rollout of Cloud 66 Deploy v3 — a major evolution of our deployment platform built from the ground up for flexibility, modern workloads, and future growth. While Deploy v2 has served many teams well over the years, Deploy v3 brings a fresh architecture and new capabilities designed for today’s cloud-native needs. Here’s what you need to know.

How Redgate's Foundry is Shaping the Future of Database Innovation with AI

Learn how Redgate’s Foundry drives AI innovation in database management - from intelligent monitoring and ML-based automation, to smarter SQL optimization. In today’s rapidly evolving database landscape, innovation is essential. With the rise of artificial intelligence (AI), machine learning (ML), and automation, database management is undergoing one of its most significant transformations in decades.

Automated RAG pipeline evaluation and benchmarking with RAGAS

Retrieval-Augmented Generation (RAG) pipelines have become an integral part of how Large Language Models (LLMs) access information beyond their training cutoff. These pipelines enable LLMs to deliver current, accurate, and grounded responses. By fetching relevant external documents, RAG mitigates common LLM challenges like factual inaccuracies and hallucinations. However, this methodology introduces a new complexity: evaluating RAG pipeline performance is particularly challenging.

Bitbucket Dynamic Pipelines Creation and Deployment | Bitbucket Blitz | Atlassian

In this video, I introduce Bitbucket's Dynamic Pipelines. By watching this video, you'll learn how to create a Dynamic Pipeline using Atlassian's Forge tool and deploy it to your Bitbucket Cloud site. About Atlassian: Behind every great human achievement, there is a team. From medicine and space travel to disaster response and pizza deliveries, we help teams all over the planet advance humanity through the power of software. Our mission is to help unleash the potential of every team.

How I Made N8N Reliable With VPS Hosting in Europe

I've hosted n8n pretty much everywhere. Render, Railway, DigitalOcean droplets, even a self-inflicted Docker swarm that still haunts my sleep. Every time, something broke in a way that made me question my career choices. Cron jobs stopped firing, memory usage climbed like a fever, and webhooks randomly died at 3 a.m. Eventually I gave up and moved everything to a plain VPS. That's when things stopped being... stupid. If you're thinking about doing the same, get an n8n VPS hosting plan and save yourself the pain.

NHibernate vs ADO.NET: Which Is Better for .NET Development?

NHibernate vs ADO.NET is the classic clash in .NET development: raw SQL muscle on one side, high-level abstraction on the other. One promises speed and precision, the other productivity and cleaner code. For most .NET teams, the real challenge here is determining which approach best suits their project’s scale, timeline, and goals. That choice directly influences database efficiency, developer productivity, and long-term stability.

[WEBINAR] From Blueprint to Production: A Live Workshop for Creating an MCP Server for Kubernetes

In this hands-on workshop, we covered how to build your own MCP server from scratch and connect it to AI tools like Cursor IDE or Claude Desktop. The first half is a live coding session you can follow along with to set up an MCP server for Kubernetes troubleshooting. In the second half, we take you behind the scenes at Komodor to show how we built our MCP Server MVP: a powerful bridge between AI assistants and Kubernetes infrastructure. This is just part of the 'magic' that helps the Klaudia agentic AI technology power Komodor's AI SRE Platform.

How to Use Synthetic Monitoring in CI/CD Pipelines

CI/CD pipelines are the heartbeat of modern software delivery. They automate builds, run unit tests, package applications, and deploy them to production with a speed that traditional release cycles could never match. For engineering teams under pressure to move fast, pipelines are the mechanism that makes agility possible.

FluxCD vs. ArgoCD: Why Qovery is the Better Way to Do GitOps

Dive into the ultimate FluxCD vs. ArgoCD debate! Learn the differences between these top GitOps tools (CLI vs. UI, toolkit vs. platform) and discover a third path: Qovery, the DevOps automation platform that abstracts away Kubernetes complexity, handles infrastructure, and lets you ship code faster.

Fargate Simplicity vs. Kubernetes Power: Where Does Your Scaling Company Land?

Is Fargate too simple or Kubernetes too complex for your scale-up? Compare AWS Fargate vs. EKS on cost, control, and complexity. Then, see how Qovery automates Kubernetes, giving you its power without the operational headache or steep learning curve.

The Rule Of 40: How To Calculate And Use It For SaaS

Many SaaS businesses prioritize customer acquisition and retention over increasing gross margins, particularly during the startup and scale-up stages. It makes sense. A company can accelerate its revenue growth by acquiring and retaining more new customers, rather than simply selling more to existing ones. Yet, here’s the thing. Revenue growth measures the increase in the amount of money a business earns from sales.

Switching from Jenkins to Bitbucket Pipelines | Bitbucket | Atlassian

This webinar presents the case of a customer who migrated from Bitbucket Data Center and Jenkins to Bitbucket Cloud and Bitbucket Pipelines. The customer migrated approximately 90 repositories and significantly reduced their operating costs. The webinar also briefly introduces the Atlassian migration tool for Jenkins that can convert Jenkinsfiles to bitbucket-pipelines.yml files.

Observability vs. Visibility: What's the Difference?

In modern IT systems—distributed services, cloud-native platforms, and dynamic networks—just knowing that something is “up” isn’t enough. Green checkmarks on dashboards don’t tell you why performance shifted, why latency crept in, or why a perfectly healthy-looking service suddenly failed. This is where the conversation around visibility and observability begins. They sound similar, but they solve very different problems.

The Hidden Cost of Running Cloud-Hosted SD-WAN for IaaS

There are three common ways to connect your branch locations to the cloud. We break down the benefits and limitations of each. Many enterprises are executing a strategy of cloud services first, followed by application modernization – including SD-WAN. Moving toward a cloud-native architecture using containerization, microservices, or serverless architectures like SD-WAN can lower costs over time, increase scalability, reduce development cycles, and speed up time to market.

Data Integrity in a Database: How to Ensure Accuracy and Security

Data integrity in a database is the backbone of accurate decision-making, regulatory compliance, and organizational security. Without it, even the most advanced analytics or AI models are built on shaky ground. Imagine going through your database and noticing that the numbers in your sales report are inconsistent, financial data tells three different stories, and some customer records are missing. This mismatch can compromise business decisions and lead to missed opportunities.

7 ways AI agents are transforming software delivery

For most teams, the slowest part of delivery isn’t writing code, it’s everything that happens after: automated tests, manual reviews, bug fixes, final approvals, and the long wait for deployment. The longer these phases run, the more expensive and painful late fixes become. As AI makes it easier to generate code at scale, those bottlenecks only get bigger.

3 things you can do to get closer to five nines

5 minutes. That’s how much downtime some of the world’s largest enterprises will tolerate. For most organizations, five nines (99.999%) of availability sounds like a pipe dream. But the trick to increasing availability isn’t massive infrastructure spending or complex system redesigns. All it takes are three key practices that any team can adopt and implement. In this post, we’ll present these practices and how we implement them at Gremlin.
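As a rough illustration of where the "5 minutes" figure comes from, the sketch below converts an availability target into a yearly downtime budget. The function name `downtime_budget_minutes` and the choice of a 365.25-day average year are assumptions made for this example, not something taken from the post; 99.999% works out to a little over five minutes per year.

```python
# Downtime budget implied by an availability target (illustrative arithmetic only).
# Assumption: an average year of 365.25 days; the helper name is hypothetical.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three, four, and five nines side by side.
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% availability -> {budget:.2f} minutes of downtime per year")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why moving from four nines to five is usually far harder than the small change in percentage suggests.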

Frog-Proof Security: Streamlining The Sec In DevSecOps

What’s in store for Software Supply Chain security in 2026? With the types of software entering organizations ever-changing, and the volume ever-increasing, DevSecOps teams are facing new and complex questions at macro and micro levels: How can teams effectively control and curate what enters systems? How can remediation be accelerated while ensuring accuracy? How will the rising use of AI impact our threat landscape, and can DevOps and Security teams truly share ownership of this emerging reality without adding friction?

Qovery the new standard for DevOps automation

Qovery is a DevOps automation platform that lets developers deploy and scale applications on their cloud by simplifying and automating infrastructure so tech teams can focus on what matters most: building great products. Designed for modern and innovative companies, Qovery delivers the ease of a PaaS with the flexibility and control of your own cloud. Qovery takes the toil out of DevOps, freeing engineers to focus on what matters, while providing a refined experience and full control at every stage of scaling.

Code coverage standards for a Next.js project using CircleCI and Coveralls

An essential part of software development, testing helps catch bugs and errors early, improves software quality, and ultimately prevents costly issues from being deployed to production. The effectiveness of software testing remains uncertain until it can be measured, and that is where code coverage comes in. Code coverage is a metric that tells developers what portion of their codebase is executed when specific tests are run.

Why Jenkins might cost you 10x more than Bitbucket Pipelines

Any company still running their own CI/CD, such as Jenkins, is paying a price. The question is which one: Each of these is extremely costly, and yet many software teams still host and run their own CI/CD for two reasons: tools like Jenkins are cheap or free, and hosting Jenkins on an AWS EC2 instance is often cheaper per minute than SaaS CI/CD services like Bitbucket Pipelines. Both reasons are true, but they’re also a trap.

Best Practices for Data Centre Migration: A Risk-Aware Guide for IT Leaders

When a data centre migration is executed well, it enables growth and strengthens resilience. When it is not, the consequences are immediate: service downtime, compliance breaches, and operational disruption that affects both clients and internal teams. For IT leaders, the pressure lies in modernising infrastructure without compromising continuity.

Why Now Is The Time To Put In A 2026 Budget Request For Cost Management Software

As the Senior Manager of Finance & Accounting here at CloudZero, and with a career in FP&A that includes tenures at large public companies, I’ve spent a significant amount of time observing the interactions between the folks who plan the company’s budget and those who spend it. While engineering and operations teams focus on the execution side, my team ensures that the company has the resources required to make each endeavor a success.

Introduction to Facts in Puppet

Unlock the full potential of your Puppet infrastructure with a deep dive into Puppet Facts. This guide explains how Puppet uses facts to gather real-time data, making your code more dynamic, adaptable, and aware of the systems it manages. Discover the different types of facts, from simple structured data to secure trusted facts, and learn how to leverage them for enhanced visibility and standardization across your entire estate.

OTel Naming Best Practices for Spans, Attributes, and Metrics

An incident’s in progress. Services are slow, customers are frustrated, and your dashboards… look fine. At least, until you search for payment metrics and get 47 different names for the same signal. Suddenly, the real issue isn’t latency — it’s inconsistency. The OpenTelemetry project recently published a three-part series on naming conventions to solve exactly this problem.

Middle Mile Networks: What They Are and How to Use Them

Whether it’s streaming video, powering remote work, or supporting smart technologies, the ability to connect local users to the global internet is essential. But behind the scenes, a key infrastructure layer ensures that this digital experience runs smoothly: the middle mile. Middle mile networks serve as the critical bridge between the internet’s backbone and the local networks that deliver service to homes, businesses, and institutions.

Easiest Way to Ship Docker & Nginx Logs to Loki with Promtail

Effective monitoring catches problems before users do, and with Promtail, Loki, and LogQL, it’s a lightweight, approachable option for any DevOps team. This guide shows how to monitor Docker itself (pull failures, restarts, health flaps) so you’ve got a baseline on container runtime health.

How to Adopt Chaos Engineering

Modern systems are more complex, and more fragile, than ever before. Whether it's scaling challenges, dependency failures, or unpredictable outages, reliability is no longer optional. It's a competitive edge. This eBook provides a practical blueprint for successfully adopting Chaos Engineering, with strategies proven to work across engineering, SRE, and QA teams. Learn how to overcome internal blockers, align ownership, and embed resilience testing directly into your software delivery lifecycle.