Operations | Monitoring | ITSM | DevOps | Cloud

Sponsored Post

Bunnyshell: The Best Qovery Alternative for Modern Cloud Deployment

When it comes to automating DevOps processes and managing cloud environments, both Bunnyshell and Qovery have gained significant traction. While both platforms provide multi-cloud deployment, automation, and developer-friendly interfaces, there are key differences between the two. Bunnyshell offers several advanced features that make it a standout alternative to Qovery, particularly for teams seeking more control, flexibility, and efficiency in managing their cloud environments.

Top 100 SQL Interview Questions and Answers for Every Skill Level

Imagine spending weeks grinding through SQL tutorials, practicing syntax, and refining your queries in your spare time. But when the interviewer throws a complex JOIN + subquery at you, your brain suddenly hits a 404. It happens to the best of us. However, most candidates crash and burn—not because they lack knowledge, but because they prepare like students instead of engineers. They memorize syntax, pray for predictable questions, and freeze the moment they’re asked to think, not recite.

SQL Aggregate Functions: Syntax, Use Cases, And Examples

Without SQL aggregate functions, databases would be nothing more than glorified spreadsheets. These functions, SUM(), COUNT(), AVG(), MIN(), and MAX(), do the heavy lifting in analytics. They process massive datasets, transforming scattered records into clear, usable insights that drive decision-making. But while aggregation simplifies data, it can also slow queries down.
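As a minimal sketch of the five functions named above, the following runs them together over a small in-memory SQLite table. The `orders` table, its columns, and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10.0), ("alice", 30.0), ("bob", 5.0)],
)

# One result row per customer, with all five aggregates computed per group.
rows = conn.execute(
    """
    SELECT customer, COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount)
    FROM orders
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for row in rows:
    print(row)
```

GROUP BY collapses the three input rows into one summary row per customer, which is the "scattered records into insights" step the teaser describes.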

Scale Anything: How Komodor Enhances Autoscaler Capabilities

Kubernetes autoscalers like Cluster Autoscaler (CAS) and Karpenter have evolved significantly to manage the sprawling Kubernetes ecosystem, which has grown far beyond a simple container orchestration platform to include a vast array of add-ons, operators, CRDs, and third-party integrations. These autoscalers play a crucial role in ensuring K8s workloads get the resources they need, precisely when they need them, without creating excess and waste.

Tidal's Integrated Suite: The Future of Cloud Migration

Cloud migration is a rapidly growing market, projected to reach $1.3 trillion in 2025 (Gartner). Yet, 74% of migrations exceed budget and timeline expectations (IDC). Why? Because cloud migration isn’t just a technical shift—it’s a strategic business transformation requiring coordinated planning across applications, infrastructure, networks, and finances.

Enrich your existing Datadog telemetry with custom metadata using Reference Tables

As your applications scale and generate more telemetry, it becomes increasingly difficult to sift through the data and analyze it against cost, business functions, and security measures. Logs, events, and other telemetry on their own may not include enough meaningful context or readable details, leading to slower troubleshooting, inefficient business processes, and higher costs.

Introducing the "retry" step failure strategy for Bitbucket Pipelines

Recently we introduced support for Failure Strategies, which allows developers to implement more powerful logic and control flow into their workflows. Today, we are excited to announce a new step failure strategy, retry, which can auto-retry your failed steps without requiring any input from the user.

GitKraken Desktop Release 11.0: GitKraken AI

In this release, GitKraken evolves into your development co-pilot with new AI features designed to save time and reduce toil. Learn how to use AI-generated commit explanations, commit message generation, and how to customize your AI model settings with OpenAI, Anthropic, or Gemini—all included with your subscription.

Ultimate Guide to Server Hardening for Kamal

Server hardening is the process of securing a server by reducing its attack surface. This means minimizing the number of potential entry points for attackers by disabling unnecessary services, applying patches, and enforcing strong security controls. Kamal deploys your applications, but you must harden your servers yourself. This guide covers the steps to take to secure a server. The basics of server security: patch often and restart often. Patching fixes known vulnerabilities.

Securing Software Supply Chains: New Research Highlights Industry Vulnerabilities

New IDC study, co-sponsored by Canonical and Google Cloud, reveals the challenges and opportunities for organizations securing their software supply chains. Today, Canonical and Google Cloud released findings from a joint research project conducted by the International Data Corporation (IDC) that sheds light on the critical challenges organizations face in securing their software supply chains. The report, “The State of Software Supply Chains.

The SaaS Magic Number: How To Calculate And Use It

You can assess your company’s financial health using a number of SaaS metrics, depending on the type of business you are in. Among the most useful is the SaaS Magic Number. So, why is it called the SaaS Magic Number, and how do you calculate it? And why is it so important to track your SaaS Magic Number regularly?
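One commonly cited formulation of the SaaS Magic Number is the quarter-over-quarter revenue increase, annualized, divided by the previous quarter's sales and marketing spend. The sketch below assumes that formulation, and the linked article may define it differently. The function name and the figures are hypothetical.

```python
def saas_magic_number(current_q_revenue, previous_q_revenue, previous_q_sm_spend):
    """Annualized quarterly revenue growth per unit of sales and marketing spend.

    Assumes the common formulation:
        (current quarter revenue - previous quarter revenue) * 4
        / previous quarter sales and marketing spend
    """
    if previous_q_sm_spend <= 0:
        raise ValueError("sales and marketing spend must be positive")
    return (current_q_revenue - previous_q_revenue) * 4 / previous_q_sm_spend


# Hypothetical figures: revenue grew from 1.0M to 1.2M after 800k of spend.
magic = saas_magic_number(1_200_000, 1_000_000, 800_000)
print(magic)
```

With these figures, each unit of sales and marketing spend returns one unit of annualized new revenue.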

Optimizing Kubernetes node resources: How to avoid exhaustion and improve performance

Kubernetes automates the deployment and management of containerized applications relatively efficiently, yet resource exhaustion at the node level remains a critical issue. When a node runs low on resources such as CPU, memory, or storage, its workloads may suffer failures, degraded performance, and eviction.

Smarter Workflows. Built-In AI. Better Developer Experience.

Software development is changing fast, and here at GitKraken, we’re excited to be at the forefront. We’re entering a new era—one where AI is helping developers rather than trying to replace them. Our goal is to put that power to work for developers—embedded in your workflows, on your terms. We’ve been listening closely to feedback from our community—developers, team leads, engineering managers, and our enterprise customers.

Dapper vs. Entity Framework Core: Which One Is Right for Your .NET Project?

“Dapper is pure speed—EF Core is bloated,” or “Dapper is a nightmare—EF Core keeps things scalable.” We’ve all heard it. Both sides make a point, but neither tells the whole story. The real question? Which one will save you from a world of pain six months from now? Dapper gives you raw SQL execution and total control, but you’re on your own when managing transactions, relationships, and migrations.

Python Loguru: The Logging Cheat Code You Need in Your Life

Debugging is rarely anyone's idea of a good time. You're cruising along, building something cool, when suddenly your code breaks and you're stuck digging through console outputs that look like they were written by a robot having an existential crisis. Enter Loguru – the Python logging library that feels like it was built for humans, not machines.

Load Balancing VMware Horizon's UDP and TCP Traffic: A Guide with HAProxy

If you’ve worked with VMware Horizon (now Omnissa Horizon), you know it’s a common way for enterprise users to connect to remote desktops. But for IT engineers and DevOps teams? It’s a whole different story. Horizon’s custom protocols and complex connection requirements make load balancing a bit tricky. With its recent sale to Omnissa, the technology hasn’t changed—but neither has the headache of managing it effectively.

SUSE Rancher Prime Meets Cluster API: From theory to practice

If you’re new to Kubernetes or looking to modernize your cluster management workflows, Cluster API and SUSE Rancher Prime make it easier than ever to provision and manage clusters declaratively. This guide walks you through enabling Cluster API in SUSE Rancher Prime, deploying your first cluster and exploring advanced features like GitOps. It also points to helpful documentation and lists a few prerequisites for the hands-on walkthrough.

A Guide To GCP Regions (And How They Affect Your Costs)

Google Cloud Platform (GCP) launched in April 2008 with Google App Engine. This developer-centric Platform as a Service (PaaS) offering allowed developers to build and host web applications on Google’s cloud infrastructure. Initially, App Engine only supported Python, but in 2009, Google added Java support, offering more programming flexibility. In 2010, GCP expanded further with Cloud Storage, its second major cloud product.

Native Signing Support In Cloudsmith Extended To Docker, Nuget, And Swift

Breaches in software artifact integrity can have severe consequences. Bad actors poison artifacts by injecting malicious code into software packages, libraries, or container images, tricking developers and users into downloading compromised artifacts. These attacks can lead to data breaches, system takeovers, and widespread supply chain disruptions. Continued artifact poisoning incidents highlight the increasing risk to software supply chains.

Charmed Kubeflow 1.10

Charmed Kubeflow 1.10 is almost here! Join the live stream with Canonical’s team to get the latest news about the changes, features and integrations available. Together with Stefano Fioravanzo, Manos Vlassis and Kimonas Sotirchos from the product engineering team, we will go live to talk about the features in Kubeflow 1.10 and analyze the main differences between the upstream release and Canonical’s distribution.

Rancher Live: Cloud native sustainability footprint measurement

Measuring the sustainability footprint of software, cloud native or otherwise, is not easy. Learn how CNCF's Environmental Sustainability Technical Advisory Group plans to tackle this through the Green Reviews Working Group by joining Divya Mohan and her guest, Antonio Di Turi, on March 27th.

Your guide to cloud modernization: from basics to best practices

Being in the cloud is table stakes in 2025, and simply lifting and shifting workloads to the cloud barely scratches the surface of what's possible. Yet many engineering teams find themselves managing glorified data centers in the cloud—missing out on key capabilities like auto-scaling, serverless computing, and cloud-native security features. Cloud modernization is the true game-changer for turning legacy systems into agile, scalable architectures.

11 DevOps Metrics & How an IDP Moves the Needle

Not too long ago, engineering organizations were in a headlong rush to “innovate or die” by hiring more developers and shipping more code at breakneck speed. This expansion brought short-term velocity gains—but also created a tangle of orphaned services, inconsistent processes, and developers spending more time firefighting than innovating. Today, the reality is that you can’t just keep throwing headcount at complexity.

16 Cloud migration risks and how to solve them

Cloud infrastructure powers everything from streaming services to enterprise databases, yet many organizations still run critical workloads on legacy systems. With the cloud migration market projected to reach $806.41 billion by 2029, growing at a CAGR of 28.24%, organizations of all sizes are racing to modernize their infrastructure and applications. This shift means companies need to stay competitive in a landscape where agility and scalability are non-negotiable.

MySQL Logs: Your Guide for Database Performance

MySQL logs are basically your database's diary – they record everything happening behind the scenes. Think of them as the black box of your database operations. You've got error logs showing you when things go sideways, query logs documenting every question asked of your database, and binary logs tracking changes like they're gossip in a small town.

How to Import Data From Other Software into QuickBooks Online

Struggling with CRM QuickBooks integration? Importing data into QuickBooks Online is essential for building a robust, integrated financial workflow. Whether you’re pulling customer info from external databases, syncing financial records, or automating sales pipeline updates and data migration, it’s a developer’s bread and butter. Yet, QuickBooks Online’s native import features often fall short, with rigid file formats, row limits, and no support for complex data relationships.

How to Choose the Right Database for Data Analytics

You start a query, grab a coffee, and come back to… a report that’s still loading. We’ve all been there. When your database wasn’t designed for analytics, even basic reporting can feel painfully slow. Databases aren’t one-size-fits-all, especially when it comes to analytics. The system that works fine for storing and retrieving customer transactions or app data isn’t necessarily built to process complex queries on massive datasets.

Best Data Management Solutions: Features, Pros, and Cons

Data management tools can either be your greatest ally or your biggest headache. Effective data management is the process of controlling and organizing your data assets to ensure their quality, security, and accessibility. The right data management solution simplifies the validation, storage, and processing of your data, transforming it into a reliable resource for accurate analysis, informed decision-making, and regulatory compliance.

What is Data Integration? Definition, Types, Examples & Use Cases

Given the flood of information and emerging cloud and big data technologies, more and more companies are now adopting data integration initiatives to analyze and act on their data more effectively. For modern companies seeking to improve their strategic decision-making and competitiveness, the meaning of data integration isn’t so simple. What is data integration? We’ll find out in this article!

Kubernetes Alternatives: What the Latest Search Trends are Signaling

Search is the signal. If you want a glimpse into where things are headed, just take a peek at this graph of search interest for Kubernetes Alternatives from the last few years. Cycle is a direct Kubernetes alternative, and having been part of building this company since 2018, I can tell you that this graph is so much more accurate than you can ever imagine. Until late 2021, talking about not using Kubernetes was met with an almost dogmatic intransigence.

Best Practices for Ephemeral Environments: Auto-Anonymizing PostgreSQL Data for AI & Devs

AI coding and development thrive in real, production-like environments—not just sandboxes with dummy data. But here’s the challenge: real data often includes sensitive info like PII, financials, or healthcare records, all governed by strict regulations like GDPR and HIPAA. How do you balance security, compliance, and speed? In this webinar, we dive into practical solutions with a live demo featuring Bunnyshell and Xata.io.

Redgate Software recognized as a Strong Performer in the 2024 Gartner Peer Insights Voice of the Customer for Integrated Development Environment Software

Customer reviews play a key role in weighing up whether to make a purchase or not. From booking a hotel to choosing a piece of technology or software, most of us would usually check out the reviews before going ahead. With this in mind, we’re excited to announce that Redgate Software has been named a Strong Performer in the 2024 Gartner Peer Insights Voice of the Customer for Integrated Development Environment Software.

Optimizing Every Layer: From Cloud to On-Premises

As digital infrastructures become more complex, businesses need an agile, unified platform that spans traditional on-premises systems to modern cloud-native environments. At Virtana, our latest feature updates across Global View, Container Observability, and Infrastructure Observability are designed to empower you to optimize every layer of your IT ecosystem.

Open source enterprise application security remains a challenge despite greater patching efforts, IDC research reveals

The latest report from the International Data Corporation (IDC), co-sponsored by Canonical and Google Cloud, indicates that 36% of organizations adopt open source to improve development velocity, and 7 in 10 organizations see open source as extremely important for running mission-critical workloads. However, as open source adoption grows, organizations face increasing difficulty in securing and maintaining their software supply chains.

Hands-on guide to microservices unit testing with CI/CD

As microservices architectures dominate modern application development, the ability to test, secure, and automate their deployment has become a vital skill. In this guide, you’ll learn how to do that. Let’s first set the stage by briefly exploring the foundational concepts of CI/CD and DevOps, which underpin the automation and agility required in development workflows.

Connect to PostgreSQL using EF Core

In this tutorial, we'll walk you through the process of creating an Entity Framework Core model and connecting a .NET console application to a PostgreSQL database. We'll guide you through each step, from setting up your connection to retrieving data from your PostgreSQL tables. Additionally, we'll explain how to install the necessary components, including dotConnect for PostgreSQL, and demonstrate how to use Entity Developer to generate your model.

Microsoft 365 Backup 2026: What It Covers, What It Costs, and Where the Gaps Are

Protecting your organization’s data within Microsoft 365 is crucial to ensure business continuity, compliance, and resilience against threats like accidental deletions, cyberattacks, and data corruption. Implementing a comprehensive backup strategy safeguards your critical information and facilitates rapid recovery when needed.

Connect to MySQL using EF Core

In this tutorial, we'll walk you through the process of creating an Entity Framework Core model and connecting a .NET console application to a MySQL database. We'll guide you through each step, from setting up your connection to retrieving data from your MySQL tables. Additionally, we'll explain how to install the necessary components, including dotConnect for MySQL, and demonstrate how to use Entity Developer to generate your model.

#039 - Banking on Kubernetes: From Fintech Frontier to Regulatory Reality with Kasper Nissen (Dash0)

In this episode, Itiel sits down with Kasper, a Developer Advocate at Dash0. Kasper discusses his background, including his extensive experience building platforms on Kubernetes at Lunar, a Nordic challenger bank. He shares insights into the challenges and successes of using cutting-edge cloud-native technologies in a regulated banking environment. Kasper also touches upon his involvement in the CNCF community as an ambassador and co-chair for KubeCon. The discussion explores Lunar's philosophy of leveraging open-source projects and their journey of adopting Kubernetes and other cloud-native tools.

Connect to SQLite using EF Core

In this tutorial, we’ll walk you through how to create an Entity Framework Core model and connect a .NET console application to a SQLite database. We’ll cover how to set up your connection, install the necessary packages, and use Entity Developer to generate a model based on your SQLite database. By the end of this video, you’ll be able to retrieve data from your SQLite tables directly into your .NET console application.

Connect to Oracle using EF Core

In this tutorial, we’ll demonstrate how to create an Entity Framework Core model and connect a .NET console application to an Oracle database. We’ll guide you step by step, showing how to set up your connection, install the necessary components, and use Entity Developer to generate a model based on your Oracle database. By the end of this video, you'll have a functional setup to interact with Oracle tables directly within your .NET console application.

The Complete Guide to Runbook Automation Tools in the Era of Real-Time IT

When it comes to handling routine IT tasks, runbook automation has long played a central role. Traditionally designed to schedule and execute jobs across systems like ERP and CRM platforms, these tools were essential in an era when batch processing and time-based triggers ruled the day. But the world has changed. Modern IT environments demand real-time responsiveness, intelligent automation, and event-driven execution.

An Easy and Comprehensive Guide to Prometheus API

Monitoring is the backbone of any reliable DevOps setup. And if you’re working with monitoring, you’ve likely used Prometheus. This open-source powerhouse has redefined how we track system performance, but are you making the most of its API? Prometheus is the go-to solution for monitoring container-based environments, particularly in Kubernetes. Its pull-based model and flexible query language provide deep visibility into your systems.
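As a small illustration of the API mentioned above, Prometheus serves instant queries over HTTP at `/api/v1/query`, taking the PromQL expression in a `query` parameter and an optional evaluation `time`. The sketch below only builds the request URL and sends nothing. The server address `http://localhost:9090` and the helper name are assumptions.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql, time=None):
    """Build the URL for a Prometheus instant query (GET /api/v1/query)."""
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (RFC 3339 or Unix seconds).
        params["time"] = time
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Assumed local server address; "up" is a built-in per-target health metric.
url = instant_query_url("http://localhost:9090", "up")
print(url)
```

Fetching that URL with any HTTP client returns a JSON document whose `status` field reports success or error, with the samples under `data`.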

21 PromQL Tricks Every Developer Should Know

So you've got Prometheus up and running, but now you're scratching your head looking at those queries. PromQL (Prometheus Query Language) looks simple on the surface, but it packs some serious power once you know how to wield it. Whether you're debugging production issues at 2 AM or building dashboards that actually tell you something useful, these PromQL tricks will upgrade your monitoring game.

The power of virtual mesh: 5 key benefits of Console Connect's CloudRouter

As technology advances, managing networks becomes increasingly complex. Enterprises need a network that is agile, resilient, and easy to scale. This is where Layer 3 virtual routers, like Console Connect’s CloudRouter, provide an ideal solution for connecting multiple network endpoints, such as clouds, data centres and devices.

The Importance of OEM Maintenance for Prolonging the Life of IT Equipment

In the fast-paced world of technology, where new developments emerge almost daily, the maintenance and upkeep of IT equipment can easily be overlooked. However, maintaining this equipment is crucial, particularly through Original Equipment Manufacturer (OEM) services, to ensure longevity and optimal performance. This article delves into the importance of OEM maintenance, highlighting how it can significantly prolong the life of your IT equipment.

Newly Updated Delphi Data Access Components With Support for RAD Studio 64-bit IDE, RAD Studio 12.3, and Lazarus 3.8

Excited for an enhanced experience with Delphi Data Access Components? The latest release is here, now supporting the most recent versions of RAD Studio and Lazarus along with a bunch of improvements and impressive new features. From now on, Delphi Data Access Components support the newly introduced RAD Studio 64-bit IDE. So, if you have switched to it or are planning to do so, you can use our components without having to worry about possible compatibility issues.
Sponsored Post

Incident Response Process: Stages, Framework & Best Practices

These days, organizations must be prepared to handle unexpected disruptions efficiently. Whether it's a cybersecurity breach, system failure, or a natural disaster, having a structured Incident Management Process is essential. The Incident Management Team plays a crucial role in swiftly identifying, assessing, and resolving incidents, minimizing downtime, and ensuring business continuity. This blog explores the stages, framework, and best practices of incident management to help businesses build a robust response system.

Newly Updated dbExpress, SecureBridge, and EntityDAC With Support for RAD Studio 64-bit IDE, RAD Studio 12.3, and Lazarus 3.8

Looking forward to another milestone in our powerful Data Connectivity product line, which includes, but is not limited to, dbExpress, SecureBridge, and EntityDAC? Here we go with another release that features support for the pioneering RAD Studio 64-bit IDE, as well as the latest versions of RAD Studio and Lazarus. This update also includes new options and highly anticipated demos!

LightMesh + AWS: Secure Subnet Discovery and Unified IPAM at Scale

LightMesh now integrates with AWS, providing automated discovery and unified management of your cloud networking resources. This powerful integration eliminates fragmented visibility across VPCs and regions, giving you real-time insights into your entire AWS infrastructure through a single, intuitive platform.

A Guide to Retiring SDH & TDM Networks

The telecommunications industry is witnessing a significant transformation as operators realize the urgent need to retire their legacy Synchronous Digital Hierarchy (SDH) and TDM networks. With the advances of packet networks, this migration is not just a necessary transition but a strategic opportunity to enhance service offerings and operational efficiencies.

5 Shortcomings of Traditional IT CMDBs and Why You Need a Data Center CMDB

Many organizations use a traditional IT CMDB for tracking IT assets, but relying solely on it can create blind spots in data center operations. A Data Center CMDB is purpose-built to address these challenges, offering deeper visibility, real-time insights, and operational efficiency. It is crucial to understand the difference between the two CMDBs and why switching to a Data Center CMDB could result in more optimization and efficiency.

Top 7 Microservices Monitoring Tools to Consider in 2025

Let's talk about keeping those microservices in check. If you're running a distributed system (and who isn't these days?), you know the drill – more services mean more potential failure points. We've got the lowdown on the best microservices monitoring tools that'll have your back in 2025.

RabbitMQ Logs: Monitoring, Troubleshooting & Configuration

If your RabbitMQ queues keep growing and you have no idea why, or if messages aren’t getting picked up like they should, logs can save you a lot of guesswork. They’re basically a detailed record of what’s happening behind the scenes. This guide breaks down where to find RabbitMQ logs, how to set them up, and what to look for when things start acting up. Consider it your go-to cheat sheet for keeping RabbitMQ running smoothly.

Ubuntu Crash Logs: Find, Fix, and Prevent System Failures

If your system keeps crashing and you have no clue why, Ubuntu’s crash logs might have the answers. Whether you’re running a production server or just trying to keep your personal setup stable, these logs tell you exactly what went wrong. Instead of sifting through endless system logs, Ubuntu gives you focused crash reports—kind of like a security camera that only records when something breaks. Let’s break down where to find these logs and how to make sense of them.

What are Query Executions? | The Tony and Tonie Show

Intermittent database problems are much easier to resolve when DBAs can pinpoint exactly when they occur and under what conditions. With Query Executions, Redgate Monitor captures execution details for individual queries, helping DBAs identify the precise circumstances that cause resource contention and performance problems.

Git GUI Showdown: Top Git Clients Compared - Best Git GUI 2025

Which Git GUI is right for you? Microsoft MVP and GitKraken Ambassador Kevin Bost compares GitHub Desktop, SourceTree, Fork, Tower, SmartGit, Sublime Merge, Git Butler, and GitKraken in this hands-on walkthrough. Find out which client fits your workflow best and why GitKraken stands out. Special thanks to GitKraken Ambassador @Kitokeboo.

SUSE Virtualization - Enforcing Admission Resource Integrity With Validating Admission Policy

Blog written by: Ivan Sim. With more enterprises using SUSE Virtualization (formerly Harvester) as the bedrock virtualization platform to host their modern cloud-native AI and edge workloads, it’s important that the platform provides seamless built-in guardrails to validate and sanitize resources admitted into the environment.

What is Argo CD?

Argo CD is a declarative continuous delivery (CD) tool for Kubernetes. Argo CD pulls Kubernetes configurations (such as manifests, Helm charts, and Kustomize overlays) from a Git repository and applies them to a Kubernetes cluster. With Argo CD, developers can automatically deploy changes to their Kubernetes environments by updating their Git repository. Argo CD continuously monitors Kubernetes deployments and ensures their state matches the configuration declared in Git.
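The pull-and-reconcile idea described above can be sketched as a toy loop over plain dictionaries. This is not Argo CD's API or implementation: `desired` stands in for the configuration declared in Git, `live` for the cluster state, and all names are hypothetical. Deleting resources that are absent from Git is done unconditionally here as a simplifying assumption.

```python
def diff_state(desired, live):
    """Return resources to apply (missing or changed) and names to prune."""
    to_apply = {name: spec for name, spec in desired.items() if live.get(name) != spec}
    to_prune = [name for name in live if name not in desired]
    return to_apply, to_prune


def sync(desired, live):
    """Bring the live state in line with the desired state, in place."""
    to_apply, to_prune = diff_state(desired, live)
    live.update(to_apply)      # create or update drifted resources
    for name in to_prune:      # remove resources no longer declared
        del live[name]
    return live


# Hypothetical states: "web" has drifted, "api" is missing, "old-job" is stale.
desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
live = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}

print(diff_state(desired, live))
sync(desired, live)
print(live == desired)
```

Running the diff repeatedly and syncing whenever it is non-empty is the essence of continuous reconciliation: a change committed to the declared state eventually shows up in the live state without a manual deploy step.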

Global View: Optimizing Every Layer with Innovative New Capabilities for On-Premise

Managing a hybrid IT environment is more complex than ever, requiring real-time visibility, automation, and intelligent cost control across cloud and on-premises infrastructure. Virtana’s latest innovations help organizations streamline operations, optimize costs, and enhance security, validating that every layer of IT—whether in the cloud or on-prem—operates at peak efficiency.

AI On A Budget: Low-Cost Strategies For Running AI In The Cloud

AI costs can spiral out of control before you know it. One day you’re building an AI feature that promises to bring in a solid chunk of revenue for the company. The next day you’re obsessing over an astronomically high cloud bill that will significantly eat into your profits — or consume them entirely. To help you solve this problem, we brought in Jeremy Daly, Director of Research (and AI cost management guru) at CloudZero.

Container Observability: Optimizing Every Layer with Innovative New Capabilities for Kubernetes & Windows

Managing containerized workloads and Windows environments requires more than just basic monitoring—it demands deep observability to prevent performance bottlenecks, optimize costs, and accelerate troubleshooting. Virtana’s latest Container Observability enhancements provide IT teams with greater control, visibility, and analytics across Kubernetes and Windows-based workloads.

Infrastructure Observability: Optimizing Every Layer with Innovative New Capabilities

Modern IT environments are complex, spanning on-premises, cloud, and hybrid infrastructures. Without deep observability at every layer, performance bottlenecks, inefficiencies, and troubleshooting challenges can drain resources and impact business outcomes. Virtana’s latest Infrastructure Observability enhancements are designed to eliminate blind spots, automate performance tuning, and simplify IT operations.

Drift Away: The Hidden Risk of Large-Scale Kubernetes Environments

Configuration drift is a silent but persistent challenge in managing Kubernetes environments at scale. Whether you’re running workloads across multiple clusters in on-premises data centers, cloud providers, or edge locations, the risk of drift increases exponentially as environments grow. According to a Komodor survey, 40% of Kubernetes users report that configuration drift negatively impacts the stability of their environments.

Easiest Way to Monitor Your Java Application Using OpenTelemetry

When you're running a Java application, the JVM is doing a ton of work behind the scenes, but unless you're actively collecting its internal metrics, you're essentially flying blind. Fortunately, the JMX Prometheus Receiver paired with the JMX Java Exporter Agent offers one of the simplest and most effective ways to expose JVM performance data.

How to Mock OpenAI's APIs with Speedscale's ProxyMock

Developing APIs can often mean navigating a complex web of internal and external dependencies and murky network traffic. To build better software, developers need a stable target to test a query or feature against, and when that stability is lacking, development becomes more complicated and difficult. Enter API mocking. API mocking is an approach to generating a mock service that provides dependable data for a variety of testing purposes.

Automating API Mocks in Your CI Pipeline with proxymock

When running tests in a CI/CD pipeline, relying on external APIs can introduce instability, slow down execution, and even lead to failed builds due to rate limits or API downtime. Fortunately, proxymock provides a solution by capturing API interactions and running a local mock server, enabling fully isolated and repeatable tests. In this blog, we’ll demonstrate how to integrate proxymock into a GitHub Actions CI pipeline using a demo app called outerspace-go.

Can Data Centers Keep Up with AI? Hyperview CEO Breaks It Down

With the rise of AI, data centers are becoming a bigger part of our everyday lives. But this growing reliance also brings up serious concerns about energy use and the environment. On Ticker’s business news show, Hyperview CEO Jad Jebara discussed how AI is driving data center operators to rethink how they handle these energy-hungry workloads while working toward sustainability goals.

Overview of Ribbon's NPT 2714 Aggregation Router

The NPT 2714 is the latest addition to Ribbon’s NPT XDR2000 series of aggregation routers. It features an innovative, orthogonal architecture for modular, centralized routers, merging the best aspects of modular and fixed systems. This architecture combines the redundancy and I/O diversity of modular systems with the simplicity and cost-efficiency of fixed systems.

Machine learning vs AI: Key differences and how they work together

Machine learning (ML) and artificial intelligence (AI) are often used interchangeably in tech discussions, yet they represent distinct concepts with important differences. While AI refers to the broader field of creating machines capable of intelligent behavior that mimics human capabilities, machine learning is a specific subset of AI focused on developing algorithms that allow computers to learn from and make predictions based on data.

Uplink | Episode 2: Data Center Boom: Navigating the Hottest Industry in the World

How did data centers become one of the hottest industries on the planet? DataBank CEO Raul Martynek joins us to break down the explosive growth of digital infrastructure—and what it really takes to scale in the AI era. The data center industry stands at the epicenter of today's technology revolution, experiencing what industry veterans call "the best market ever seen." In this revealing conversation with DataBank's CEO, we dive deep into the forces driving unprecedented growth in digital infrastructure and why this sector has become perhaps the hottest industry globally.

Going beyond MTTx and measuring "good" incident management

We’ve chatted with hundreds of engineering teams, and a pattern keeps popping up: everyone’s tracking MTTx metrics (MTTR, MTTA, MTT-whatever), but when you ask, “Cool, so what are you doing with that?” …you get blank stares. And honestly, fair enough. Time-based metrics are easy.

Observability Pipeline: An Easy-to-Follow Guide for Engineers

You've got systems spitting out more logs, metrics, and traces than you can handle. Your monitoring costs are through the roof. And somehow, when something breaks at 3 AM, you still can't find the exact data you need. Sound familiar? Welcome to the observability pipeline conversation—no jargon, no fluff.

Announcing Densify's Latest Release: Smarter Kubernetes Automation, Built for the Enterprise

To coincide with KubeCon Europe 2025, we’re excited to announce the latest release of Densify’s Kubernetes optimization engine, Kubex, which delivers full-stack resource management and seamless automated resource optimization at enterprise scale. This release delivers the advanced controls enterprises have been asking for, without sacrificing the intelligence and precision that set Densify apart.

Zero Code Instrumentation: The Missing Link in Observability

Have you ever struggled with systems that fail to tell you what went wrong? The kind where you’re digging through logs at 2 AM while alerts keep piling up. In DevOps, clear visibility into your applications isn’t a luxury—it’s essential. This is where instrumentation without code changes can help. It simplifies observability, reducing the manual effort needed to track down issues. If you haven’t explored it yet, you might be making troubleshooting harder than it needs to be.

How DCI to the Edge will be important for AI

AI continues to dominate the networking landscape in 2025, but what impact will it have on future developments? The network edge is expected to play an increasingly significant role as AI model training in the core transitions to inference at the edge. This webinar will discuss the importance of the edge in AI advancements and explore how service providers can leverage their prime assets—space, power, and network infrastructure—to support this rapidly growing application.

How to Revolutionize Your NOC with the Resolve Capabilities Model

Network operations centers (NOCs) are under increasing pressure to improve efficiency, reduce downtime, and accelerate incident resolution. Yet many struggle to define a clear NOC automation strategy. When your team is deep in the trenches, it’s easy to lose sight of the long term. The answers lie within our framework: the Resolve Capabilities Model. This model was developed in tandem with customer conversations about NOC automation, aspirations, broader tech challenges, and more.

Protecting against Next.js middleware vulnerability CVE-2025-29927 with HAProxy

A recently discovered security vulnerability requires attention from development teams using Next.js in production environments. Let’s discuss the vulnerability and look at a practical HAProxy solution that you can implement with just a single line of configuration. This solution is easy, safe, and incredibly fast to deploy while you plan more comprehensive framework updates.

Software Trends - Cycle Looks at DevOps & Platform Engineering

Software engineering is trending, and the latest fads come and go with passionate adoption. Remember the OpenStack craze? Kubernetes? On-prem vs. cloud? I could go on and on, but one trend in software engineering that has surfaced in the last few years is DevOps & Platform Engineering.

CLI Tool for Monitoring for Key System Metrics - Here's How It Works!

At MetricFire, we’re always looking for ways to make monitoring more efficient and accessible. That’s why we’re excited to introduce the MetricFire HG-CLI, our new command-line tool designed to make setting up server monitoring faster and easier than ever. Just like our Hosted Graphite service, the HG-CLI is built on open-source flexibility while focusing on simplicity, eliminating the hassle of manual configurations and streamlining the onboarding process for teams of all sizes.

NPT 2714 Hardware Overview

The NPT 2714 is a high-capacity, fully redundant IP aggregation router designed by Ribbon, featuring a unique architecture that combines the modular capabilities of traditional systems with the efficiency and simplicity of fixed systems. The innovative design of the NPT 2714 enables operational continuity, allowing for flexible upgrades and expansion without service interruptions. Key features include in-service upgradeability from 7.2 to 14.4 terabits per second, nine front-accessible I/O cards, and various operational interfaces.

How Motive achieves 99.99% reliability with Rootly

In the high-stakes world of fleet management, reliability isn’t a nice-to-have—it’s a necessity. That’s why Motive has invested heavily in tools and processes to ensure its systems run smoothly for over 150,000 customers and more than a million vehicles. At the center of its ability to deliver 99.99% uptime at scale is Rootly.

Are AI and Platforms Making SRE Obsolete? With Kaspar von Grünberg, Humanitec's CEO

Last year, over 89% of companies claimed to have adopted platform engineering. And, in the past month, LLMs have been disrupting how we think about software development. In this context, Kaspar asks whether the role of Site Reliability Engineers as we know it is becoming obsolete. Kaspar argues that while SREs aren’t going anywhere, their responsibilities are evolving, and fast. We talk about.

What is Application Security (AppSec)?

The cybersecurity world has changed. Thanks to the spreading risk of cyber attacks, malware, ransomware, and the intensifying pressure of new cybersecurity regulations and sky-high penalties for leaks and breaches, robust Application Security (AppSec) is non-negotiable. In this blog, you’ll learn how you can meet these challenges head on, and secure your operations and systems by focusing on the most fundamental aspects of your security posture.

From Conflicts to Control: The Case for Virtual Clusters in Kubernetes

Managing multiple teams in Kubernetes can feel like juggling too many balls at once. Have you ever struggled with resource conflicts, security risks, or simply keeping everything running smoothly when everyone shares the same cluster? If so, you’re not alone. Let’s dive into how virtual clusters can transform this chaos into a well-orchestrated symphony.

7 tips for effective system prompting: A developer's guide to building better AI applications

As AI becomes increasingly central to modern software development, the ability to craft effective system prompts has emerged as a crucial skill. Whether you’re building a code generation tool, creating a chatbot, or developing AI-powered features, your success largely depends on how well you can communicate with AI models through prompts. At CircleCI, we’ve spent countless hours working with developers who are integrating AI into their applications.

What is Git Checkout Remote Branch? Benefits, Best Practices & More

Git is a terrific tool that many developers use to keep track of their projects’ versions. Although there are many different version control systems, Git is by far the most used, and for good reasons: its focus on distributed development and the ease with which branches can be used.

AI Cost Optimization Strategies For AI-First Organizations

Not long ago, our co-founder and CTO, Erik Peterson, shared some insights on AI spending. He shared how AI costs currently fall under the write-off-friendly world of R&D. He also acknowledged why DevOps teams might feel it’s too early to start optimizing AI costs. As the saying goes, “Premature optimization is the root of all evil.” But after more than a decade of software development, Erik knows that eventually, research, experimentation, and big ideas need to deliver real returns.

Top 5 dashboards for DevOps leaders

If you are a DevOps manager, you will be keenly aware that the role involves managing multiple toolchains across different clouds, platforms and environments. You also need to report on KPIs, DORA metrics, governance, security and a lot more. At SquaredUp, we understand these demands and have developed a suite of plugins and ready-to-run dashboards to help you reduce toil as well as pull all of your key analytics together within a single pane of glass.

Concept Demo: Enhanced Multi-tasking in Mattermost

In this demo, we’re showcasing an early prototype of a new multi-window experience to up-level multi-tasking in Mattermost. We showcase how teams can quickly pivot between channels and threads—without losing context—while maintaining mission-critical communication with focus and precision. Note: This is a conceptual demo. The capability has not yet been released in Mattermost. Contact our Fast Futures team at fastfutures@mattermost.com to share your thoughts on this demo.

Why industry leaders trust Canonical & Ubuntu #ubuntu #opensource #ai #shorts

Henri Parmentier, ADLINK: "We've worked with Canonical since 2019. Ubuntu gives us peace of mind with proactive maintenance, testing, and validation, ensuring stable, secure products." Thomas Kastner, Advantech: "Ubuntu combines the power of a vast open-source community with Canonical's rigorous testing to deliver a stable, production-ready product." How has Ubuntu helped you scale your IoT solutions?

How to Monitor JVM with OpenTelemetry and MetricFire

When you're running a Java application, the JVM is doing a ton of work behind the scenes, but unless you're monitoring those internals, it's hard to know how your app is really performing. JVM metrics give you a window into the heart of the runtime: how much memory you're using, how often garbage collection is kicking in, how many threads are active, and where potential bottlenecks might be hiding.

What DevOps Teams Should Know About Institutional Crypto Trading Infrastructure

As digital assets become increasingly mainstream, the demand for robust infrastructure to support crypto operations is growing rapidly. While much of the public discourse around crypto focuses on price movements and consumer trading, there's a quieter transformation happening behind the scenes, especially in how institutional players engage with crypto markets. For DevOps professionals working in fintech, finance, and cloud infrastructure, this shift carries significant implications.

Set App Time Limits for Kids (No Parent's End Install!) | AirDroid Parental Control Web Guide

Key Features Covered: Easy Setup – Pair devices remotely and customize schedules in minutes. Block Apps/Categories during specific hours or set daily screen time limits. Instant Notifications when your child requests extra usage time, approve/deny effortlessly. Say goodbye to digital addiction and hello to balanced screen time! Perfect for busy parents who want to protect their kids while respecting their independence.

How we implemented a release/promotion workflow with a single approval, using Kosli

A feature we often get asked about at Kosli is whether we can help support a release/promotion workflow: a workflow that deploys a known set of Artifacts from one runtime environment (e.g. beta/staging) into another runtime environment (e.g. production), typically in parallel. The simple answer is we can help, and in this blog we show the release workflow in the Kosli cyber-dojo demo project (an open sourced application for practising TDD from your browser).

2025 OneDrive Licensing Changes

Microsoft recently announced significant changes to its OneDrive licensing and storage policies, affecting organizations that heavily rely on cloud storage solutions. Starting January 27, 2025, unlicensed OneDrive accounts—those without assigned user licenses—will be automatically archived after 93 days, rendering them inaccessible unless covered by retention policies or legal holds.

Q&A with Waseem Aslam, Pulsant Data Centre Manager

With ten years at Pulsant, Waseem Aslam, Data Centre Manager, has climbed the ranks from engineer to leader, embracing challenges and innovation along the way. In this Q&A, he shares insights into his journey, the supportive Pulsant culture that kept him motivated, and the opportunities that have shaped his career.

Simple Talks Podcast | S2, Episode 5 - Coffee chat with Kellyn Gorman

In this Coffee Chat episode, Louis sits down with Redgate Advocate Kellyn Gorman. Kellyn tells her origin story of becoming a leader in the data community, which is as inspiring as it is interesting, and the conversation also covers security (and the inherent trust issues data professionals have!), fact-checking ourselves, aging DBAs, Lego obsession, and Raspberry Pi. Also, ever wondered why Kellyn’s social media handle is dbakevlar? Listen in to find out!

Kosli Raises $10 Million Series A led by Deutsche Bank and Heavybit to Transform Software Delivery Governance.

We are delighted to announce that Kosli has raised $10 million in Series A funding. The round was led by Deutsche Bank’s Corporate Venture Capital (CVC) group, with participation from Heavybit, Defined Capital, Transpose Platform, and a number of angel investors. Alongside this funding milestone we are launching Kosli Enterprise, a new offering designed to meet the complex governance and compliance needs of large financial institutions.

How Telco Automation Tools Expedite Service Delivery While Simplifying Network Operations

Telco automation tools are some of the most powerful instruments that telecom providers can harness. Automation empowers telcos to streamline their network operations centers (NOC), reduce costs, achieve a defragmented view of their digital ecosystem, and ultimately improve service delivery for customers. Though process automation and auto-remediation are unquestionably powerful, it’s natural for network teams to wonder which telco automation tools might be best for their specific needs and niches.

How Xero Scaled Engineering Excellence with Cortex

Xero is a global technology company offering cloud-based accounting solutions designed specifically for small businesses. With engineering teams spread across Auckland, Wellington, Melbourne, and Toronto, managing clear ownership, consistency, and effective collaboration presented significant operational challenges.

Priority-Based Escalation Policies: Because Not All Notifications Burn the Same

Let's face it – not all notifications are created equal. That paper cut of a CSS bug probably doesn't need the same response as your production database doing its best impression of a black hole. Today, we're thrilled to announce Priority-Based Escalation Policies, a powerful new way to ensure your team's response matches the notification severity.

Proactive Monitoring: How Engineers Use CloudWatch to Save Customers Money

At MetricFire, we love talking with engineers about their tech stacks, SRE challenges, and how they approach infrastructure monitoring. Recently, we had a great chat with Yoimer Roman from a Latin American cloud consulting company that helps clients make smarter business decisions by leveraging AWS CloudWatch monitoring. Yoimer wears many hats: mentoring his team on all things AWS, designing custom cloud environments, and bridging the gap between technical challenges and non-technical stakeholders.

Accelerating AI with open source machine learning infrastructure

The landscape of artificial intelligence is rapidly evolving, demanding robust and scalable infrastructure. To meet these challenges, we’ve developed a comprehensive reference architecture (RA) that leverages the power of open-source tools and cutting-edge hardware.

Anbox Cloud 1.25.0: new features, improvements and updates

In this video, the Anbox team covers new features and changes in the Anbox Cloud 1.25.0 release. What is Anbox Cloud? Anbox Cloud lets you run virtualized Android environments securely, at any scale, to any device, letting you focus on your use case. Run Android in system containers, not emulators, on AWS, OCI, Azure, GCP or your private cloud with ultra low streaming latency.

Demo Roundups! Zero Trust Security + Runbook Automation

The shift to zero trust security requires a model that is identity-based, centrally managed, widely encrypted, and always authenticated and authorized. PagerDuty Runbook Automation enables users to automate, orchestrate, and accelerate issue resolution with best practice security guardrails, reducing human error and saving time. Host: Sid Verma (Senior Developer Advocate at PagerDuty) Guests: Christopher Hills (Chief Security Strategist at BeyondTrust); Jake Cohen (Senior Product Manager at PagerDuty)

Tech Debt as Innovation? How Netflix Turns It Into Opportunity

At Civo Navigate San Francisco 2025, Lisa Smith from Netflix shares a fresh perspective on how tech debt can drive innovation instead of slowing teams down. Learn how to staff legacy systems, handle tricky deprecations, and evaluate the “tech debtiness” of your infrastructure to unlock growth and efficiency. Discover how to turn tech debt into a strategic advantage for your engineering team.

Cloud migration security: Risks, strategies, and best practices

Whether you’re migrating from on-premises to the cloud, between cloud providers, or to more advanced cloud architectures, each path shares common security challenges that must be addressed head-on. With the right approach, you can actually enhance your security posture during migration. In this article, we'll dig into practical approaches to cloud migration security, covering everything from initial planning to post-migration maintenance.

#038 - Kubernetes Supercharging Particle Physics with Ricardo Rocha (CERN)

Ricardo from CERN, who leads the platform infrastructure teams, discusses CERN's significant role in particle physics research with the Large Hadron Collider. The conversation covers how CERN manages the massive amounts of data generated from experiments using a worldwide computing grid. Ricardo shares CERN's journey with adopting Kubernetes for various applications, including critical systems controlling detectors and accelerators. He also touches upon CERN's involvement with the CNCF and the Kubernetes community.

Introducing Charmed PostgreSQL

PostgreSQL, a proven and well-loved database trusted by IT sectors for over three decades, continues to evolve with modern enterprise needs. In this video, we introduce Charmed PostgreSQL, an advanced enterprise solution designed to secure and automate the deployment, maintenance, and upgrades of PostgreSQL databases across private and public clouds. Watch the video to explore more about Charmed PostgreSQL, its features and advantages: security and compliance features, support and managed services, automation features, deployment options, and pricing.

SUSE Rancher Prime Meets Cluster API: What You Need to Know

Kubernetes has revolutionized how we deploy and manage applications, but juggling clusters across clouds and on-premises environments can quickly become a tangled mess. Different tools, inconsistent configurations, and manual processes drain your team’s time and energy. What if there was a way to simplify Kubernetes cluster management, bringing order to the chaos? Enter Cluster API (CAPI) and SUSE Rancher Prime.

Your Observability Questions, Answered

Monitoring used to be simple—set up some dashboards, configure alerts, and call it a day. But with microservices and cloud-native systems, things aren’t so straightforward anymore. Keeping track of everything can feel like an endless game of whack-a-mole. That’s where observability comes in. If you’re just getting started or looking to refine your approach, this guide answers the most common (and important) questions.

Conan Launches C/C++ Audit Functionality

Conan is a leading software package manager for C/C++ development environments. As an open source multi-platform package manager, it is used to create, manage and share native binaries and their dependencies based on C/C++ code. C/C++ is often the preferred language for developing embedded systems, mobile platforms, and real-time applications due to its low-level control, high performance, and direct memory management capabilities.

Comparing Major SD-WAN Vendors

In the face of increasingly complex, far-reaching network architectures, organizations continue to adopt Software-Defined Wide Area Network (SD-WAN) solutions to enhance network performance, security, and agility. Megaport Virtual Edge (MVE) offers a versatile platform that integrates with leading SD-WAN vendors, enabling businesses to deploy their virtual network functions seamlessly.

DSC and STP Update

For over 15 years, Ribbon Communications and Cellusys have worked together to drive innovation in signaling solutions. Today, we’re excited to announce the next phase of our collaboration: Cellusys and Ribbon will be combining the strength of the Cellusys Signaling Firewall and Roaming platforms with the Ribbon DSC and STP through the purchase by Cellusys of a license of the source code to the Ribbon STP and DSC products.

Announcing HAProxy ALOHA 17.0

HAProxy ALOHA 17.0 is now available, delivering powerful new features that improve UDP load balancing, simplify network management, and enhance performance. With this release, we’re introducing the new UDP Module and extending network management to the Data Plane API, a new API-based approach to network configuration. The Network Management CLI is enhanced with exit status codes and contextual help.

Rethinking WhatsApp Alerts - A Data-Driven Approach

WhatsApp has become a major alerting channel for incident response teams. It's popular and for many, a great alternative to SMS. In our 2024 recap, we mentioned how Spike sent over 25,000 alerts on WhatsApp. It is now the 2nd most used alert channel for responders on Spike (rising from 4th spot in 2023). But... I will be the first one to admit – the WhatsApp alerts experience needed work to help responders react to incidents quicker!

Proactive Monitoring: How DinoCloud Uses CloudWatch to Save Clients Money

At MetricFire, we love talking with engineers about their tech stacks, SRE challenges, and how they approach infrastructure monitoring. Recently, we had a great chat with Yoimer Roman from DinoCloud, a Latin American company that helps clients make smarter business decisions by leveraging AWS CloudWatch monitoring. Yoimer wears many hats: mentoring his team on all things AWS, designing custom cloud environments, and bridging the gap between technical challenges and non-technical stakeholders.

Using CircleCI to test and deploy Python serverless functions on Microsoft Azure

Serverless computing simplifies app development by abstracting away server management. Azure Functions provides a robust platform for event-driven, on-demand code execution. In this tutorial, we’ll create and deploy a Python-based Azure Function (one that parses incoming JSON) using CircleCI. For more granular, programmatic access to Azure resources, we’ll use a service principal for secure authentication and the Azure CLI orb to streamline our CI/CD pipeline.

Unlocking Edge AI: a collaborative reference architecture with NVIDIA

The world of edge AI is rapidly transforming how devices and data centers work together. Imagine healthcare tools powered by AI, or self-driving vehicles making real-time decisions. These advancements rely on bringing AI directly to edge devices. However, building a robust architecture for diverse edge environments presents significant hurdles. This blog introduces our new reference architecture, designed to simplify edge AI deployment.

Building optimized LLM chatbots with Canonical and NVIDIA

The landscape of generative AI is rapidly evolving, and building robust, scalable large language model (LLM) applications is becoming a critical need for many organizations. Canonical, in collaboration with NVIDIA, is excited to introduce a reference architecture designed to streamline and optimize the creation of powerful LLM chatbots. This solution leverages the latest NVIDIA AI technology, offering a production-ready AI pipeline built on Kubernetes.

How to cut IT complexity, stop outages, and scale without the stress

In today’s fast-paced digital world, IT managers face mounting challenges in keeping their teams efficient, their infrastructure scalable, and their applications running without disruption. Traditional hosting providers often exacerbate these challenges by operating as black boxes—offering little transparency and leaving IT teams scrambling when something goes wrong. At Upsun, we believe IT leaders should demand more from their infrastructure solutions.

PagerDuty Setup: From Beginner to Pro in 10 Steps

This comprehensive guide walks you through the complete PagerDuty setup process, organized into 10 steps. We've structured the guide to match your team's growth journey—starting with essential configurations for small teams, advancing to robust solutions for growing teams, and wrapping up with enterprise-grade features for large organizations. By the end, you'll have a fully operational incident management system set up on PagerDuty tailored to your specific needs.

Observability Reimagined: How AI is Transforming Monitoring

Observability needs to evolve. With AI reshaping IT monitoring, how can businesses leverage predictive analysis, AI-driven monitoring, and auto-remediation workflows to create more resilient infrastructures? At Civo Navigate San Francisco 2025, Jemiah Sius of New Relic explores how AI is transforming observability, shifting from reactive responses to proactive, intelligent solutions.

How to Prove Your Network Operation Center (NOC)'s Effectiveness

If you’re a telecom provider, you already know that the network operations center (NOC) is integral to service delivery, maintaining uptime, and continuous optimization. These and other vital functions are what empower you to provide seamless service for your customers and stay one step ahead of your competitors. Your team knows all of this already, but how do you demonstrate the effectiveness of your NOC to external stakeholders and leaders?

Service Mocks #speedscale #service #mocks #software

Ken Ahrens from Speedscale breaks down the challenges with service mocks, from the complexity of building them to common misunderstandings about their purpose. Many think of mocks like unit test stubs, but service mocks are much more powerful. They can simulate production conditions right on your laptop, making testing more realistic and reliable.

AI Integration #speedscale #ai #integration #mcp #march

Ken Ahrens from Speedscale dives into the best AI API integration model of March 2025: Anthropic's MCP model. This innovative integration enables seamless communication with browsers and various tools, including the popular Cursor. Discover how the MCP model is revolutionizing AI-powered workflows and boosting productivity.

Breaking Down Silos: Why Security and SRE Teams Need a Unified Platform for Reliability and Risk Management

Security and Site Reliability Engineering (SRE) teams often operate as separate entities within organizations despite sharing similar goals: keeping systems secure, reliable, and performant. Security teams focus on protecting systems from threats and ensuring compliance with regulatory frameworks. SRE teams concentrate on system reliability, performance optimization, and incident management.

Log File Analysis: A Guide for DevOps Engineers

Ever found yourself buried in endless log files, trying to piece together what went wrong? For DevOps engineers, log analysis isn’t just about debugging—it’s a crucial skill for maintaining reliable systems and catching issues before they escalate. In this guide, we’ll cover everything you need to know about log file analysis, from the fundamentals to the best tools available today.

OpenTelemetry Backends: A Practical Implementation Guide

If you’ve ever found yourself sifting through logs, metrics, and traces without a clear answer to why your app crashed at 2 AM, you’re not alone. Troubleshooting without the right tools can feel like chasing shadows. That’s where the right OpenTelemetry backend makes all the difference—bringing everything together and turning scattered data into a clear picture.

Website Logging: Everything You Need to Get Started

If you're new to DevOps, you’ve likely noticed that website logging plays a bigger role than it seems at first. It’s not just a routine task—it’s how you keep systems stable, troubleshoot issues, and understand what’s happening under the hood. A good logging setup captures what went wrong, when, and why—helping you fix problems faster instead of guessing.

AI-powered robotics in action: Canonical & @qualcomm #ubuntu #opensource #iot #tech #shorts

At Canonical, we are working with Qualcomm to bring the power of Ubuntu to the latest robotics and systems: the RB3 Gen2 Robotics Vision Kit, running AI inference for body part detection, and Qualcomm’s new IQ9, a high-performance system delivering 100 TOPS DSP performance. By enabling Ubuntu on these systems, we simplify development and deployment, making it easier for engineers and researchers to optimize their applications across multiple hardware platforms.

Uplink | Episode 1: The Future of AI Consumption with Chris Sharp, CTO of Digital Realty

Welcome to Uplink, the podcast where digital infrastructure leaders reveal the underlying technology powering AI and cloud innovation. In this episode, our host Michael Reid sits down with Chris Sharp, CTO of Digital Realty, to discuss how enterprises are consuming AI today, the role of private interconnection, and what the future of AI workloads looks like.

Federation Done Right: Cycle's LowOps Approach

Federation allows for distributing control and services across not just multiple regions, but multiple providers and environments as well. This is a critical capability for today's multi-cloud and bare metal deployments, and the idea has gained momentum for several practical reasons such as compliance, resilience, and latency. Now, more than ever, teams are expected to support multi-cloud deployments, navigate regional compliance requirements, and deliver those low latency experiences to users globally.

Automating Flyway Desktop Development using the Flyway CLI | The Tony and Tonie Show

In this episode, Tony and Tonie discuss Tonie's article for developers who want to learn how to use the Flyway CLI to automate the database development workflow used in the Flyway Desktop GUI. Tonie talks about capturing the schema changes made to a local development database, and then using schema comparison to auto-generate and validate a Flyway versioned migration script.

AI Costs In 2025: A Guide To Pricing, Implementation, And Mistakes To Avoid

AI costs haven’t been a major factor in cloud computing — until now. For example, AI demands massive data processing and storage, such as for training Large Language Models (LLMs) and generative AI. Additionally, AI workloads require parallel processing, which traditional instances struggle to handle — forcing companies to invest in specialized (and expensive) GPUs to get the job done.

Why observability is crucial for your Kubernetes deployments: A fireside chat with ManageEngine and DevOps Toolkit

Kubernetes is at the heart of modern cloud-native applications, but achieving effective observability is no easy feat. Managing workloads, ensuring performance efficiency, and keeping costs under control demand the right strategies and tools. If you’re grappling with Kubernetes complexity, struggling with monitoring blind spots, or seeking to optimize your deployments, we have the perfect event for you.

Automating CSS code quality in front-end projects with Stylelint and CircleCI

Cascading Style Sheets (CSS) is the language used by developers to apply styles to documents written in a markup language. In front-end development, enforcing consistent CSS code quality is crucial: poorly written CSS can lead to issues ranging from poor maintainability to unexpected bugs and inconsistent designs. One effective way to ensure CSS code quality is using a linter such as Stylelint.

Sponsored Post

What Are Cloud Development Environments?

Especially if you have a globally distributed team, CDEs give you a smoother developer experience simply by being online. Instead of wrestling with conflicting dependencies, struggling with inconsistent local setups, or waiting for your code to compile, you have a powerful, instantly accessible development environment in the cloud. CDEs remove typical limitations like hardware and scalability. You can get started quickly with minimal setup and configuration, and move forward confidently thanks to the flexibility and customization features CDEs provide.

The Rise of BYOAI: How Shadow AI is Reshaping the Workplace and the Security Risks You Can't Ignore

The Tech Show 2025, held on March 12-13, was a testament to the rapid integration of artificial intelligence (AI) across various vendors. A significant number of companies showcased their latest AI advancements, underscoring the technology’s pivotal role in shaping the future. From startups to established tech giants, exhibitors demonstrated AI’s transformative potential.

LightMesh + Azure: Real-Time IPAM and VNet Visibility in Minutes

As Azure environments scale, network complexity multiplies. Expanding subscriptions create visibility challenges that traditional approaches can’t handle. LightMesh simplifies cloud network management, delivering the comprehensive visibility and control teams need to navigate complexity with confidence.

Full-Stack Observability: What It Is [Minus the Fluff]

You've heard the term thrown around in meetups and Slack channels, but what exactly is full-stack observability? Simply put, it means you can see, understand, and quickly act on everything happening across your entire tech stack, from frontend user interactions to backend services, cloud infrastructure, and third-party integrations. Full-stack observability isn't just another tech buzzword. It's the difference between being blindsided by outages and catching issues before your users tweet about them.

Distributed Tracing: An Advanced Guide for DevOps & SREs

In the microservices world, tracking down performance issues feels like solving a mystery with pieces scattered across dozens of systems. When users report slowness, your team needs answers fast, not hours of guesswork. Distributed tracing has emerged as the solution, but implementing it effectively requires more than just understanding the basics. This guide takes you beyond the fundamentals to show you how DevOps teams and SREs can build truly effective tracing strategies.

AWS EFS Pricing Guide: Manage Your Storage Costs Effectively

Amazon Web Services (AWS) offers a suite of cloud storage services. Among the most widely adopted are Amazon S3 and Amazon EBS, known for their robust scalability, performance, and flexibility for a wide range of workloads. However, AWS also offers Amazon Elastic File System (EFS). This is a serverless file storage service for workloads that need shared, scalable storage for use with AWS services and on-premises resources.

10 Cloud Provisioning Tools To Drive Infrastructure Innovation

Cloud provisioning involves defining, setting up, and allocating cloud resources — such as compute power, networking, and storage — so they’re ready for use in your organization. Provisioning used to be slow and error-prone. Not anymore; it’s a streamlined, hands-off process now. But it doesn’t happen on its own. You need the right cloud provisioning tools to automate and optimize the process.

AI & ML Experts Reveal the Future - What's Next for Innovation?

Where is AI heading next? In this panel from Civo Navigate San Francisco 2025, leading AI & ML experts explore the latest advancements, challenges, and opportunities shaping the future of artificial intelligence. Join Josh Mesout (Civo), Jimil Patel (Intuit), Nami Baral (Niural), Tristian Cormier (State of California), and Gaurav Bharaj (Reality Defender) as they discuss neural networks, responsible AI governance, real-world applications, and the future of human-machine collaboration.

systemctl: The Complete Guide to Managing Linux Services

Ever found yourself staring at your terminal, wondering why a service won’t start? systemctl is the backbone of modern Linux service management, but if you’re new to it, it can feel overwhelming. This guide breaks it down—covering essential commands and advanced techniques in a clear, practical way. No unnecessary jargon, just the know-how you need to manage services with confidence.

Syslog Servers Explained: How They Help with Logging

Your team lead just dropped, "We need to set up a syslog server," and now you're wondering what you've signed up for. Syslog servers aren’t just another checkbox in your infrastructure; they’re the quiet workhorses that keep logs organized and accessible. When things go wrong, they help you connect the dots faster. Imagine this: It’s 3 AM, and alerts are flooding in. Your authentication service is failing, but the logs on that server show nothing unusual.

European Space Agency: Modernizing data and IT infrastructure to advance space exploration with AI

The European Space Agency (ESA) leads some of the most exciting and impactful space missions, from the Billion Star Survey to Euclid’s exploration of dark matter and energy. None of these missions would be possible without infrastructure. To expand its space mission capacity, ESA embarked on a journey to modernise its IT infrastructure and adopt an integrated set of data management capabilities. Using this new infrastructure will enable the agency to reach new frontiers and adopt new AI-driven technologies, such as anomaly detection and long-term satellite health forecasting.

A Northern Ren-AI-ssance

As the UK's most geographically diverse digital infrastructure provider, Pulsant champions regional thinking. Every day, there’s a push for technological innovation to go beyond the M25 and drive the brightest businesses nationwide. This has led to our focus on the Northern Powerhouse. We have invested extensively in data centres across Manchester, Rotherham and Newcastle.

The AI Revolution is Here - Are You Ready for the Hidden Threats?

In a recent webinar, Gartner unveiled its Top 10 Strategic Technology Trends for 2025, which all focus on the concept of ‘Responsible Innovation’. They break this down across three pivotal themes: AI Imperatives and Risks, New Frontiers of Computing, and Human-Machine Synergy.

Boost Magento performance: scale, optimize, and speed up your ecommerce store

Looking to maximize the performance of your Magento or Adobe Commerce store? In this deep-dive session, Platform.sh experts Jared Wright and Jerry’s Ida share best practices for optimizing infrastructure, caching strategies, database tuning, and deployment processes. Learn how to scale for high-traffic events like Black Friday; discover Magento-specific optimizations for caching, Redis, and MySQL; reduce downtime with efficient deployment strategies; and use tools like Blackfire to track and improve performance.

FinOps As A Service: How To Do Cloud Finance For Smarter Spending

We’ve covered the fundamentals of FinOps in several guides on this blog, including FinOps 101, the FinOps maturity journey (as outlined by the FinOps Foundation), and more. But if you need a quick refresher, no worries. We’ve linked key guides on what FinOps is, why it matters, and how to make it work for you in “Related reads” in the next section.

Fix IT Incidents Faster with AI | Meet Edwin AI: The First Agentic AI for ITOps

Tired of drowning in IT alerts? Struggling to find the root cause of incidents? Edwin AI is here to help. Edwin AI is the first agentic AI built for IT teams, designed to cut through the noise, speed up resolutions, and prevent outages. It cuts alert noise by 90% (less clutter, more focus), fixes issues 60% faster (AI-powered insights and recommendations), and boosts team productivity by 20% (automates tasks and escalations).

3 Popular Methods to Shut Down or Reboot a Remote Computer

Managing IT systems in interconnected environments often requires shutting down or rebooting remote computers for several reasons. For instance, you might want to reboot the computer to troubleshoot errors and address software updates. Or you might shut it down as part of your security protocols. In this post, you’ll learn three popular methods for rebooting or shutting down remote computers. We’ll also cover some additional considerations, including potential issues and how to solve them.

Top 5 Things to Keep in Mind When Choosing an Agentic AI Based DevOps Copilot

Looking to enhance automation, optimize your CI/CD pipeline, and improve infrastructure management all at the same time? Look no further than an agentic AI-based copilot. But choosing the right one requires a thoughtful, case-driven approach. From managing the costs associated with using large AI models to ensuring your copilot of choice actually delivers factual and reliable insights, it’s important to have all the information you need.

How Engineering Leaders Can Supercharge Developer Productivity

How do you boost developer productivity without burning out your team? At Civo Navigate San Francisco 2025, industry leaders discuss how engineering teams can improve workflows, ship quality code faster, and scale developer productivity. Join Benjie De Groot (Shipyard), Nathen Harvey (Google), Irina Nazarova (Evil Martians) and Solomon Hykes (Dagger, Docker) as they explore DORA metrics, DevEx, shift-left strategies, and tooling for high-impact engineering teams.

AI at the Edge with #ubuntu & @RenesasPresents ! #Ubuntu #Renesas #ew25 #embeddedworld #iot

At Embedded World 2025, we demonstrated how Ubuntu enables secure, scalable, and high-performance edge AI solutions. With Secure Boot, full disk encryption, and inference running directly on a CPU in kiosk mode, this setup showcases the future of embedded intelligence. How do you see AI at the edge transforming industries?

dbForge Tools for Oracle 6.0: A Myriad New Options and Enhancements to Make Your Daily Work Faster and Easier

We’ve got another big update coming your way. This time it’s all about dbForge tools for Oracle, which have received a huge number of new options, functional enhancements, and subtle design tweaks, all of which will work together to make your daily work with Oracle Database exceptionally easy and convenient. Without further ado, let’s see what we’ve got for you today.

Expand Your Connectivity With New ODBC Drivers for Dynamics 365 Business Central, Trello, and ClickUp

Our team is excited to announce the release of three new Devart ODBC Drivers that provide easy access to data stored in Dynamics 365 Business Central, Trello, and ClickUp using the ODBC standard. These Drivers are robust solutions that allow developers, analysts, and business intelligence professionals to connect to live data on cloud platforms and databases directly from any ODBC-compliant application.

Azure AI Agent Services in Azure AI Foundry

This podcast features Alex Pshul, Microsoft MVP, discussing Azure AI agents, a new technology related to Microsoft's Azure platform. The conversation covers various aspects of Azure AI, including the development and functionality of AI agents, their applications, and the underlying technologies involved. Time stamps: Introduction of Guests (00:00:00), Overview of Azure AI Agents (00:01:10), Creating AI Agents (00:03:00), Functionality and Use Cases of AI Agents (00:04:00), Future of AI Agents (00:05:00), Demo and Code Examples (00:06:00), Cautions and Recommendations (00:37:00).

How to Set Up Logging in Node.js (Without Overthinking It)

Logging in Node.js might not be the most exciting part of development, but it’s one of the most important. Whether you're troubleshooting bugs or keeping track of how your app is running, good logs make life easier. Let’s break down how to set up logging the right way.

Why your IT fails at 3 AM (and how to fix it)

In this insightful webinar, Corey Dockendorf and Michael Riley delve into the critical challenges facing IT leaders today. They tackle three main issues: the lack of visibility with traditional hosting providers, scaling infrastructure efficiently without increasing complexity, and keeping IT teams focused, productive, and well-rested. Learn how modern cloud application platforms and PaaS solutions can transform your operations, enhance transparency, and optimize performance. Discover practical insights, real-life examples, and strategies to improve your IT workflow and leverage smarter solutions for better business outcomes.

Opsgenie is shutting down. Here's what that means, and how incident.io can help

Atlassian recently announced they’ll be shutting down Opsgenie, their popular on-call alerting tool. After June 4, 2025, no new Opsgenie accounts will be created, and by April 5, 2027, the service will shut down completely. Users don’t seem happy about it. If you’re currently using Opsgenie, this news is significant. A key part of your incident response process is disappearing, and Atlassian suggests moving to their other products, like Jira Service Management or Compass.

A seven-step framework for running incident debriefs

Ever wrapped up an incident, thought 'Phew, glad that’s over,' only to feel your stomach drop when you see the dreaded "Incident Debrief" on your calendar? We've all been there. Incident debriefs don't need to feel like sitting through your least favorite school subject. They can (and should!) actually be engaging and useful. At incident.io, we've found a simple, repeatable, and blameless framework.

Is Cloud Still King? The Shifting Landscape of Infrastructure

Believe it or not, we are in the middle of one of the biggest cloud repatriation movements of the past decade. More than ever, companies are rushing to find infrastructure solutions that better suit their needs. Over the past decade, hyperscalers have dominated the market, generating trust and, in some cases, overconfidence in software development. Drawn in by promises of reliability, ease of use, and ultimate flexibility, teams turned to providers like AWS, GCP, and Azure.

Effortless observability for Django applications

Observability is critical for web operations to ensure that the application is working as expected and to identify any potential issues. However, setting up observability has traditionally been challenging because it can take hours to set up all the infrastructure, instrument your code and enable observability in production. But now there is a better way using native support for Django in Charmcraft and Rockcraft which has observability built in and ready to go!

Why Monitoring iManage is Critical for Enhancing End-User Experience in Legal Firms

As a Performance Field Technical Consultant working with customers in the legal industry, my primary focus is to ensure that technology enhances productivity rather than hinders it. Legal professionals rely on iManage as a business-critical application for document management, collaboration, and compliance. However, with the increasing shift to the cloud and integration with platforms like O365, ensuring a seamless user experience has become more complex.

How to keep track of what's running in your Gremlin team

Part of the Gremlin Office Hours series: A monthly deep dive with Gremlin experts. Reliability testing is ongoing, and tracking that work can be difficult in large organizations. According to our own product metrics, teams run an average of 200 to 500 tests each day! With so much happening, it’s hard to keep track of everything going on, unless you use Gremlin.

Scientific Incident Management with Dan Slimmon

Dan Slimmon is an incident management veteran who's worked at Etsy, HashiCorp, and now leads consulting and training on pragmatic, non-bureaucratic incident response. In this episode, Dan shares his philosophy on "scientific incident response," the importance of hypothesis-driven troubleshooting, and why incidents should be seen as normal in complex systems.

Shaping a Greener Future: Tech Leaders on Energy & Carbon Impact

The tech industry must innovate while cutting carbon emissions and energy use. At Civo Navigate San Francisco 2025, experts in AI, cloud, and sustainability tackle the challenges of energy-hungry technologies like GPUs, software efficiency, and industry-wide solutions. Join Simon Hansford (Civo), Hui Wen Chan (Crusoe), Suleiman Mirzad (Slalom), Dr. Thomas McDonald (Orbital Materials), and Dinesh Majrekar (Civo) as they explore how tech can drive a greener future.

Preview features and upcoming changes in the next major version of Redgate Flyway

We’re excited to announce that users can opt in to or out of preview features that are under development directly in Flyway Desktop. There are currently three preview features that will be in the next major version of Flyway Desktop: a new connection dialog with options to provision databases, a full schema model view, and support for static data in PostgreSQL databases.

Puppet's Strong Performance in The Forrester Wave: Infrastructure Automation Platforms, Q4 2024 & What It Means for Your Enterprise

Technology's moving incredibly quickly. Automation has shifted from being an optional benefit to a core requirement for businesses aiming to optimize operations, increase efficiency, strengthen security, and maintain regulatory compliance across international markets. At the center of this change is infrastructure automation. To understand which vendors are leading in this area, resources like The Forrester Wave provide valuable comparative analysis.

Announcing Public Preview: Static Data for PostgreSQL Database Using Flyway

We are thrilled to announce the public preview of our new feature: Static Data for PostgreSQL Database using Flyway. This feature is designed to enhance your database management experience by allowing you to manage static data alongside your schema seamlessly for versioning and deploying your database changes.

Essential Prometheus Queries: Simple to Advanced

Monitoring your infrastructure doesn't have to be a headache. With Prometheus, you've got a powerful ally in your corner—but like any tool, knowing how to use it makes all the difference. Let's cut through the noise and get straight to the good stuff: practical Prometheus query examples that extract exactly the insights you need when you need them most.

IMAP API for Developers

In contemporary digital infrastructures, seamless email integration is a fundamental yet technically demanding requirement for researchers and developers. IMAP facilitates structured email retrieval and management while maintaining synchronization across multiple devices. However, native IMAP implementation presents several challenges, including session persistence, authentication security, and compliance with regulatory standards.

Best MySQL GUI Tools for Linux

Linux has a reputation as an operating system for programmers. So, if you are a software developer who designs MySQL-based solutions, chances are high that you will do it on Linux. Thus, it would be great to have a MySQL IDE for Linux to simplify the work. But the question arises: is there an appropriate Linux DB tool for MySQL with a GUI?

Modernize Test Data Management with Traffic Replay

In software testing or platform engineering, having realistic data is crucial. For years, teams have relied on Test Data Management (TDM) to copy entire production databases, scrub any sensitive information, and then spin up test environments from these sanitized data sets. While TDM gets the job done, it can be costly, time-consuming, and can quickly become outdated.

What is agentic AI? The role of AI agents in DevOps automation

Agentic AI represents the next evolution in artificial intelligence systems, characterized by autonomous software entities that can independently pursue goals, make decisions, and take actions with minimal human supervision. Unlike traditional AI models that respond only to specific prompts, AI agents actively observe their environment, learn from feedback, and execute complex sequences of tasks to achieve defined objectives.

Prometheus Port Configuration: A Detailed Guide

Setting up Prometheus should be straightforward, but when metrics stop flowing, it’s usually something simple—like a port issue. Misconfigure it, and suddenly, your whole monitoring setup feels like a guessing game. This guide breaks down how to configure Prometheus ports properly, whether you're sticking to defaults or need a custom setup.

Syslog Monitoring: A Guide to Log Management and Analysis

Relying on syslogs to debug issues at odd hours? It happens to the best of us. A solid syslog setup isn’t just about collecting logs—it’s about making them useful. This guide walks through setting up syslog, configuring it for better visibility, and using monitoring techniques that actually help when things go wrong. No fluff, just practical steps you can use right away.

Performance Impact of High Cardinality in Time-Series DBs

Time-series databases have become the backbone of modern observability, financial analytics, and IoT systems. But there's a common challenge that can bring even the most robust systems to their knees: high cardinality. When your database starts tracking millions of unique values across various dimensions, performance doesn't just dip—it can collapse entirely. Let's understand the technical details of what happens when cardinality spikes and how you can architect your systems to handle it.

EMEA Rundeck by PagerDuty Meetup - March 2025

Join us for an informal 1-hour virtual event where the open-source Rundeck by PagerDuty community comes together to share automation stories and use cases. Whether you're new to Rundeck or looking to elevate your automation game, this meetup is packed with valuable takeaways for everyone! CERN Orchestrates with Rundeck.

GitKraken Desktop 101 Ep 30: Use the On-Prem Launchpad | Your Self-Hosted Workflow Hub

Your Git workflow, all in one place and fully on-prem. Launchpad in GitKraken On-Premise gives you a focused view of your Git activity without relying on the cloud. See what needs your attention and jump straight into action. With Launchpad, you can.

How Netflix Engineers Launch High-Stakes Products to Millions

Launching a high-impact engineering feature that reaches millions of users and drives massive revenue is no easy feat. In this talk from Civo Navigate San Francisco 2025, Ramneet Bhatia, Senior Software Engineer at Netflix, shares the strategies, challenges, and key lessons from leading the Paid Sharing launch at Netflix.

Among the waves: Plucky Puffin

Not to be confused with the titular Linux mascot and seabird cousin, the penguin, puffins are another distinctively colorful and whimsical nautical avian. Over the centuries, these beaky birds have been given numerous nicknames and monikers like the “sea parrot” and the harlequinesque appearance of their facial feathers has earned them the distinction of “Clown of the Sea”.

Welcome to The Fire Academy: Learn FireHydrant, Your Way

Getting started with any new platform can feel like a lot. We get it. That’s why we built The Fire Academy — our new Customer Learning Platform that makes getting started on FireHydrant as seamless as possible. Our goal is simple: we want you to feel confident customizing and configuring FireHydrant to fit your needs without having to dig for answers or wait for support. Everything you need is at your fingertips, so you can work at your own pace and get the most out of the platform.

How does AI-powered computing keep a ball balanced? #canonical #intel #embeddedworld #shorts

At Embedded World 2025 at Canonical's booth 4-160, Intel is showcasing an amazing balancing ball demo, powered by. It demonstrates how the power of Intel’s cutting-edge processors, combined with the flexibility and reliability of Ubuntu, handles complex physics in milliseconds, ensuring smooth, dynamic balance with precision. A perfect example of how real-time control can be applied to automation and industrial AI!

Incident Response: Keeping Cool When Everything's on Fire

The DevOps revolution broke down the traditional silos between development and operations, fundamentally reshaping how we build and maintain software. But with this evolution came an inevitable reality for many engineers: being on-call and responding to incidents. While critical for service reliability, the on-call experience often brings significant stress.

Announcing HAProxy Enterprise 3.1

HAProxy Enterprise 3.1 is now available! With every release, HAProxy Enterprise redefines what to expect from a software load balancer, and 3.1 is no different. With a brand new ADFSPIP Module and enhancements to the HAProxy Enterprise UDP Module, CAPTCHA Module, Global Profiling Engine, Stream Processing Offloading Engine, and Route Health Injection Module, this version improves HAProxy Enterprise's legendary performance and provides even greater flexibility and security.

Best practices for optimal infrastructure performance with Magento

Authored by Upsun Magento specialists, Stephanie Daugherty and Jared Wright. Magento is a powerful eCommerce platform that drives thousands of online stores worldwide. Its flexible, scalable, and feature-rich environment makes it a top choice for retailers across diverse industries. However, Magento’s extensive feature set and complexity can create performance bottlenecks—especially as businesses scale.

How a major retailer tested critical serverless systems with Failure Flags

Not too long ago, a customer came to us with a high-value use case. The customer, a major apparel company with retail and e-commerce applications, needed to prove that a critical service of their payment applications could failover correctly between regions in case of an outage. But there was one snag: the service was built using AWS Lambda. This meant infrastructure-focused tests would have trouble replicating the failure conditions necessary to test the failover due to Lambda’s serverless model.

Introducing Launchpad for GitKraken On-Premise: Start Your Day Informed

GitKraken On-Premise customers, we’ve got exciting news! The powerful Launchpad feature, previously available only in GitKraken Cloud, is now making its way to your self-hosted environment. This update brings a simplified, efficient, and fully local way to start your day informed—helping you stay on top of your Git activity without relying on external cloud services.

How to Mock AI APIs Using proxymock

APIs often represent the cutting edge of the technology space. This is especially true with Artificial Intelligence – as AI has evolved from speculative technology to mass adoption, it has shown up significantly in APIs as a modality and mechanism. However, as with all new technologies, using AI APIs comes with significant challenges.

Stop Merge Conflicts Before They Happen with GitKraken Desktop 10.8

Merge conflicts can derail your workflow at the worst possible moment. You’re making progress, ready to push your changes, and then suddenly—you hit a conflict. Now you have to stop, figure out what changed, track down a teammate, and resolve the issue before moving forward. It’s frustrating, time-consuming, and disrupts your momentum.

Introducing Audiences: AI That Tailors Incident Communication to Every Stakeholder

When incidents strike, clear communication is crucial — but one size doesn't fit all. Customer support needs to know what users are experiencing and possible workarounds, execs need business impact updates and timelines, and engineers need deep technical details. Manually juggling these different communication needs is time-consuming, error-prone, and frustrating when every minute counts.

The Secret Weapon for Culture Change: Product Operations Explained

At Civo Navigate San Francisco 2025, Chris Butler, Staff Product Operations Manager at GitHub, explores how product operations drives culture change and value alignment. Culture change is challenging, but product ops serves as the bridge between strategy and execution, fostering collaboration, accountability, and transparency. Drawing from his experience at GitHub, Microsoft, and Google, Chris shares insights on how organizations can sustain meaningful transformation.

How to Take Your Vulnerability Management Program to the Next Level: Automation Strategies & Tactics

A well-built vulnerability management program covers everything from detection to patching to documentation, reporting, and ongoing measurement. Taking a structured approach to vulnerability management is a differentiator for DevOps teams: The more you can automate and enforce, the less time and effort it takes to find, fix, and monitor software vulnerabilities.

How to deploy Kubeflow on Azure

Kubeflow is a cloud-native, open source machine learning operations (MLOps) platform designed for developing and deploying ML models on Kubernetes. Kubeflow helps data scientists and machine learning engineers run the entire ML lifecycle within one tool. Charmed Kubeflow is Canonical’s official distribution of Kubeflow. The key benefits of Charmed Kubeflow include security maintenance of container images, enterprise support, and further tooling integration with Spark, Feast, MLFlow, and others.

The Future of IT Operations: Why Auto-Remediation is a Game Changer

IT teams are drowning in alerts, tickets, and endless firefighting. With growing complexity across infrastructure, networks, and applications, manual incident response isn’t just inefficient—it’s unsustainable. That’s where auto-remediation steps in. Auto-remediation goes beyond detection and alerts—it takes action.

12 Best Incident Management Software for 2025

When systems fail and alerts start flooding in, having the right incident management software makes all the difference. Incident management is the process of identifying, responding to, and resolving unexpected disruptions, turning chaos into coordinated action. Whether you're upgrading your current incident management solution or starting from scratch, we've got you covered.

PHP Error Logs: The Complete Troubleshooting Guide You Need

That moment when your PHP application runs flawlessly on your local machine but crashes in production—we've all been there. The key difference between struggling with issues and resolving them efficiently often comes down to understanding PHP error logs. This guide will help you move from trial-and-error debugging to a structured approach for identifying and fixing problems faster.

Auto Instrumentation: An In-Depth Guide

Auto instrumentation might sound like something from a music studio, but it's one of the most powerful tools in a developer's arsenal for gaining visibility into applications without tedious manual code additions. If you're tired of littering your codebase with custom traces and want a more elegant solution, you're in the right place.

Getting Started with OpenTelemetry JavaScript

Have you ever watched your JavaScript app fail in production and wondered, “What just happened?” OpenTelemetry JavaScript helps answer that question by giving you a practical way to track what’s going on under the hood. Let’s walk through how it works, why it’s useful, and how to set it up without unnecessary complexity. If you've ever struggled with vague logs and slow API calls, this is for you.

Product Release Notes February 2025

Last week, the FinOps Foundation released its latest installment of the State of FinOps report. This year’s theme: Cloud+. Fortunately for all of us, Cloud+ isn’t yet another streaming service to add to an already infinite list. In the FinOps Foundation’s parlance, Cloud+ refers to all the non-public-cloud spending — e.g., SaaS, AI — that you’d want to group under the broad umbrella of “cloud spending.”

Enhancing Observability with the OTEL Framework and Virtana

In today’s rapidly evolving technological landscape, observability has become essential for supporting robust, efficient systems. According to Gartner’s report “Preparing for the Future of Observability” from September 2024, OpenTelemetry (OTEL) is emerging as the standard framework for collecting telemetry data across different application pipelines.

GitKraken Desktop Release 10.8: Conflict Prevention, Multi Cherry Pick

With GitKraken Desktop 10.8 and its Conflict Prevention, you can detect conflicts with org members early, reducing messy merges and painful rework. Plus, we’ve added highly requested features like cherry picking multiple commits to make your workflow smoother than ever!

Introducing Civo FlexCore: A New Era for Private Cloud

The cloud landscape is evolving, and at Civo, we are committed to pushing boundaries and reimagining cloud infrastructure. At our recent Civo Navigate event in San Francisco, we unveiled a significant step forward in this mission—Civo FlexCore, a game-changing private cloud solution designed to bring the simplicity and efficiency of the public cloud to on-premises environments. For too long, enterprises have struggled with the limitations and costs imposed by hyperscale cloud providers.

Integrate Checkly with Render for more reliable production environments

With Render’s announcement this week of their new webhook integrations triggered by Render events, I wanted to explore how the integration between Render and Checkly can help ensure more reliable production services for your users. Render is a cloud application platform that enables developers to deploy and scale their apps without needing to manage infrastructure.

Building a serverless GenAI API with FastAPI, AWS, and CircleCI

The advancement of AI has empowered businesses to incorporate intelligent automation into their applications. A serverless Generative AI (GenAI) API enables developers to harness cutting-edge AI models without the burden of infrastructure management. This guide walks you through building a scalable and cost-effective GenAI API using FastAPI, a high-performance Python framework with built-in async support and seamless AWS integration.

Cycle Video Walkthrough: Securing Private Network Access with Cloudflare Tunnel

Cycle was built to be a powerful, security-focused container orchestration platform that is a more user-friendly alternative to Kubernetes. One way it achieves this is by making complex yet secure networking easy to achieve. By combining Cycle's private networks, known as environments, with Cloudflare Tunnel, teams can further enhance their network security and reliability.

Marty Weiner's AI Predictions: What to Expect in the Next 2 Years

Recorded at Civo Navigate San Francisco 2025, Marty Weiner, co-founder of VerifyYou and former CTO of Reddit, delivers a shocking talk on the current state of AI and its rapid progression. From its impact on various industries to its potential effects on the economy and job market, Marty explores the exciting and terrifying aspects of AI. He also discusses the concept of AGI and its potential implications for civilization. Watch to learn more about the future of AI and what it means for humanity.

Software Defined Networking in the Healthcare Industry

The healthcare industry is in the midst of a technological overhaul, and COVID-19 isn’t (entirely) to blame. Telemedicine, AI diagnostics, and wearable devices that constantly beam patient data to doctors are no longer futuristic concepts – they’re here, and they’re reshaping how care is delivered. But beneath the buzzwords lies a fundamental truth: None of this works without a rock-solid digital backbone, like that provided by a software-defined network.

Cloud Architecture: Building Smart, Cost-Efficient Systems

Cloud computing offers many advantages over on-premises environments, including scalability, flexibility, and cost-efficiency. Yet simply using a “lift and shift” strategy (moving your application as-is from an on-premises environment to the cloud with minimal modification) can lead to several issues, such as inefficient design and bloated costs. If you design software or applications for the cloud, it is often best to consider making them cloud-native.

Top 12 Best Remote Access Software for Efficient Connectivity

Today, the workforce is more geographically dispersed than ever before. In the past, remote access was primarily used by IT teams or freelancers who needed to access specific resources from afar. For several years, remote work has been gaining traction, and the COVID-19 pandemic accelerated the adoption of remote and hybrid work environments. Now, businesses of all sizes rely on remote access software to empower employees, maintain productivity, and stay connected across various locations and time zones.

A Guide to Fixing Kafka Consumer Lag [Without Jargon]

Have you ever looked at your monitoring dashboard and wondered, "Why is my Kafka consumer lag spiking again?" It’s a common frustration. Consumer lag isn’t just an inconvenience—it’s a sign that something’s wrong with your data pipeline. When lag builds up, you're facing delayed data processing and the risk of system failures.

Retrieving All Keys in Redis: Commands & Best Practices

Need to list all the keys in your Redis database? If you're debugging an issue or just checking what's stored, retrieving all keys is a useful skill for any developer. This guide covers everything you need to know—from the basic commands to the performance implications—so you can query Redis efficiently without slowing things down.

High Cardinality Is Eating Your Storage Budget-Here's Why

Have you noticed your storage costs rising even when you're keeping an eye on them? The reason might be something easy to overlook: high cardinality data. For data engineers and developers balancing performance and costs, understanding its impact isn’t just useful—it’s key to avoiding unnecessary spending and system slowdowns.

Monitoring in Hyperconverged Infrastructures: Challenges and Solutions

I have a not-so-secret suspicion that the dream of everyone working with technology is the Enterprise computer from Star Trek. Controlling shields, communications, engines, and everything else from a single place—and with voice commands, no less. “One button to rule them all,” as Sauron might whisper. But until that utopia becomes a reality, at least we can implement a hyperconverged infrastructure (HCI) in our organization’s technology stack.

13 Cloud Migration Best Practices: 2025 guide

Cloud migration involves transferring digital assets, services, databases, applications, and IT resources from on-premises infrastructure to cloud environments. A well-executed cloud migration can reduce operational costs, improve scalability, enhance flexibility, and boost performance. Organizations can shift from capital-intensive infrastructure to more predictable pay-as-you-go expenses while scaling resources to match needs.

Enhancing security in Bitbucket: Introducing expiry for access tokens

As part of Atlassian’s ongoing investment in security, we’re introducing new controls to help administrators manage authentication tokens more effectively and securely. To reduce the risk of long-lived credentials becoming security vulnerabilities, all newly created access tokens in Bitbucket will require an expiry duration, as determined by the workspace admin.

Understanding enterprise application development

Enterprise application development is the process of designing, building, and maintaining large-scale software systems that support critical business functions. These applications are essential for managing operations, improving efficiency, and enabling digital transformation across industries. Unlike consumer applications, enterprise software must be highly scalable, secure, and capable of integrating with multiple systems to meet the complex needs of organizations.

Getting started with Appium for mobile testing

Mobile applications are becoming increasingly complex as they provide a wide range of functionalities, catering to diverse use cases across finance, health, entertainment, and other industries. Application developers need to ensure compatibility across a wide range of devices, operating systems, and screen sizes. It is challenging to ensure a bug-free user experience, as manually testing all the features across devices would require a lot of time and effort.

Paul Loots of Convergeone @Ribbon INSIGHTS

In this video, Paul Loots from ConvergeOne discusses the key initiatives for 2025 regarding the Edge series, highlighting the importance of the new 8500 model for enhancing port density and modernization of legacy systems. The conversation focuses on specific vertical markets that could benefit from these advancements, including alarm companies, educational institutions, and healthcare providers. Loots shares his excitement about the technological developments presented at the event and the value of in-person interactions for fostering professional relationships.

Daniel Acosta, Liberty Latin America @ Ribbon Insights

In a recent discussion at Ribbon Insights, Daniel Acosta of Liberty Latin America elaborated on the strategic initiatives planned for 2025, focusing on transforming the legacy network into a more modern infrastructure. Over the past two years, the team has been working towards this transformation by partnering with Ribbon to implement the new C20 A2 G6 network technology. Acosta expressed enthusiasm about the shift toward cloud solutions, particularly in Latin America, and highlighted the significance of automation and artificial intelligence (AI) in optimizing network performance.

20 Key CFO Dashboards And KPIs: What Matters Most In 2025

Your job as a Chief Financial Officer (CFO) carries a lot of weight, regardless of the size and industry of your organization. You are directly accountable for ensuring your company’s financial health by providing accurate, up-to-date, and actionable insights. If your revenue comes from recurring subscriptions, SaaS metrics form the backbone of your business. They measure your profitability and growth, support financial reporting and forecasting, and offer guidance on maximizing ROI.

Introducing the Rancher CVE Portal: Enhanced Transparency and Security for Your Rancher Workloads

At SUSE, we’re always looking for ways to make it easier for customers to maintain secure, enterprise-grade environments. The Rancher Security team is excited to announce the public beta launch of the Rancher CVE Portal, available now at scans.rancher.com. This new resource is a significant step forward in providing clear, actionable visibility into vulnerabilities affecting Rancher and its associated dependencies.

Getting started with Azure DevOps dashboards

Azure DevOps and its extensive feature set help teams plan smarter, collaborate better, and ship faster. With several integrated features such as Azure Pipelines or Azure Repos, it gives you the flexibility to use just what you need to complement your existing workflows. However, as your usage of Azure DevOps grows, you might find that monitoring and observing key CI/CD metrics across these services gets increasingly challenging.

Cloud Computing Reimagined: The Game-Changing Truth

At Civo Navigate San Francisco 2025, our experts shared their insights on how to reimagine the cloud through cutting-edge solutions and advancements. Featuring CTO Dinesh Majrekar and Chief Innovation Officer Josh Mesout, this session explores Civo's multi-cloud strategy and the pivotal role of private cloud in a hybrid future.

Amazon EKS Auto Mode

EKS Auto Mode is a huge step forward in managing your EKS clusters by automating complex tasks, enhancing cost efficiency, simplifying management, and ensuring resource optimization. These features make Kubernetes more accessible and manageable, particularly for organizations looking to leverage containerized environments without the overhead of extensive manual configuration and management.

Elasticsearch vs. Solr: What Developers Need to Know in 2025

When your project calls for a high-performance search solution, the Elasticsearch vs. Solr debate inevitably surfaces. Both are Lucene-powered search engines with passionate communities, but their architectural approaches and performance characteristics differ significantly. This guide dives into the technical nuances that matter to developers and DevOps professionals, helping you make an informed decision based on concrete metrics and real-world implementation considerations.

How to Make the Most of Redis Pipeline

If you’ve been using Redis but haven’t explored pipelining, you’re missing out on some significant performance benefits. Redis pipelining is like a hidden gem—those who know about it can’t imagine working without it. In this guide, we’ll break down why pipelining is important and how it can help improve the efficiency of your applications.

High vs Low Cardinality: Is Your Observability Stack Failing?

Imagine trying to find a friend in a packed stadium with 50,000 people versus spotting them in a quiet coffee shop. That’s the difference between high and low cardinality data. And if you’re working with distributed systems or microservices, this isn’t just a theoretical distinction—it’s a fundamental challenge that can make or break your observability setup.

Logging Best Practices to Reduce Noise and Improve Insights

Are your logs helping you, or are they just creating more work? If you’re sifting through endless data but still missing the important details, you’re not alone. It’s a common challenge—but one that can be solved. For anyone managing infrastructure, logs are essential. They show what’s happening, what’s broken, and sometimes even why. But without the right approach, they can easily turn into clutter instead of clarity.

How to Monitor Apache Zookeeper Using the OpenTelemetry Collector

Apache Zookeeper is a distributed coordination tool that helps keep large-scale systems in sync. It’s the backbone for managing leader elections, service discovery, and metadata storage in projects like Kafka, Hadoop, and Elasticsearch. Think of it as a highly available traffic controller for distributed apps, ensuring everything runs smoothly.

Drift Detection in Kubernetes

When the increasingly popular strategy of configuration as code (CaC) is used to develop infrastructure, it’s known as infrastructure as code (IaC). Today, IaC is quickly becoming entrenched in development processes, especially in conjunction with Terraform and Kubernetes. Yet, although IaC (and CaC) bring immense value, they can also lead to a major problem: configuration drift.

Managing Multi-Region Networks with Confidence

Managing cloud networks today is like navigating a city without street signs—unpredictable and chaotic. As enterprises expand their cloud presence, they face a surge in IP addresses across regions, leaving traditional methods like spreadsheets and fragmented tools unable to keep up. Modern cloud infrastructure demands smarter, more scalable network management to tackle challenges such as IP sprawl, inconsistent configurations, and lack of visibility.

Monitoring Netdata Restarts: A Journey to a Reliable and High-Performance Solution

For a tool like Netdata, monitoring crashes and abnormal events extends far beyond bug fixing—it’s essential for identifying edge cases, preventing regressions, and delivering the most dependable observability experience possible. With millions of daily downloads, each event provides a vital signal for maintaining the integrity of our systems.

Prometheus API: From Basics to Advanced Usage

Monitoring your infrastructure shouldn’t be a shot in the dark. The Prometheus API helps you pull the right metrics so you actually know what’s going on. Whether you’re just getting started or trying to make sense of your current setup, this guide breaks down how to use the API to get the answers you need—without the guesswork.

Implementing DevOps practices in front-end development

CI/CD is an approach to software development that combines continuous integration (CI) and continuous delivery (CD). CI/CD makes developing apps faster, safer, and more efficient. In front-end development, CI/CD becomes even more valuable as developers frequently test, build, and deploy applications. Automating this process reduces the need for developers to manually carry out these tasks, boosting efficiency and reliability.

Nginx Logging: A Complete Guide for Beginners

So, you're wrestling with Nginx logs, huh? Been there. In fact, I used to spend way too much time hunting down log files until I finally got smart about it. Let me save you the trouble. Nginx logs are like the black box flight recorder for your web server. When everything crashes and burns (and it will), those logs are often the only evidence left to figure out what happened. But first, you need to know where to find them.

[Webinar] Kubernetes Health Management with Komodor

Modern Kubernetes environments are growing in scale and complexity. While application performance monitoring and infra-observability tools were once sufficient to maintain reliability, they are ill-equipped to handle the distributed and ever-changing nature of dozens or hundreds of Kubernetes clusters.

InfoBlox NetMRI is Ending-Here's Why You Should Move to IP Fabric Now

If you are a network owner, you know the importance of stability, visibility, and automation. With NetMRI reaching its Last Order Date on April 30, 2025, now is the time to think ahead and choose a solution that doesn’t just replace what you have—but actually makes your job easier. That’s where IP Fabric comes in. If you’re still relying on NetMRI for network configuration and change management (NCCM), I strongly recommend making the switch now. Here’s why.

How AI broke serverless and what to do about it with Vercel's Mariano Fernández Cocirio

Mariano, Staff Product Manager at Vercel, explains why serverless architectures are hitting unexpected limits—they’re too fast. The industry has spent millions optimizing serverless for speed, but AI workloads are changing the game. In the AI realm, slower execution often leads to better results. The challenge? Paying for all that idle compute time while waiting for AI responses.

Google Authd broker: authenticate to Ubuntu Desktop/Server with your Google account

Today we are announcing the introduction of Authd support for Google IAM, allowing all Ubuntu users to use their Google account to authenticate to their desktops and servers. The Google broker snap for Authd is available for free on Ubuntu Desktop and Server 24.04 and it works with both personal and Workspace Google accounts.

AWS Direct Connect Gateway (DGW) Data Transfer Outbound Rules

Get more efficient and secure AWS DGW connectivity by understanding the provider's Data Transfer Outbound requirements. As hybrid and cloud-native architectures have become commonplace, efficient and secure connectivity between on-premises data centers and the cloud is more crucial than ever. For organizations using AWS, connecting through a Direct Connect link simplifies and centralizes network connections across multiple regions thanks to its Direct Connect Gateway (DGW) component.

Will DevOps as We Know It Survive the AI Revolution?

Is DevOps on the brink of extinction? Solomon Hykes, co-founder of Docker and CEO of Dagger.io, explores how AI agents are transforming software development—not just writing code but shipping it. In this talk at Civo Navigate San Francisco 2025, Solomon retraces the history of software’s industrial revolution and examines whether AI will replace DevOps engineers or empower them. With live demos and expert insights, he reveals what’s next for the software factory and the future of platform engineering.

Why Bad Workflows Are Silently Killing Your Velocity (and How to Fix It)

Velocity isn’t just about how fast developers write code—it’s about how efficiently that code moves from idea to production. Yet many teams struggle with an invisible bottleneck: inconsistent workflows. Without clear, standardized processes, teams hit avoidable roadblocks that slow them down in ways that aren’t always obvious. PRs linger in review, merge conflicts become routine, and onboarding a new developer feels like an unwritten scavenger hunt. The worst part?

MetricFire's CLI Tool: Easy Monitoring & Automation!

Looking for a powerful way to send and visualise metrics from the command line? Meet HG CLI, MetricFire’s official command-line tool! In this video, we’ll show you how to install, configure, and use HG CLI to manage your Hosted Graphite metrics and create dashboards, all without having to configure an agent yourself. Whether you're a DevOps engineer, SRE, or developer, this tool will streamline your monitoring workflows! Don't forget to like, subscribe, and hit the bell for more MetricFire insights!

3 Neat Tricks and New Patterns Used in Cycle - from Cycle's Customer Success Team

In the last few months I've had the privilege to implement and witness new and exciting Cycle platform workflows emerging. Some of these come from the evolution of our own internal work, while others I've learned from our users. Reflecting on this, I thought it would be beneficial to the community to share! In this article, I'll walk through a handful of the neatest patterns we've seen in action. So I hope you'll join me for a no fluff, real, and practical look at how to get more out of Cycle.

Simulating artificial intelligence service outages with Gremlin

The AI (artificial intelligence) landscape wouldn’t be where it is today without AI-as-a-service (AIaaS) providers like OpenAI, AWS, and Google Cloud. These companies have made running AI models as easy as clicking a button. As a result, more applications have been able to use AI services for data analysis, content generation, media production, and much more.

Featured Post

Personal resilience boosts operational resilience

Winter is a grinding time. The temperature, the darkness and the rain all take a toll on people. As a business, it's worth remembering that the human element of IT operations needs looking after just as much as the technology they maintain. Business leaders can't have one without the other.

New in SSIS Data Flow Components 3.1: Optimized Performance & Expanded API Support

We are thrilled to announce a major release of our SSIS Data Flow Components, powerful tools designed to simplify the ETL process within SQL Server Integration Services (SSIS) packages. This update brings significant enhancements, including improved performance, support for new server versions and APIs, and the addition of new objects and properties to existing components.

A Field Operations Guide for Telecom

Telecommunications (telecom) is the foundation of modern connectivity, enabling the transmission of voice, data, and multimedia across vast distances. Telecom companies build, operate, and maintain extensive infrastructure that supports everything from mobile networks and broadband internet to enterprise communication systems. This infrastructure includes fiber-optic cables, cell towers, satellite systems, and data centers that work together to deliver seamless connectivity.

Top Cloud Cost News From February 2025

There’s no doubt about it: The cloud cost landscape changes quickly. It seems like every month brings new updates from the cloud providers and several new resources you can use to master FinOps within your organization. Here are the top cloud cost news headlines you may have missed from February 2025. Read on for the details.

Advanced Container Resource Monitoring with docker stats

If you’ve ever needed to check how much CPU or memory a Docker container is using, docker stats is the command for the job. It provides real-time resource usage metrics, helping you monitor and troubleshoot containers efficiently. This guide covers everything you need to know about docker stats: how to use it, what each metric means, and how to integrate it into a larger monitoring setup.

#037 - Problem First, Kubernetes Second: Insights from Ahmed Bebars (New York Times | CNCF)

In this episode of Kubernetes for Humans, we speak with Ahmed Bebars, a Principal Engineer at the New York Times and a CNCF Ambassador, who offers a unique perspective on cloud native technologies. Ahmed recounts his professional journey, which began in accounting before he transitioned into the technology sector and led to his current deep involvement with the Kubernetes ecosystem. He shares his initial introduction to Kubernetes almost a decade ago, recognizing its capabilities in container orchestration.

Forget the Hype-Kelsey Hightower on What's Next in AI & Cloud Computing

Kelsey Hightower takes the stage at Civo Navigate San Francisco 2025 to explore the evolving world of AI, cloud computing, and Kubernetes. In this insightful discussion, he shares his thoughts on data sovereignty, AI adoption, security challenges, and the future of cloud technology. Recently joining Civo as a Board Director, Kelsey discusses his vision for reimagining cloud computing, the impact of AI agents, and what developers need to focus on in an increasingly AI-driven world.

Lightning Talk: How Policy-as-Code Experts Tackle Infrastructure Governance

As cloud infrastructure scales, governance, security, and compliance become more complex. Policy-as-Code provides a powerful solution by automating and enforcing policies consistently across Infrastructure-as-Code (IaC). Join Omry, CTO of env0, and Anders, Lead Developer Advocate at Styra, for a 30-minute live lightning talk as they explore the role of policies in IaC, real-world enforcement examples, and the latest updates in Open Policy Agent (OPA). Learn why Rego is the preferred policy language, the challenges of maintaining policy frameworks, and how env0 simplifies governance and control.

How Policy-as-Code Enhances Infrastructure Governance with Open Policy Agent (OPA)

As cloud infrastructure scales, governance, security, and compliance become more complex. Policy-as-Code provides a powerful solution by automating and enforcing policies consistently across Infrastructure-as-Code (IaC). Join Omry, CTO of env0, and Anders, Lead Developer Advocate at Styra, for a 30-minute lightning talk exploring the role of policies in IaC, real-world enforcement examples, and the latest updates in Open Policy Agent (OPA).

SharePoint vs. OneDrive vs. Teams

Microsoft 365 offers multiple file storage solutions—SharePoint, OneDrive, and Teams—each designed for different use cases. However, many organizations struggle to determine where to store files and how to manage document collaboration efficiently. Choosing the wrong storage location can lead to content sprawl, security risks, and version control issues.

The State Of FinOps 2025: Cloud+, AI Visibility, And Other Key Takeaways

The FinOps Foundation just released the sixth installment of its annual State of FinOps report. The 2025 installment contains a few key themes. In this blog, we give a little more detail on each, highlighting key statistics and takeaways from the State of FinOps 2025.

Is your #observability always one step behind?

Guess what: It is designed to be like that! And the only way for you to get ahead of your operational challenges is to think differently. With Netdata, you get high-fidelity, ultra-detailed insights with unmatched granularity and cardinality and instant root cause analysis. See your infrastructure like never before! Get X-Ray Vision for your infrastructure!

Everything You Need to Know About SIEM Logs

That moment when your production system goes down, and you're stuck piecing together logs from twenty different services? It’s frustrating and slow—especially when you need answers fast. SIEM logs help bring order to this chaos, giving you a structured way to track security events and system activity. But understanding how to use them effectively isn’t always straightforward, and most documentation can feel more complicated than the problem itself.

Getting Started with the Grafana API: Practical Use Cases

Building dashboards one by one in Grafana can quickly become tedious. Clicking through the UI for every change isn’t exactly efficient. There’s a better way. The Grafana API lets you automate repetitive tasks and extend Grafana’s capabilities beyond the UI. If you're new to monitoring or managing a complex observability setup, understanding the API can make your workflow more efficient and scalable.

Python Logging Exceptions: The Setup Guide You Actually Need

Debugging a Python app can be frustrating, especially when an unexpected crash leaves behind nothing but a vague error message. A well-configured exception log can make all the difference, turning guesswork into clear insights. Here’s how to set up logging that actually helps.

Get to Know JFrog ML

AI/ML development is getting a lot of attention as organizations rush to bring AI services into their business applications. While emerging MLOps practices are designed to make developing AI applications easier, the complexity and fragmentation of available MLOps tools often complicate the work of Data Scientists and ML Engineers, and lessen trust in what’s being delivered.

Introducing Audit Logs: Ensuring Visibility, Security, and Compliance in FireHydrant

When something goes wrong, the first question is always: what changed? Whether it’s an unexpected change to your on-call schedule, a broken automation, or a modified Runbook that just seems off, understanding the issue starts with knowing who made what change, when it happened, and what exactly changed. But in an organization with many users, keeping track of every action can feel impossible.

AIOps for Kubernetes (or KAIOps?)

With the growing complexity of cloud-native applications, DevOps teams often face challenges when setting up and maintaining Kubernetes observability. AIOps (artificial intelligence for IT operations) makes the process more manageable using AI and machine learning for monitoring, troubleshooting, and performance optimization. In this article, you’ll learn about the common challenges in Kubernetes observability and how AIOps can provide proactive and effective solutions.

Building a customer churn detection system with Hugging Face and CircleCI

Losing a customer to a competitor can be costly; customer retention is vital for business success and growth. Businesses must anticipate when and why a customer might leave, so they can implement measures to retain them. One solution might be to build a system that predicts churn. But can it be done? Using machine learning (ML) techniques to analyze customer service interactions can provide valuable insight into customer sentiment.

Automating API security tests in CI/CD for Java applications

API security testing is software testing performed on APIs. It is meant to identify vulnerabilities in API endpoint communication and access. In modern software development, API security is a crucial aspect that cannot be ignored. API security testing can now be automated in CI/CD, enabling early detection of vulnerabilities, maintaining security standards without slowing down development, and reducing human errors.

Install FreePBX and Asterisk on Ubuntu 24.04 LTS for security patches until 2036

Deploying FreePBX and Asterisk on a single Ubuntu virtual machine in a public cloud is an ideal solution for personal users and small to medium-sized businesses with voice over IP (VoIP) and fax over IP (FoIP) needs. This setup costs nothing, is scalable and secure, and has daily recovery points with a recovery time measured in minutes.

Understanding Cloud Infrastructure: A Beginner's Guide

Cloud infrastructure essentially consists of a set of virtual tools and resources that help deliver cloud-based services and products. Cloud infrastructure frees companies from building their own physical data centers. Instead, they rent computing capacity on a need-by-need basis. But let’s not get ahead of ourselves. In this post, we’ll define cloud infrastructure and examine cloud computing types, delivery models, and more.

A Conversation with Steve Wozniak: Apple, Innovation, and Beyond

Get an inside look at the life and career of Steve Wozniak, co-founder of Apple, as he joins Mark Boost and Dinesh Majrekar on stage at Civo Navigate San Francisco 2025. From his early days as a young engineer to the creation of the Apple II, Wozniak shares his unique perspective on the tech industry and his role in shaping its future. With a career spanning decades, Wozniak reflects on his experiences as a pioneer in the tech industry, and offers valuable insights on how to drive innovation and stay ahead of the curve.

Squadcast Joins Forces with SolarWinds: Powering the Future of Reliability and Incident Response

We are thrilled to announce that Squadcast is now a part of SolarWinds, marking a transformative milestone in our journey to redefine reliability and incident management. When we started Squadcast, our singular mission was clear: to help teams achieve greater reliability by transforming incident response into a proactive, automated, and intelligent process. Today, that mission takes a massive leap forward as we join forces with SolarWinds, a global leader in hybrid IT observability.

EC2 Monitoring: A Practical Guide for AWS Engineers

Monitoring your EC2 instances shouldn’t be complicated or exhausting. Yet, too often, engineers find themselves troubleshooting issues in the middle of the night, searching for the root cause of an unexpected failure. Whether you're managing a few instances or hundreds spread across multiple regions, effective EC2 monitoring helps you stay ahead of problems instead of constantly reacting to them. And if you've ever dealt with a critical alert at an inconvenient hour, you know how important that is.

Nginx Error Logs: Troubleshooting and Security Guide

Nginx error logs can be tough to decipher, even for experienced sysadmins and DevOps engineers. They hold valuable clues about what’s going wrong, but sorting through them can feel overwhelming. Understanding these logs doesn’t have to be a challenge. This guide breaks them down in a clear, practical way—so you can find the issues that matter and fix them with confidence.

How to Use journalctl --last to Check Recent System Logs

When your Linux server starts acting up at 3 AM, you don't need a philosophy lesson—you need answers. Fast. That's where journalctl last comes in, the command-line equivalent of having a time machine for your system's events. If you've been piecing together log information like some digital detective with a cork board and string, it's time to upgrade your toolkit. Let's cut through the noise and get you the intel you need, when you need it.

How to debug an Android application using Anbox Cloud?

In this video, the Anbox team demonstrates how to debug an Android application with Android Studio in Anbox Cloud. What is Anbox Cloud? Anbox Cloud lets you run virtualized Android environments securely, at any scale, on any device, letting you focus on your use case. Run Android in system containers, not emulators, on AWS, OCI, Azure, GCP, or your private cloud with ultra-low streaming latency. Trademark notice: Android is a trademark of Google LLC. Anbox Cloud uses assets available through the Android Open Source Project.

Why engineering teams are moving from PagerDuty to incident.io On-Call

Recently, we hosted a webinar on migrating from PagerDuty, where we explored why so many engineering teams are rethinking their on-call tools. This blog post is based on that conversation, diving into the frustrations teams face with PagerDuty and how incident.io On-Call offers a better way forward.

Signals Turns One! A Year of Growth and Innovation

A year ago, we launched Signals with a simple but powerful idea: on-call shouldn’t be a painful juggling act. Too often, teams had to bounce between separate alerting and incident response tools, slowing everything down when speed mattered most. And traditional on-call tools? They were built around services, not the people responding to them.

What's new with AWS for 2025

Amazon Web Services (AWS) holds onto the top spot with a 30% share of the global cloud infrastructure market. Despite fierce competition, AWS remains the leader by consistently driving innovation in AI and cloud computing. In this blog, we explore the latest advancements, from global infrastructure expansion to enhancements in cloud services’ availability and performance, as well as what AWS has planned for 2025.

Cloud Costs Out of Control? Civo Has the Answer!

Join Mark Boost as he kicks off Civo Navigate San Francisco 2025, setting the stage for two days of tech innovation and insights. Discover Civo’s mission to create a simpler, fairer, and better-value cloud solution, challenging the dominance of hyperscalers. Mark unveils his vision for a true multi-cloud future and introduces Civo’s latest innovations: Flexcore and RelaxAI. He then welcomes Kelsey Hightower to the stage for a major announcement—his new role as Civo’s Board Director—before diving into the crucial role of private cloud solutions in today’s evolving landscape.

Monitor OracleDB EX with OpenTelemetry and MetricFire

OracleDB remains a top choice as a relational database management system (RDBMS), despite its strict licensing requirements. It excels at handling complex SQL queries, massive datasets, and transactional workloads, making it ideal for large enterprise technology stacks. Its many benefits include robust indexing, partitioning, and in-memory processing to optimize query performance at scale.

Azure Tagging: A Comprehensive Guide for Technophiles

Businesses and enterprises with complex environments may find Azure resource management difficult. Resource tags in Azure help manage those environments effectively. They improve the visibility and governance of cloud resources by making them easier to organize, track, and optimize. This post examines Azure tags and ways to get the most out of them for resource management.

Complexity Can Be Chaos

Monitoring is integral to understanding what is happening in your infrastructure, applications, or other observability projects. However, a common predicament developers can land themselves in is their observability stack becoming unwieldy and unmanageable due to a lack of streamlining and/or over-complicated code. To simplify your workload, it is important to streamline your monitoring.

Putting Your Data to Work to Protect Your Software Supply Chain

In today’s complex software ecosystem, ensuring security and reliability is more challenging than ever. Dependency trees are growing deeper, third-party contributions are increasing, and the risks - from vulnerabilities and misconfigurations to malicious attacks - are at an all-time high. Organizations must find ways to secure their software supply chains without compromising agility.

7 Snowflake Alternatives To Cut Your Cloud Costs

The Snowflake data cloud provides storage, reporting, and analytics for organizations that rely on data to run their day-to-day operations. Since its launch from stealth mode in 2014, Snowflake has become a top data warehouse solution for its manageability, superior scalability, always-on data security, advanced analytics, and robust accessibility. But Snowflake isn’t perfect.

How Cloud Computing Powers the Modern Internet

The way we use the internet today is vastly different from what it was just a decade ago. From high-speed video streaming and social media to cloud gaming and e-commerce, everything happens in real time. But have you ever wondered what makes all of this possible? The answer lies in cloud computing, the backbone of modern digital services. Cloud computing provides the power, speed, and scalability needed to keep the internet running smoothly. It enables businesses to store massive amounts of data, process information instantly, and deliver online services without interruptions.

Building a chatbot with Dialogflow and CircleCI

Chatbots are becoming essential to software applications, enhancing user engagement through automated conversations. Deploying a chatbot to a cloud platform requires integrating multiple technologies, ensuring smooth communication between services, and automating updates efficiently. In this tutorial, you will learn how to deploy a Python-based conversational chatbot to an Azure Functions app.