Cloud Cost Optimization Best Practices, Strategies, and Tools to Reduce Bills

As a network engineer, you play a crucial role in managing the cloud infrastructure that supports your organization’s applications and services. Cloud platforms offer immense flexibility and scalability, but without careful cost management, expenses can quickly spiral out of control.

Build an AI-powered Golang code review agent with CircleCI and GitHub webhooks

Code reviews are a crucial step in maintaining code quality, but many developers find them tedious and inconsistent. What if you could get helpful feedback automatically, as soon as a pull request is opened? In this tutorial, you’ll learn how to set up and integrate an AI-powered code review agent into your Go project. The agent uses the OpenAI API to post contextual suggestions and praise directly on pull requests.

Automating machine learning security checks using CI/CD

Machine learning (ML) pipelines are increasingly being treated like software: built, tested, deployed, and monitored using automated tooling. But while infrastructure as code and microservices have matured with security best practices, ML systems often lag behind. The truth is, your ML pipeline is part of your software supply chain, and it is vulnerable.

What is Physical Asset Management Software?

Physical asset management software is a critical tool for modern businesses, designed to track, manage, and optimize their physical resources efficiently. From office equipment to industrial machinery, this software provides visibility into the location, condition, and usage of assets, helping organizations make informed decisions to enhance operational performance, reduce costs, and ensure compliance with regulations.

How to Set Up Co-Managed IT Access Like a Ninja

Handing the keys to multiple IT teams doesn’t have to invite RBAC chaos. In this quick-strike session we’ll show you how to carve out precise roles in NinjaOne, keep permission creep on a tight leash, and spin up a co-managed setup that’s accessible but secure — without adding a new manual headache to your week. Bring your access questions; leave with a ninja-sharp checklist.

How to reduce Cloud Costs (with Open Source!)

We strongly believe that simple observability should be an innovation everyone can afford to benefit from, which is why Coroot is open source and includes cost monitoring for Azure, GCP, AWS, or your own custom settings. eBPF automatically tracks how each deployment impacts your cloud costs, so you can easily roll back changes when necessary and avoid a Lovecraftian monthly bill.

How to Set Up a Syslog Server: A Complete Step-By-Step Guide

Syslog servers are essential for centralized log management, helping network engineers monitor, troubleshoot, and secure network devices efficiently. This guide walks you through setting up a syslog server from scratch, focusing on practical steps using rsyslog on a Linux system—a common and robust choice for syslog collection. Windows does not have a native syslog server, so you need third-party software.

How to be prepared for cloud provider outages

GCP’s recent outage on June 12th was a reminder of just how interconnected modern architectures are. The 2-hour, 28-minute outage affected dozens of companies and spanned 80+ Google services and products. But what was really illuminating was just how far the outage spread due to hidden dependency risks. Many companies that don’t run on GCP were startled to find their services suddenly affected because they relied on dependencies or vendors that did use GCP.

How to test your systems for scalability and redundancy with fault injection

Part of the Gremlin Office Hours series: a monthly deep dive with Gremlin experts. Do you know if your services can tolerate losing a node? What about an entire availability zone? Or a region? Large-scale outages aren’t unheard of. When you’re running critical services, it’s vital that those services can keep running even if an AZ or region fails. In addition to failing over, these services also need to scale quickly so traffic shifts don’t overwhelm your systems. How do you prove that a service is both scalable and redundant? The answer is fault injection.