IoT in Industrial & Utility Operations - From Smart Metering to Hazardous Environment Communications

Water utilities spend billions each year on manual meter reading. Trucks roll out to every street. Workers lift concrete covers. They write down numbers by hand. The data goes into a spreadsheet days later. By then, a leak may have wasted thousands of gallons. On the other side of industry, oil rigs and chemical plants need communication gear that does not spark. A single spark from a standard phone could ignite flammable gas. The equipment must pass strict safety standards. It must work in salt spray and extreme temperatures.

Why "Trust Your Supplier" Fails as a China Sourcing Strategy

The most expensive quality failure in China sourcing usually starts with a sentence that sounds completely reasonable: "They seem reliable." The website looks legitimate. The quote is clear, the sample works, and the salesperson answers on WhatsApp within minutes, saying all the right things about tolerances, certifications, and lead times. So the PO goes out, the deposit clears, production starts - and your operation has quietly handed control to a factory it doesn't really understand.

The API tests passed. The database didn't.

We shipped v2 of a small products API on a Thursday. Green CI. Green replay. The new search endpoint worked. I went home feeling competent. Friday morning I ran the same traffic against both builds with proxymock and compared the SQL. v2 had added 80 queries on the same HTTP script. A per-product audit COUNT was firing inside the list handler. A startup migration had run ALTER TABLE and CREATE TABLE audit_log. Total DB time was up 70 ms on a demo that should have been boring.
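The kind of regression described here, a per-row COUNT inside a list handler, can be reproduced in miniature. The following is a minimal sketch, not the post's actual setup: it uses Python's built-in sqlite3 in place of proxymock, and the table layout and the handler names `list_products_v1` and `list_products_v2` are hypothetical. Both handlers serve the same request, and the trace callback counts the SQL statements each one issues.

```python
import sqlite3


def make_db(n_products=5):
    """Build an in-memory database with hypothetical products and audit_log tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE audit_log (id INTEGER PRIMARY KEY, product_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)",
        [(f"product-{i}",) for i in range(n_products)],
    )
    conn.commit()
    return conn


def list_products_v1(conn):
    """Hypothetical v1 handler: one query for the whole list."""
    return conn.execute("SELECT id, name FROM products").fetchall()


def list_products_v2(conn):
    """Hypothetical v2 handler: same list, plus one audit COUNT per product."""
    rows = conn.execute("SELECT id, name FROM products").fetchall()
    result = []
    for product_id, name in rows:
        audit_count = conn.execute(
            "SELECT COUNT(*) FROM audit_log WHERE product_id = ?",
            (product_id,),
        ).fetchone()[0]
        result.append((product_id, name, audit_count))
    return result


def count_queries(conn, handler):
    """Run a handler and return how many SQL statements it executed."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        handler(conn)
    finally:
        conn.set_trace_callback(None)
    return len(statements)


if __name__ == "__main__":
    conn = make_db(n_products=5)
    v1 = count_queries(conn, list_products_v1)
    v2 = count_queries(conn, list_products_v2)
    print(f"v1 queries: {v1}, v2 queries: {v2}, added: {v2 - v1}")
```

Both handlers return the same products, so a test that only checks the HTTP response passes for either one. The difference only shows up when the statement counts are compared side by side.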

The entire cloud stack is just pizza and we can't unsee it

The acronym soup is the universal IT rite of passage. So we explained the whole stack with the one thing everyone already understands: pizza. From making the dough yourself to just opening the box, there’s a version for every level of “how much do you actually want to manage.” Alexis can give the full breakdown in the time it takes to decide what you'll order.

Six AI agent SDKs for enterprise Kubernetes, compared

There’s a question we hear constantly from platform and engineering leaders right now: “Which agent SDK should we standardize on for our Kubernetes clusters?” The honest answer is that the question is slightly wrong, and the rest of this post explains why. But it’s a fair question, so let’s compare the contenders first.

DevOps with Kubernetes: How to Reduce Cluster Toil and Complexity

Has Kubernetes made your DevOps team faster, or just busier? Most teams adopt it for speed and portability, and they get both. What arrives with it is a quieter cost: the operational weight of running the cluster day to day. That weight shows up in the manual work the platform was supposed to eliminate. A resource limit set incorrectly can waste infrastructure for months.

Unified Observability: Moving IT Teams from Reactive to Predictive

What does it take to stop an outage before it starts? In many cases, the warning signs are already there, scattered across different monitoring tools, which makes it difficult to see the full picture before issues escalate. When an incident occurs, engineers often spend valuable time piecing together metrics, logs, traces, and alerts to determine the root cause. Every minute spent investigating extends the outage and increases its business impact.

Extending the Application Edge with F5 BIG-IP VE and Megaport Virtual Edge

Learn how F5 BIG-IP VE simplifies multicloud application delivery, security, and traffic management with MVE. As enterprise applications continue spreading across multiple clouds, the application edge is changing. A few years ago, application delivery was usually tied to a physical appliance sitting in a data center; today, applications are everywhere.

Observability for LLM Apps and Agents: OpenLIT SDK + VictoriaMetrics observability stack

Many “LLM observability with OpenTelemetry” tutorials stop at a single chat.completions span. That works for a demo, but it leaves gaps once an agent fans out into 30 tool calls, two vector-DB queries, three handoffs, and a 90-second tail latency you need to attribute. This post wires the OpenLIT SDK (50+ instrumentations, OTel GenAI semantic conventions, one line of code) into the full VictoriaMetrics observability stack and shows query examples that turn agent telemetry into decisions.