How to Configure Docker's Shared Memory Size (/dev/shm)

Your Node.js app runs fine on your machine. But inside Docker? You start getting weird crashes such as ENOSPC: no space left on device. Chrome headless tests fail out of nowhere. PostgreSQL throws shared memory errors under load. The problem? It’s probably /dev/shm, the shared memory mount Docker sets up in every container. By default, a container gets just 64MB of space there.
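As a quick way to confirm the diagnosis, the sketch below reports how much shared memory a container actually has. It is a minimal illustration, not taken from the article: the function names are hypothetical, and it assumes it runs inside the container with /dev/shm mounted.

```python
import shutil

# Docker's default /dev/shm size as stated in the excerpt above (64MB).
DEFAULT_DOCKER_SHM_BYTES = 64 * 1024 * 1024


def shm_is_at_default(total_bytes: int, default_bytes: int = DEFAULT_DOCKER_SHM_BYTES) -> bool:
    """Return True when the shared-memory mount is no larger than the default size."""
    return total_bytes <= default_bytes


def describe_shm(path: str = "/dev/shm") -> str:
    """Summarize total and free space on the shared-memory mount in MiB."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    total_mib = usage.total / (1024 * 1024)
    free_mib = usage.free / (1024 * 1024)
    note = " (at or below the 64 MiB default)" if shm_is_at_default(usage.total) else ""
    return f"{path}: {total_mib:.0f} MiB total, {free_mib:.0f} MiB free{note}"


if __name__ == "__main__":
    # Run inside the container, e.g. via `docker exec`.
    print(describe_shm())
```

If the report shows the default size, one common remedy is to start the container with a larger allocation, for example `docker run --shm-size=256m ...`; the right value depends on the workload.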

Amazon SQS Metrics: Monitor, Debug, and Optimize Your Message Queues

Message queues quietly take care of a lot: buffering workloads, smoothing traffic spikes, and keeping services connected. But they don’t always get much attention until something feels off. Amazon SQS offers a solid set of metrics to help you understand how your queues are doing, whether you’re scaling well or nearing limits. This post breaks down the key SQS metrics: where to find them, what they mean, and how to respond when things start to shift.

Introducing Cause Analysis: Instant Triage for Traffic Changes with Kentik AI

Introducing Cause Analysis from Kentik, designed to simplify network traffic analysis and rapidly identify the root cause of issues. Learn how this exciting new feature streamlines troubleshooting, makes complex insights accessible, and boosts team efficiency for all users.

Understanding APM and Distributed Tracing in the Observability Stack

To keep modern applications running smoothly, you need more than just basic monitoring. APM (Application Performance Monitoring) gives you a broad overview, tracking metrics like latency, errors, and system health. Distributed Tracing, on the other hand, shows the full journey of each request across services, helping you pinpoint the root cause of slowdowns or failures.

How to Reduce IT Costs on Hardware Refresh Cycles

IT budgets are under pressure, and hardware refresh costs continue to climb. For End User Computing (EUC) and IT professionals, the traditional time-based approach to managing device lifecycles is no longer viable. Simply replacing laptops and desktops every three to five years doesn’t reflect actual device performance, usage patterns, or business needs. The solution? A smarter, data-driven hardware refresh strategy that balances performance, cost-efficiency, and employee experience.

What Is Session Replay and How It Improves User Experience in IT Environments

Anyone who works in technology quickly learns this truth: users will always interact with systems in the most unexpected and baffling ways… and when something goes wrong, they swear they “didn’t touch anything.” There’s a vast ocean between how something is designed and how it’s actually used—an ocean filled with bugs waiting to be caught. But there’s a way to bridge that gap: session replay.

How to Create a Free Status Page in Under 5 Minutes

Your website goes down at 2 AM. Your customers wake up to broken services, flooded support inboxes, and zero communication from your team. By the time you're awake and fixing things, trust is already damaged. A status page prevents this nightmare scenario. But here's the thing — most teams keep putting it off because they think it's complicated, expensive, or time-consuming. It's not. You can create a professional status page in under 5 minutes, completely free. I'll show you exactly how.

Top 10 Network Monitoring Tools to Boost Your IT Performance

In today's digital landscape, a strong and secure network forms the foundation of any organization. When networks go down, suffer performance issues, or face security risks, companies can incur significant financial losses and damage to their reputation. IT teams need network monitoring tools to stay on top of performance, spot problems, and keep things running. As AI, cloud-based solutions, and automation improve, 2025 brings a range of powerful tools to make your IT setup work better.

Zero-effort alert migration from Prometheus to Coralogix

Having spent two decades in technical leadership, I’ve seen firsthand what separates great development teams from merely good ones. It’s not about the number of features shipped or the elegance of the codebase; it’s about the ability to consistently deliver value to the customer through a great user experience.

Coralogix adds OTel-based service dependency tracking for distributed systems

Coralogix has released its APM Dependencies feature, which automatically surfaces and maps the relationships within and between your software and external services. It allows fine-grained tracking of which endpoints within your APIs depend on other endpoints, external services, or database tables.