Latest Posts

How we slashed detection and resolution time in half (Salt Security)

Salt Security had deployed OpenTelemetry but found it insufficient, so the company's engineers evaluated Helios, which visualizes distributed traces for fast troubleshooting. My role as the Director of Platform Engineering at Salt Security lets me pursue my passion for cloud-native tech and for solving difficult system-design challenges. One of the challenges we solved recently had to do with visibility into our services, or rather the lack thereof.

Debugging and troubleshooting microservices in production - All you need to know

What do you do when things break in production? Debugging microservices isn’t a walk in the park. Microservices are designed to be loosely coupled, which makes them more scalable and resilient, but also more difficult to debug. When a problem occurs in a microservices app, it can be difficult to track down its source. When the problem is in production, the clock is ticking and you have to resolve the issue fast.

Lambda monitoring: Combining the three pillars of observability to reduce MTTR

Observability and monitoring can be challenging for distributed applications, and serverless architectures are a typical example. As with any other service that we run, we need to understand how our Lambda functions are executed, how to identify issues, and how to optimize performance.

How we combined OpenTelemetry traces with Prometheus metrics to build a powerful alerting mechanism

One of the qualities of engineering team excellence is thinking outside the box to find creative solutions to hard problems. It’s our responsibility, as dev leaders, to pass on tips and tricks to the next generation of developers, helping them look beyond the surface to solve complex business problems and leverage the power of the open source community where possible.

OpenTelemetry .NET Distributed Tracing - A Developer's Guide

Modern applications are becoming increasingly distributed because distribution brings a wide range of benefits, including enhanced scalability, high availability, fault tolerance, and better geographical distribution. But it also makes the overall system more complex, so it becomes challenging to understand how its components function internally. Distributed tracing helps address this by tracking how requests flow through the various system components and providing detailed insights along the way.

Serverless observability, monitoring, and debugging - Overview and best practices

Serverless, as you may already know, is a cloud computing model where the cloud provider dynamically manages and allocates resources to execute code, without requiring the developer to provision servers or manage infrastructure. This article gives an overview of serverless observability, monitoring, and debugging, based on distributed tracing and OpenTelemetry (OTel).

API monitoring vs. observability in microservices - Troubleshooting guide

Monitoring APIs through enhanced observability has gained traction with the popularity of microservices. Since microservice applications are built as independent and scalable modules, the number of microservices can grow dramatically as the application grows, increasing the complexity drastically. Since APIs work as the connective tissue between microservices, the number of APIs also grows in parallel.

Distributed tracing Node.js - OpenTelemetry-based monitoring

As the trend toward microservices-based architectures continues to gain momentum, it’s becoming increasingly clear that distributed tracing will be a crucial tool for monitoring and debugging these complex systems in the future. When designing a microservices-based architecture, breaking extensive services into smaller, more manageable components is standard practice. Communication between these components becomes crucial, but finding the root cause can be challenging when issues arise.