Operations | Monitoring | ITSM | DevOps | Cloud

Blog

Database Decision-Making for Observability, from Simple to Complex

A goal of open-source observability is unifying several different signals to provide the observability everyone wants. It’s always interesting to speak to people on this journey about how they try to provide it through open-source projects and the challenges they face along the way. I was thrilled to host Pranay Prateek on the most recent episode of the OpenObservability Talks podcast.

Why Website Uptime Monitoring Is Crucial For Preventing Downtime

Website uptime monitoring is crucial for any business that depends on its website. But for companies whose whole service is online, it is essential. If your site isn't reliably serving users when they need it, your competitors are just a Google search away. So you can't just check that your site is running now and then; you need a tool that checks it as frequently as possible.

Online Learning: a Novel Approach to Applying Machine Learning in Splunk

Most classical, batch-oriented machine learning systems follow the paradigm of “fit and apply”. In an earlier blog post, I discussed a few patterns for better organizing data pipelines and machine learning workflows in Splunk. In this blog, we’ll review how you can organize your machine learning model in a new way: online learning.

Distributed Tracing Observability in Microservices

Have you ever tried to find a bug in a multi-layered architecture? Although this might sound like a simple enough task, it can quickly become a nightmare if the system doesn’t have proper monitoring. And the more distributed your system is, the more complex it becomes to analyze the root cause of a problem. That’s precisely why observability is key in distributed systems. Observability can be thought of as the advanced version of application monitoring.

When Can A Service Not Be a Service?

If you’re familiar with PagerDuty, you probably associate it with alerts about technical services behaving in ways they shouldn’t. Maybe you yourself have been notified at some point that a service wasn’t available, was responding slowly, or was returning incorrect information. That’s the common use of a service in the PagerDuty platform.

Running Cloudify GitHub Actions Locally

Cloudify offers a set of GitHub Actions that can be used to interact with your managers. You can combine and use those actions based on your needs, and you can find them in the GitHub Marketplace. This brings us to the main point: developers need a way to test or debug GitHub workflows locally, without modifying the workflow in the repository (adding extra commits just for debugging) and then going through the logs in the GitHub Actions tab.

How Proper Organisation Skills Can Help Your Project Flourish

If you are responsible for managing a project, then organisation skills are necessary. After all, if you aren't able to organise things in your own life, how are you going to organise a project? Unfortunately, a lot of people don't have natural organisational skills. However, just because these skills don't come naturally to a lot of people, that's not to say that you can't learn them. This post will tell you what the benefits of learning organisational skills are, and how they can help your project to flourish.

Linux server monitoring: Long story short

Servers are almost inseparable from any IT infrastructure. Linux is the most compatible open-source operating system for servers because of its flexibility, consistency, and security. Most Linux servers are set up with one of these Linux distributions: Red Hat Enterprise Linux (RHEL), Debian, Fedora, openSUSE, CentOS, SUSE Linux Enterprise Server (SLES), or Ubuntu. Basic troubleshooting of a Linux server’s primary metrics can easily be done using the built-in commands.

Track your carbon footprint with Hardware Sentry's offering in the Datadog Marketplace

As we enter a critical period in the effort to mitigate climate change, organizations are facing mounting regulatory pressure—along with a biological imperative—to reduce their carbon footprint. And for those that maintain significant on-prem infrastructure, energy costs associated with operating hardware components can significantly affect their bottom line.