November 2023

What is Cardinality? Cardinality Metrics for Monitoring and Observability

The transition to cloud-native architectures has led to an explosion in metrics data, both in volume and cardinality. Achieving effective observability in these environments requires monitoring systems capable of managing large-scale, high-cardinality data. In this blog post, we’ll explore the important role of cardinality in monitoring and observability.
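As a rough illustration of what cardinality means for metrics, the sketch below counts the distinct label combinations recorded for each metric name. It assumes the common working definition of cardinality as the number of unique time series a metric produces; the sample data points, label names, and the `series_cardinality` helper are hypothetical and are not taken from the post.

```python
from collections import defaultdict


def series_cardinality(points):
    """Count distinct label combinations (time series) per metric name."""
    series = defaultdict(set)
    for name, labels in points:
        # A series is identified by its metric name plus its exact label set.
        series[name].add(frozenset(labels.items()))
    return {name: len(combos) for name, combos in series.items()}


# Hypothetical sample data: (metric name, labels) pairs.
SAMPLE_POINTS = [
    ("http_requests", {"service": "cart", "pod": "cart-1"}),
    ("http_requests", {"service": "cart", "pod": "cart-2"}),
    ("http_requests", {"service": "cart", "pod": "cart-1"}),  # same series again
    ("cpu_usage", {"host": "node-a"}),
]

print(series_cardinality(SAMPLE_POINTS))
```

Every new pod name adds another `http_requests` series here, which is why short-lived, uniquely named components tend to drive cardinality up.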

Metrics to Monitor for AWS (ELB) Elastic Load Balancing

Amazon Elastic Load Balancing (ELB) allows websites and web services to serve more requests from users by adding more servers based on need. There are several challenges to operating load balancers, as discussed in a previous blog post: Microservices Load Balancing: Navigating the Waves of Modern Architecture. An unhealthy ELB can cause your website to go offline or slow to a crawl.

Splunk SOAR 6.2 Introduces New Automation Features, Workload Migration, and Firewall Integrations

The Splunk team is proud to announce the release of Splunk SOAR 6.2 (Security Orchestration Automation and Response). We’ve been hard at work developing the latest and greatest features for this update, several of which have come from requests and suggestions from our users over on Splunk Ideas.

What's IT Monitoring? IT Systems Monitoring Explained

Whether in the cloud or on-premises, visibility into the inner workings of our IT services and infrastructure is an essential ingredient of a well-functioning IT system. The drive for digital transformation as a core strategic objective for most modern enterprises has meant that ensuring IT systems are working well, secure and delivering value for money is a critical endeavor.

Splunk Edge Hub: Physical Data, Sensing and Monitoring on the Edge

Splunk Edge Hub is a multi-component solution that includes a hardware device coupled with the Splunk platform and solutions that our partners build on top of both. It is a powerful tool that can help collect, distribute and act on data from edge devices and sensors, making it easier to capture and act on data that can be difficult to access physically or digitally.

Active vs. Passive Monitoring: What's The Difference?

Today, it’s perfectly normal for businesses to continuously monitor software applications and IT infrastructure to ensure uninterrupted customer service. Active and passive monitoring are two popular methods enterprises use for infrastructure and application performance monitoring (APM). As the names indicate, these two approaches to monitoring are very different.

Announcing the Splunk Add-on for OpenTelemetry Collector

The Splunk Add-on for OpenTelemetry Collector is a variation of the Splunk Distribution of the OpenTelemetry Collector that simplifies metrics and traces data collection, configuration and management. Since it is an add-on, users can deploy it alongside Universal Forwarders using tools like Deployment Server to easily start collecting high-fidelity metrics and traces from thousands of their hosts. We’re happy to announce that the add-on is now generally available on Splunkbase.

Deployment Frequency (DF) Explained

Technical teams use various metrics and indicators to track performance and success. For DevOps teams, among the most important metrics is deployment frequency. Deployment frequency can help you evaluate the software delivery performance of teams that develop software and apps. In this article, I’ll look at using this metric to calculate deployment rate, why it matters, and best practices for improving your deployment rate and setting your DevOps team up for success.
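A minimal sketch of one way to calculate a deployment rate, assuming it is defined as the average number of production deployments per day over a chosen window. The `deployment_frequency` helper and the sample dates are hypothetical, and teams may prefer a weekly or monthly window instead.

```python
from datetime import date


def deployment_frequency(deploy_dates, start, end):
    """Average deployments per day over the inclusive window [start, end]."""
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end must not be before start")
    # Only deployments that fall inside the window are counted.
    in_window = [d for d in deploy_dates if start <= d <= end]
    return len(in_window) / days


# Hypothetical deployment log: one date per production deployment.
SAMPLE_DEPLOYS = [
    date(2023, 11, 1),
    date(2023, 11, 1),
    date(2023, 11, 3),
    date(2023, 11, 8),  # falls outside the window used below
]

rate = deployment_frequency(SAMPLE_DEPLOYS, date(2023, 11, 1), date(2023, 11, 7))
print(f"{rate:.2f} deployments per day")
```

Three of the four sample deployments fall inside the seven-day window, so the rate works out to 3/7 deployments per day.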

Infrastructure Management & Lifecycle Explained

IT infrastructure must meet enterprise needs for effective service delivery while also providing value for money. This is a critical undertaking. Massive data growth, increased complexity of hybrid cloud environments, and emphasis on digital-first strategies are just some of the challenges. Meeting them requires an advanced approach to how infrastructure is configured and controlled: infrastructure management.

IT Spending: Trends & Forecasts for 2023

Perhaps the most defining trends of the 2020s so far have been abrupt change and mixed signals. IT spending is no different. A mere three years ago, COVID-19 swept the globe, and thought leaders called for the start of a “new normal,” predicting that life on Earth would never be the same – and that a major component of that change would be a move to remote-first and digital-everything.

How To Investigate a Reported Problem

Getting to the root cause of a problem in cloud-native environments requires engineers to navigate through immense complexity within a distributed system. Oftentimes, you didn’t write the code and you lack the background and context to quickly understand what’s going on when a problem occurs. The stakes are even higher when a problem is reported, meaning it has already started to impact the business, and the executives and your customers are not pleased.

What is Multicloud? An Introduction

Simply defined, multicloud (or multi-cloud) describes a computing environment that relies on multiple SaaS or cloud services for different workloads within a single architecture. In a multicloud approach, organizations may use public cloud providers such as Amazon Web Services (AWS) for infrastructure, Microsoft Azure for platform, and Google Cloud Platform for development.

The Internet of Medical Things (IoMT): A Brief Introduction

The Internet of Medical Things (IoMT), a subset of Internet of Things (IoT) technologies, comprises inter-networked devices and applications used in medical and healthcare information technology applications. IoMT devices connect patients, doctors and medical devices — including hospital equipment, diagnostic gear, and wearable technology — by transmitting information over a secure network.

Distributed Tracing: Your Ultimate Guide

When all your IT systems, your apps and software, and your people are spread out, you need a way to see what’s happening in all these minute and separate interactions. That’s exactly what distributed tracing does. Distributed tracing is a way of tracking requests in applications and how those requests move from users and frontend devices through to backend services and databases.

Improvements to DSDL Container Build Process

We’re happy to announce that with the upcoming release of Splunk App for Data Science and Deep Learning (DSDL) 5.1.1 we’re significantly overhauling the build process for containers in DSDL. More and more customers are adopting DSDL for some of their most complex and advanced workloads. In this newest release, we’re making the process of deploying, building and maintaining containers for DSDL more modular, more secure, more robust, and more scalable as well as adding some new features!

The Importance of Microservices

What are microservices? Microservices are a software approach that creates applications as a loose coupling of specific services or functions, rather than as a single, “monolithic” program. A microservice architecture increases the speed and reliability with which large, complex applications are delivered. What makes a service a microservice? Microservices are defined not by how they’re coded, but by how they fit into a broader system or solution.

How to Use Tags to Speed Up Troubleshooting

Maybe as a kid, you pretended to have a magic wand. You would say something like, “Show me the answer to this long division question,” then wave your magic wand and wait for the answer. Sadly, mine never seemed to work – for math questions or to make magical snacks appear. Now, imagine if you had a magic wand for your application stack where you could ask it a question about your data and it would give you immediate insights.

Observability Shifts Right

Observability first emerged as a focal point of interest in the DevOps community around 2017. Aware that business was demanding highly adaptable digital environments, DevOps professionals realised that high adaptability required a new approach to IT architecture. Whereas historically, digital stacks were monolithic or, at best, coarsely grained, the new stacks would have to be highly modular, dynamic, ephemeral at the component level, and spread over multiple cloud-based services.

How to Quickly Find What's Broken in Your Complex, Cloud Environment

With the rapid adoption of cloud, distributed systems and microservices are standard, resulting in increasingly complex environments. Once-straightforward troubleshooting workflows have become chaotic, frustrating, and time-consuming. When something breaks, multiple teams are called to the table to prove they’re “not it,” each with their own singular view of the problem.

Value Stream Management: A Brief Explainer

Simply put, value stream management (VSM) is the practice of measuring and improving the flow of business value created by an organization’s software delivery efforts. By monitoring the software delivery life cycle end-to-end, organizations can better identify processes that add value and eliminate those that create waste to optimize the flow of work. Ultimately, this enables teams to move away from activities that don’t directly contribute to customer value and focus more on those that do.

Customer Data Analytics: An Introduction

Simply put, customer analytics (or customer data analytics) is the process of using information about customer preferences and behavior to improve sales, marketing and product development. You can think of customer analytics as the type of customer behavior where buyers are doing internet research before making a purchase. There is now a vast amount of information available for nearly every product category online.

Data Platforms Explained: Features, Benefits & Getting Started

A data platform is a comprehensive end-to-end solution for all your data. A true data platform can ingest, process, analyze and present data generated by all the systems and infrastructures within your organization. There are a lot of things to understand and consider in this topic. So, let’s take a deep look at data platforms, including the definition and related terms, the benefits and use cases, and how to start building your data strategy.

ELT: Extract Load Transform, Explained

Businesses today rely on analytics and insights derived from different data types to gain competitive advantages. These data often come from different sources and in different formats. Without a unified solution, aggregating those data and performing analytics tasks is challenging. ELT was invented to solve the complexities associated with processing data from multiple sources while retaining the raw data as it is.

What is AIOps? AIOps Explained

What is AIOps? Simply put, AIOps uses big data, analytics and machine learning to automate and improve IT operations (ITOps). AI is particularly important in ITOps functions such as anomaly detection and event correlation, as it has the ability to analyze large volumes of network and machine data to find patterns, identify the cause of existing problems and find ways to forecast and prevent future issues.

What Is OpenTelemetry? A Complete Introduction

What is OpenTelemetry? Simply put, OpenTelemetry is an open source observability framework. It offers vendor-agnostic or vendor-neutral APIs, software development kits (SDKs) and other tools for collecting telemetry data from cloud-native applications and their supporting infrastructure to understand their performance and health. Managing performance in today’s complex, distributed environment is extremely difficult.

Observability for Sustainability

For the past 20 years, the various stakeholder communities that together constitute the IT industry have attempted to address sustainability. The original efforts grew out of the realisation that even as far back as 2005, the hardware and software that underlay the digital world were responsible for approximately 5% of overall energy consumption and that both the percentage and absolute amounts of energy required were growing in the double digits.