August 2021

5 reasons why your startup needs website monitoring

Despite the ongoing pandemic, 2020 was still a record year for venture capital investments into American startups, which amounted to $156.2 billion, according to a recent PitchBook report. Being an astute entrepreneur, you’ve likely thought of everything to enable your startup business to hit the ground running — branding, product logos, hiring staff, equipment purchases, and structuring your business roadmap. You are ready to take the marketplace by storm — or so you think.

Looking beyond observability: A preview of Lightstep Incident Response for Site Reliability Engineering (SRE) and DevOps teams

Since founding Lightstep in 2015, we've been focused on one thing: providing clarity and confidence to the teams that build and operate the software that powers our daily lives. Joining ServiceNow helped us accelerate this vision, and now, we're excited to share more about what we've been working on together — Lightstep® Incident Response.

Monitor feature releases with Statsig's offering in the Datadog Marketplace

Statsig is a modern experimentation platform that provides crucial insight into how new features are received by your users, so you can make informed product decisions and deploy with confidence. Statsig automatically runs A/B tests on features as they’re rolled out, and measures their impact on key business metrics, such as user growth and engagement.

Everything you need to know about OpenTelemetry Collector

OpenTelemetry Collector is a crucial component of the OpenTelemetry architecture. It reduces the overhead your application incurs when collecting and managing telemetry data. Let's do a deep dive on the OpenTelemetry Collector to understand how it works. The first step in setting up distributed systems monitoring and tracing is instrumentation, which enables generating and managing telemetry data. Once the telemetry data is generated, you need a way to collect and analyze it.

How-To Series: Tips And Tricks For Catchpoint's Integrations And APIs

Collaboration tools like Slack and Teams are here to stay. They’re very much inseparable from the distributed workforce that we all continue to find ourselves in. A robust set of integrations is therefore an essential part of today's monitoring and observability platforms. Feeding Catchpoint data to your support team via a Slack channel could be the difference between catching a disruption early and having to respond to a full-blown outage.

An Easy-to-Follow Guide on Migrating Your Website With Minimal Downtime

Businesses and successful social enterprises might need to move their website to a new host or server for various reasons. Perhaps you've expanded beyond what your current provider can offer or need a faster server or host. No matter the motivation, making website migrations more efficient is essential to keeping your business operations running. Although technological advancements have reduced the delays that result from transferring sites, there can still be a lengthy delay.

Log Management for the MEAN Stack Framework

MEAN is evolving as a popular web stack for developing cloud native applications because of its scalability, ease of extension, and high reliability. Each component in MEAN is built on JavaScript, contributing to a cohesive development platform. In this post, we take you through the log management options that are available for each component of the MEAN stack framework and their respective limitations – limitations that are addressable with a refined log management solution like observIQ.

Indexing Strategies for SQL Server Performance

One of the easiest ways to increase query performance in SQL Server is to make sure it can access the requested data as efficiently as possible. In SQL Server, using one or more indexes can be exactly the fix you need. In fact, indexes are so important that SQL Server can warn you when it figures out there’s an index missing that would benefit a query.

Following the Money: 3 Transaction Pathways to Monitor

If all you have is the beginning and the end, you’re left with a short, boring story: “Once upon a time, it was UP…then, it was DOWN.” Knowing the twists and turns of your transaction pathways is not only illuminating, but profitable. Information channels dry up when all you have are pieces.

What Is Cloud-based Software Testing and How Can It Enhance Testing Services?

Due to the sudden upsurge in the usage of software applications around the globe, enterprises are finding it extremely difficult to meet the time to market demands. Enterprise QA teams that are able to detect errors at the earliest will have more time to work on various other development phases as well as enhance the application quality. With the advent of cloud computing technology, enterprises have leveraged several innovative opportunities in software testing and software deployment.

What's new in Sysdig - August 2021

Welcome to another monthly update on what’s new from Sysdig! This month’s big announcement is our new support for Prometheus as a managed service. There are several individual features behind this which we cover in more detail below, but here is a summary: Also, Kubernetes 1.22 was released and we shared our review of what to look out for. Go check out our Kubernetes 1.22 – What’s new? post if you haven’t already.

How to Do Simple UX Monitoring With ipMonitor

Learn how you can leverage ipMonitor user experience monitors to be sure you know about any user experience issues before end users do. Do you know what’s going on right now with all the network devices, servers, and applications that are the magic behind your business? To keep on top of what’s happening with all of those moving parts, you need an easy-to-use, reliable monitoring solution that tells you what’s up, what’s down, and what’s not performing as expected.

Comparing 7 New Relic Competitors in 2021

Application performance monitoring tools, or APMs, help give developers feedback so they can understand whether their programs are working the way they had planned for their users and clients. They also provide information about the software’s quality. Most DevOps teams use these tools throughout the software development life cycle. This way, they make sure that they cover their bases before releasing software into the market.

Understanding Apigee API Monitoring

Want to make sure the APIs you’ve launched on Apigee are performing as expected? In this video, we show how API Monitoring provides real-time insights into API traffic and performance, so you can solve problems as they happen. Watch to learn how you can stay informed and understand unusual events or patterns.

Effortlessly connect SCOM to Teams and Slack with Connection Center Webinar

See how easy it is to connect SCOM to Teams and Slack with Cookdown Connection Center in this webinar recording originally aired 26-Aug-2021. The Teams and Slack integrations demoed here are part of Cookdown Connection Center, your one-stop-shop for all your integration needs to and from SCOM without writing a line of code. Connection Center lets you raise Alerts in SCOM from anywhere and push alerts to ITSM platforms, notifications tools, and more.

Firewall... It's getting hot in here!

There’s something that humans and machines have in common, and no, it’s not the disappointment caused by the final season of Game of Thrones, or, well, at least not only that. What we have in common is that we need protection. You know, animals need it too, and plants, but if you’ve gotten this far, it seems that you’re interested in computers, networks and all these pretty modern “geek” things, so today we’ll talk about that kind of protection.

Testing Your HAProxy Configuration

Learn how to test your HAProxy Configuration. Properly testing your HAProxy configuration file is a simple, yet crucial part of administering your load balancer. Remembering to run one simple command after making a change to your configuration file can save you from unintentionally stopping your load balancer and bringing down your services.
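
The teaser does not name the command, but HAProxy's check mode (`haproxy -c -f <file>`) is the usual way to validate a configuration without starting the proxy. A minimal sketch of wrapping that check in a script, assuming `haproxy` is on the PATH; the function names and the example path are hypothetical:

```python
import shutil
import subprocess
from typing import List


def build_check_command(config_path: str) -> List[str]:
    """Return the argv for HAProxy's check mode (-c) on the given file (-f)."""
    return ["haproxy", "-c", "-f", config_path]


def config_is_valid(config_path: str) -> bool:
    """Run the check; True only when haproxy exits with status 0."""
    if shutil.which("haproxy") is None:
        raise FileNotFoundError("haproxy binary not found on PATH")
    result = subprocess.run(
        build_check_command(config_path), capture_output=True, text=True
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever your configuration lives.
    print(build_check_command("/etc/haproxy/haproxy.cfg"))
```

Running a check like this before every reload is one way to keep a typo in the configuration file from taking the load balancer down.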

The new world of hybrid work in Australia and New Zealand

Nathalie Tousignant, director of ITSM product management, co-wrote this blog. For decades, work was a place we went, where colleagues gathered in person for meetings, problem-solving, and chats over coffee. Then, the COVID-19 pandemic hit, and the way we thought about work changed overnight. Many tried to fit the processes of traditional in-person work into a virtual setting. The result?

Server Management: What It Means and How to Do It Right

The age of storing data only on paper is long gone. In this age, almost everything has become digital. Businesses, services, data sharing—everything has gone online. And servers play a major role in making this possible. Though it might look simple to users, a lot happens on the back end. Servers have a wide range of applications—application hosting, email management, proxies, file transfers, etc. Different servers specialize in different services.

How Istio, Tempo, and Loki speed up debugging for microservices

“How am I supposed to debug this?" Just imagine: Late Friday, you are about to shut down your laptop and … an issue comes up. Warnings, alerts, red colors. Everything that we, developers, hate the most. The architect decided to develop that system based on microservices. Hundreds of them! You, as a developer, think why? Why does the architect hate me so much? And then, the main question of the moment: How am I supposed to debug this?

Code Review

Code review is a process to ensure that bugs and errors are caught and fixed before they reach production. This very often requires the participation of developers who are not directly involved in implementing the particular part of code that is being reviewed. Code review is part of a bigger quality assurance process to ensure that the final product performs exactly as expected.

Hitchhiker's guide to Prometheus (Part 2)

In Part II (Part I is here) of our “Hitchhiker’s Guide to Prometheus,” we are going to continue with the overview of this powerful monitoring solution for cloud-native applications. In particular, we’ll walk you through configuring Prometheus for scraping exporter metrics and custom application metrics, using the Prometheus remote write API, and discuss some best practices for operating Prometheus in production. Let’s get started!

Situation Room: On-Call Team Faces Worst Case of Sunday Scaries

Picture this: it’s Sunday night. You’re relaxing in bed, in that sweet spot where you’re geared up for Monday, but the fun of the weekend hasn’t yet faded. As you idly scroll through content on your phone, you see a message preview pop up. It’s to your work email. That’s bad. It’s from the hosting company you contract. That’s really bad. They’re saying they accidentally deleted the production database.

Feature Spotlight: API Dependency Graph

As APIs become more complex, it can be easy to accidentally break connections. To prevent this, Speedscale can automatically detect and make you aware of inbound and outbound transactions running through APIs in our API Dependency Graph. The Traffic Viewer dashboard makes this information visible to independent teams working on different services. In most organizations this information is usually only known by senior engineers, team leads, and architects.

How to Structure an IT Help Desk

Managed service providers (MSPs) need an IT help desk to address and answer the technical questions of clients. In the modern MSP environment, the IT help desk is the primary source of contact between customers and knowledgeable, responsive support personnel. Successful help desks are customer oriented and encourage clients to report IT incidents when they occur.

Bolster OT Security with Graylog

Anyone tracking the evolution of the IT industry is probably familiar with the concept of Industry 4.0. Essentially, it describes the process by which traditional industrial tasks become both digitized and continually managed in an IT-like fashion via modern technologies like cloud computing, digital twins, Internet of Things (IoT) sensorization, and artificial intelligence/machine learning.

Sumo Logic Red Hat Marketplace Operator

Red Hat OpenShift is an open source container application platform that incorporates a collection of software that enables developers to run an entire Kubernetes environment. It includes streamlined workflows to help teams get to production faster and is tested with dozens of technologies while providing a robust, tightly integrated platform supported over a 9-year lifecycle.

The Top 5 Node.js Performance Measurement Metrics

Using Node.js as a JavaScript runtime has its advantages. However, it requires significant maintenance to keep it working as expected. Here are the top metrics you should monitor for Node.js performance measurement analysis. Application programming interfaces or APIs that use the Node.js runtime environment are scalable. Node.js is asynchronous and event-driven, which means the application can handle multiple connections at the same time.

Monthly Moo Update | September 2021

This has been quite the summer to remember as we continue to witness our customers achieve remarkable efficiencies through automation such as deep integrations with change pipelines to suppress alerts during maintenance windows and correlating alerts to create incidents with dynamic and evolving descriptions that dramatically improve Incident management processes.

Rushing Back to the Office or Do You Prefer Remote Work?

Just as vaccines came into the picture and brought some hope in this pandemic, new coronavirus cases started to emerge. In San Francisco, for instance, where mask mandates were lifted, and life was starting to feel more normal again, it feels as though we are now regressing. People are struggling to cope. It feels as though we’ve just gone from one highly transmissible variant to another that is even worse than before.

Instana AutoTrace: Fully Embracing OpenTelemetry

In my last blog post, I talked about the history and rapid adoption of the OpenTelemetry project. Today we discuss the progress made in terms of end-to-end compatibility between distributed tracing with OpenTelemetry and Instana AutoTrace, and how we are embracing OpenTelemetry as a first-class citizen in Instana.

Civo update - August 2021

Welcome to the Civo update for August 2021. It was a busy month, with the big news being the launch of Civo Academy: A full Kubernetes learning program consisting of over 50 videos created in-house by the team here at Civo. We also kicked off the Civo DevOps Bootcamp! The first few live stream installments have been a huge success, helping developers at any stage of their career learn more about DevOps fundamentals.

Calico integration with WireGuard using kOps

It has been a while since I have been excited to write about encrypted tunnels. It might be the sheer pain of troubleshooting old technologies, or countless hours of falling down the rabbit hole of a project’s source code, that always motivated me to pursue a better alternative (without much luck). However, I believe luck is finally on my side.

Secure your clients and prevent churn with a canary

Many people are familiar with the stories of coal miners using canaries to detect carbon monoxide and other toxic gases as a warning system for when they should evacuate. Even though cybersecurity is far removed from coal mining, it has an equivalent “canary in the coal mine” that takes the form of indicators of compromise, or IoC for short. So why should an MSP be concerned with looking for IoCs?

Why you should be using a VPN when working from home

With so many of us working from home full time for the last 16 months, VPNs have become essential tools for companies to keep their staff working in a safe environment. What we mean by “safe” is mainly about your online presence whilst performing daily tasks for your job.

4 Benefits of Integrating Service Desk with Endpoint Management System

A service desk is the focal point of an IT organization to render services, and the quality of its services determines the perception of being a valuable part of the organization. The ongoing transition of businesses to adopt cloud infrastructure has forced IT organizations to modernize their service desks, which include vendors adopting cloud capability and smart automation powered by AI.

What's in a Name? "Network Specialist" vs. "Network Engineer"

The meanings behind job titles can be elusive in that they might only make sense to the people actually in the roles. Take networking jobs. It’s pretty common for people to think some titles can be used interchangeably, and that depending on where you work, a job may have a different name. In some cases, even IT professionals believe that network specialist and network engineer jobs really carry the same responsibilities.

Serverless observability and real-time debugging with Dashbird

Systems run into problems all the time. To keep things running smoothly, we need to have an error monitoring and logging system to help us discover and resolve whatever issues may arise as soon as possible. The bigger the system, the more challenging it becomes to monitor it and pinpoint the issue. And in serverless systems with hundreds of services running concurrently, monitoring and troubleshooting are even more challenging tasks.

Integration Infrastructure Management is the New Black

Information Technology (IT) has moved through many clearly defined eras. I don’t feel the need to list them all, but the latest one clearly is the move from discrete computers to virtualized software-configured environments, that we all colloquially call “the cloud”. And with each technological generation, we have improved the methods we use to integrate discrete modules of technology, and increased the level of automation.

What Does Everbridge Crisis Management Do for Your Organization?

Everbridge Crisis Management provides organizations a single solution for business continuity, disaster recovery and emergency communication. In one application, crisis teams can coordinate all response activities, teams and resources to accelerate recovery times and maintain command and control when crises evolve into unanticipated scenarios.

Facilitate DevOps Monitoring with Synthetics and RUM

DevOps is a common name in the technology household. Teams, small or big, are embracing this concept to deliver applications faster, improve software quality, and add efficiency in the development process from the very beginning. Shortening the feedback loop leads to a cost-effective way for businesses to find and fix defects earlier in the cycle process. Plus, it lowers the software failure rate in production and minimizes time wastage for the development team.

The Impact of Internet of Things (IoT) in the Retail Industry

The Internet of Things has always played a significant role in the retail industry, stocking and warehousing. Its value is projected to rise from USD 14.5 billion in 2020 to USD 35.5 billion by 2025, at a CAGR of 19.6%. The term Internet of Things was coined by Kevin Ashton when he was faced with a challenge in logistics and supply chain management around the time he co-founded the MIT Auto-ID Center.
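
The growth figures quoted above can be checked with the standard compound annual growth rate formula. A minimal sketch, assuming the five-year span from 2020 to 2025; the function name is illustrative only:

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# USD 14.5 billion in 2020 growing to USD 35.5 billion by 2025 (5 years).
rate = cagr(14.5, 35.5, 5)
print(f"{rate:.1%}")  # prints 19.6%
```

The result agrees with the 19.6% figure quoted in the teaser.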

Robotic Data Automation (RDA): Reducing Costs and Improving Efficiencies of Your Log Management Investment

Human involvement has been unavoidable in log management despite advancements in ITOps. Log management at a high level collects and indexes all your application and system log files so that you can search through them quickly. It also lets you define rules based on log patterns so that you can get alerts when an anomaly occurs. Log management analytics solutions leveraging RDA have been able to detect anomalies and aid predictive models over a machine learning layer.

Model-driven observability: Taming alert storms

In the first post of this series, we covered the general idea and benefits of model-driven observability with Juju. In the second post, we dived into the Juju topology and its benefits with respect to entity stability and metrics continuity. In this post, we discuss how the Juju topology enables grouping and management of alerts, helps prevent alert storms, and how that relates to SRE practices.

Cloud PaaS through the lens of open source - opinion

Open source software, as the name suggests, is developed in the open. The software can be freely inspected by anyone, and can be freely patched as required to suit the security requirements of the organisation running it. Any publicly identified security issues are centrally triaged and tracked.

How To Leave Work At 5 PM: Visibility, Event Management & Automation

As organizations manage increasingly interdependent network infrastructure in an increasingly chaotic world, how can you, as a Network Operations professional, maintain control of your network without losing control of your time? The answers are: network visibility, flexible event management, and powerful automation. All of this is possible within Opmantek’s network management platform.

The Fast & The Foolproof: Automation & Observability For DevOps

When software teams are charged with delivering higher quality software, faster - how do you effectively enable collaboration and observability while eliminating risk and manual processes? In this webinar, Ali Sardar from JFrog and Rob Jahn from Dynatrace will address how to overcome these challenges and unlock speed, observability, and automation across your DevOps lifecycle. In addition to best practices shared by our speakers, you will also see both products in action - meeting the critical needs of development and operations teams.

Query your nginx/envoy/syslog logs easier and way faster with the new Grafana Loki pattern parser.

Loki 2.3 introduces the pattern parser. Patterns are way simpler to write than regex. As an added bonus, the pattern parser is an order of magnitude faster than the Loki regex parser. This means that you can now query way more semi-structured logs (nginx/envoy/syslog and more) in less time than before.
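
To give a feel for why patterns are easier to write, here is a rough sketch of what a pattern expression means, written as a translation into an equivalent regular expression. It only illustrates the `<name>` and `<_>` capture semantics; it is not how Loki implements the parser and says nothing about its speed. The function name and the sample log line are made up for illustration:

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a '<name> literal <_>' style pattern into an anchored regex."""
    pieces = []
    for part in re.split(r"(<\w+>)", pattern):
        if re.fullmatch(r"<\w+>", part):
            name = part[1:-1]
            # "<_>" skips a field; any other name becomes a named capture.
            pieces.append(".*?" if name == "_" else f"(?P<{name}>.*?)")
        else:
            # Everything between captures is matched literally.
            pieces.append(re.escape(part))
    return re.compile("^" + "".join(pieces) + "$")


line = '203.0.113.7 - - [10/Aug/2021:12:00:00 +0000] "GET /healthz HTTP/1.1" 200 612'
pattern = '<ip> - - <_> "<method> <uri> <_>" <status> <size>'
fields = pattern_to_regex(pattern).match(line).groupdict()
print(fields)
```

The one-line pattern stands in for a regex with five named groups and several escaped literals, which is the readability gain the teaser describes.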

Introducing the Honeycomb plugin for Grafana

Over the years, we’ve heard many versions of the same familiar story: large businesses struggling with observability data living in several different systems. At Grafana Labs, our “big tent” philosophy is based on the belief that our users should determine their own observability strategy and choose their own tools. Grafana allows them to bring together and understand all their data, no matter where it lives.

Hitchhiker's guide to Prometheus (Part 1)

Monitoring is a crucial component of modern distributed applications. It helps administrators stay up-to-date with the current state and performance of their applications, as well as with their application infrastructure and environments. Integrating a monitoring pipeline is a major requirement for cloud-native applications that run in complex and dynamic clusters with a lot of moving parts affecting application availability and performance.

10 Kubernetes Architecture Best Practices You Should Be Following

Looking to optimize your Kubernetes architecture? While the word “Kubernetes” translates to “helmsman” (i.e., someone who steers a ship), Kubernetes ultimately functions more like an orchestra conductor than a ship captain. Kubernetes (also known as K8s) simplifies the process of orchestrating containers for engineers. This frees engineering up to focus on innovation, reduce time-to-market, and optimize cloud spend.

PagerDuty Integration Spotlight: Teleport

Just-in-time System Access and Role Escalation. Teleport provides secure access for cloud applications and infrastructure that doesn’t get in the way. When implementing strict zero-trust rules you sometimes need to escalate and elevate privileges. By leveraging PagerDuty, you can be alerted to an access request and approve or deny it. Using PagerDuty’s schedule feature, you are able to dynamically assign administrative privileges based on who’s on call. This greatly reduces the scope of access. Teleport and PagerDuty together provide security best practices that are easy to enforce.

How Developers Can Benefit from Observability | IAmDevloper and Splunk's Mark Woods

DevOps teams have felt pressure from all sides to innovate faster and keep services reliable. The growing complexity of applications and cloud infrastructure creates more challenges for everyone, but the tools that developers and SRE teams require have been disconnected, keeping everyone from working as an efficient team. IAmDevloper and Splunk’s Chief Technical Advisor EMEA, Mark Woods, discuss how observability can help break down silos and promote agility.

ServiceNow partners with Intel, Volteo to provide EaaS, Edge to Service Solutions

Data is at the heart of every business today. But if we don’t understand the data or if the data isn’t accessible, it doesn’t do us much good. More important than the data itself is having the right data in the hands of the right people who can execute at the right time. This need has given rise to edge computing, which helps businesses glean insights from data by processing it as close to the source as possible.

How the Pandemic Impacted the Government's Cloud Migration Plans

“Cloud-first” has been a government imperative for many years, but the pandemic usurped this strategy, making “cloud-now” a priority. The results have been transformational. The cloud made wide-scale government telework possible, but it’s also given agencies the opportunity to test drive new cloud applications and experience the scalability and security benefits first-hand.

Automating Identity Lifecycle Management

The identification of every user making a request to a given system is vital to ensuring that action is only taken by, and information only returned to, those who need it. This happens in two steps: first, the requester is identified (authenticated), and then that identity is used to determine which parts of the application they are allowed to access.
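
The two steps described above can be sketched as follows. This is a toy illustration with an in-memory user table, not a production design; the user names, roles, resource paths, and function names are all hypothetical:

```python
import hashlib
import hmac
from typing import Optional

SALT = b"demo-salt"  # Hypothetical fixed salt; real systems use a random salt per user.


def hash_password(password: str) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), SALT, 100_000)


# Hypothetical stores: credentials with roles, and the roles each resource allows.
USERS = {
    "alice": {"hash": hash_password("s3cret"), "roles": {"admin"}},
    "bob": {"hash": hash_password("hunter2"), "roles": {"analyst"}},
}
ALLOWED_ROLES = {"/reports": {"admin", "analyst"}, "/users": {"admin"}}


def authenticate(username: str, password: str) -> Optional[str]:
    """Step 1: establish who the requester is; None means unknown or wrong password."""
    user = USERS.get(username)
    if user and hmac.compare_digest(user["hash"], hash_password(password)):
        return username
    return None


def authorize(identity: str, resource: str) -> bool:
    """Step 2: decide whether that identity may access the resource."""
    return bool(USERS[identity]["roles"] & ALLOWED_ROLES.get(resource, set()))


def handle_request(username: str, password: str, resource: str) -> int:
    """Return an HTTP-style status: 401 unauthenticated, 403 forbidden, 200 OK."""
    identity = authenticate(username, password)
    if identity is None:
        return 401
    return 200 if authorize(identity, resource) else 403


print(handle_request("bob", "hunter2", "/users"))  # prints 403
```

Keeping the two functions separate means the authorization rules can change without touching how identities are verified.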

Has the firefighting stopped? The effect of COVID-19 on on-call engineers

With digital becoming the primary channel for work, education, shopping, and entertainment in the last 18 months, it’s no surprise that workloads for technical teams and on-call engineers have increased. Data from PagerDuty’s inaugural platform insights report, The State of Digital Operations, highlights this reality. As of July 2021, the average number of events managed daily by PagerDuty is 37 million, with 61,000 of those being critical incidents.

SigNoz Community Call - August 2021

SigNoz is an open source alternative to DataDog and New Relic. In this community call, we discuss the technical architecture in detail and how data flows in the backend services. We also discuss steps on how we can make SigNoz more performant, including ways to benchmark performance at different loads. We hold a community call on the last or second-to-last Saturday of every month.

DevOps' Problem with Speed-to-Market Explained: IBM MQ, Multi-Middleware Role in Deploying New Applications & Updates

If your organization is frustrated with how long it takes to roll out new applications and updates, it is not alone. Speed-to-market is an obsession at many companies today (see call-out box below), so anything that restricts or slows it down is a problem.

Artificial Intelligence will be the commander of the future wars

Artificial intelligence is one of several hot technologies that have the potential to transform the face of combat in the coming years. The Joint Artificial Intelligence Center was established by the Department of Defense to win the artificial intelligence war. AI might enable autonomous systems to execute missions, achieve sensor fusion, automate activities, and make better, faster judgments than people, according to some visions. AI is quickly developing, and those objectives may be met shortly.

The Value of Hyper-local Risk Intelligence

Every enterprise has a unique risk profile. This is based on a wide range of factors including geographic disposition, sector, the scope of security and resiliency plans, organizational size and structure, supply chain, and much more. Without the right customized tools tailored for your organization in place, it’s challenging to stay ahead of threats and disruptions to your people, places, operations, and digital systems.

Logging Best Practices: Knowing What to Log

First of all, don’t ask this! Instead of asking what to log, we should start by asking “what questions do we want to answer?” Then, we can determine which data needs to be logged in order to best answer these questions. Once a question comes up, we can answer it using only the data and knowledge that we have on hand. In emergent situations such as an unforeseen system failure, we cannot change the system to log new data to answer questions about the current state of the system.

Deploy ASP.NET Core applications to Azure App Service

The ASP.NET Core framework provides cross-platform support for web development, giving you greater control over how you build and deploy your .NET applications. With the ability to run .NET applications on more platforms, you need to ensure that you have visibility into application performance, regardless of where your applications are hosted. In previous posts, we looked at instrumenting and monitoring a .NET application deployed via Docker and AWS Fargate.

Migration to Microservice Architecture: A guide

The software design is perhaps the most important aspect that directly influences the ability to scale up, workload performance, the availability of the software, and the longevity of the software itself. It is also important to understand that traditional monolithic designs are still usable and widely used to fulfil many everyday goals. However, now we have a different problem. With the rapid growth of digital services, virtualization services, and an increasing dependency on cloud-based services

New feature: Templates for Incident Management

At Spike.sh, we are obsessed with making incident management more accessible to dev teams everywhere. With this goal in mind, we are always looking for ways to reduce the friction while setting up the Spike.sh platform. When we saw customers asking our advice for creating effective on-call schedules and escalations, we knew we had to do more than just good documentation - we needed a way to share best practices with our customers in the product itself.

Grafana Tempo 1.1 released: New hedged requests reduce latency by 45%

Grafana Tempo 1.1 has been released, and like our major version suggests, there are no breaking changes. If you’d like, please check out the release notes. But if you find that release notes can sometimes be difficult to decode, fret not! All the highlights are below.
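
For readers unfamiliar with the hedged requests named in the headline, the general technique is: if a request has not answered within a short delay, send a second identical request and use whichever reply arrives first. The sketch below shows that general idea with `asyncio`; it is not Tempo's implementation, and the fake backend, delays, and function names are assumptions for illustration:

```python
import asyncio
from typing import Awaitable, Callable


async def hedged(make_request: Callable[[], Awaitable[str]], hedge_after: float) -> str:
    """Send one request; if it is still pending after hedge_after seconds, send a second."""
    first = asyncio.ensure_future(make_request())
    done, _ = await asyncio.wait({first}, timeout=hedge_after)
    if done:
        return first.result()  # The first attempt was fast enough; no hedge needed.
    second = asyncio.ensure_future(make_request())
    done, pending = await asyncio.wait(
        {first, second}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # Drop the slower attempt.
    return done.pop().result()


def make_fake_backend() -> Callable[[], Awaitable[str]]:
    """Hypothetical backend whose first call is slow and whose later calls are fast."""
    calls = 0

    async def request() -> str:
        nonlocal calls
        calls += 1
        attempt = calls
        await asyncio.sleep(0.5 if attempt == 1 else 0.01)
        return f"reply from attempt {attempt}"

    return request


if __name__ == "__main__":
    print(asyncio.run(hedged(make_fake_backend(), hedge_after=0.05)))
```

The trade-off is a small amount of extra load in exchange for cutting off the slow tail, which is why the hedge delay is usually set near a high latency percentile rather than at zero.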

Proactive Microsoft 365 & Microsoft Teams Service Delivery Monitoring for Enterprise IT & MSPs

Providing an effective service – especially in a world with constantly evolving needs – goes beyond standard operating hours. Imagine if a bank only kept your investments secure while it was ‘open’ during its hours of standard operation. Issues can arise at any time, and having effective service delivery monitoring and support for enterprise IT teams and managed service providers is critical.

DataOps: keeping the data flowing with Model-driven Operations

If you’ve ever lived DataOps, you’ll know that it’s a challenge at the best of times. A day in the life of a typical data engineering team involves securing, releasing, debugging and stabilising complex and oftentimes fragile data pipelines. These pipelines can involve many source applications and intermediaries, and troubleshooting them under management pressure when it’s all going wrong is stressful.

Improving Web Page Load Time

HTTP/2 (originally named HTTP/2.0) was a major revision of the HTTP network protocol used by the World Wide Web published in 2015. Indeed, those in the Citrix/EUC ecosystem may remember Marius Sandbu investigating the benefits of HTTP/2 for NetScaler, Microsoft IIS, and Storefront users back in 2015/6. HTTP/2 was the first new version of HTTP since HTTP/1.1, which itself was standardized in RFC 2068 in 1997.

What Is Cloud Scalability? 4 Benefits For Every Organization

One of the major benefits of choosing the cloud over on-premise architecture is the ability to easily and quickly scale — but what does scalability mean in cloud computing? If your business is in the process of growing, it’s important to know your technology options so you can make informed decisions on how to scale. In this article, we cover scalability in cloud computing and its benefits.

This Month in Datadog: August 2021 (Episode 4)

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. For the August 2021 episode, we take you behind the scenes to our NYC office, and sit down with two people from our Product leadership team.

Integrating Agile Requirements Designer with BlazeMeter

This video shows how you can use new functionality in Agile Requirements Designer 3.2 to integrate with scriptless functional testing in BlazeMeter. You can use test assets recorded in BlazeMeter to build test automation into your ARD flows, and you can export the generated paths in the model to test cases that you can run in BlazeMeter.

Applying Advanced vs. Basic Monitoring Techniques

Complex architectures, pressures to deploy faster, and demand for optimal performance have placed greater strain on monitoring teams and as a result, an increasing number are looking to implement more advanced monitoring techniques. Part of the initial challenge around this is understanding what advanced monitoring techniques actually are. In this article, I help clarify this by differentiating basic and advanced monitoring, with examples on how each would be applied to Postgres monitoring.

The "Perfect" Log Management Solution Is Invisible

It sounds like a wild claim, considering that billion dollar companies like Splunk, Datadog, New Relic, and Solarwinds are consistently making national headlines, for both good and bad reasons. Observability leaders are anything but invisible, so how can the perfect solution be different? Are they that far off?

Deploying your Gradle Build Cache Node using GCP

This tutorial is a follow-up to TurboCharging your Android Gradle builds using build cache. The key focus of this post is the remote build cache, a build speed acceleration technology that can be implemented for both local and CI builds. This is a technology worth knowing about because: Gradle provides a build cache node available as a Docker image. You can host this image in a number of ways.

Free ITSM webinar from IT expert Vawns Murphy | ITSM in the next normal

Watch this webinar on how to deliver effective services in the next normal, and learn about: The COVID-19 crisis completely changed our world. From an IT perspective, we’ve all had to change our ways of working, and will continue to evolve as everyone adjusts to the next normal. From a service desk and ITSM perspective, this will mean supporting colleagues in different ways, embracing new technology, and having plans in place for future lockdowns and supply-chain issues.

How To Begin The Digital Transformation Process In Marketing And Communications

The year 2020 proved that we need to “adapt” to the unknown, the “new normal” and the constant transformation of what would become a new state of living — one that meant being “displaced” and personally disconnected. If the global Covid-19 pandemic taught us a lesson, it was that we could incorporate new ways of living by self-isolating yet maintaining social integration via a digital presence.

3 Key Insights to Help You Build the Workplace for Today & Tomorrow

Everbridge sat down with two leading experts to discuss how innovative technologies are improving worker safety and operational functionality, and how firms can keep up. With such demanding times for the business world, it’s easy for companies to become fixated on survival, rather than thriving. But businesses that use unprecedented circumstances as a time to innovate and invest in new technology as well as rescoping the use case of their existing technologies, will emerge stronger than ever.

Balancing Healthcare Resilience with the Patient Experience

For healthcare systems, building resilience for the future is learned from adapting and responding to critical events and factoring in circumstances that are often unique to the communities they serve, such as the patient population, size of the hospital and/or community, and scope of services.

Legacy Vendors Beware: OpsRamp Aims to Transform Cloud Operations with New Self-Service Solution

The company is disrupting the age-old approach of selecting and deploying IT operations management software, which for years has required heavy proofs of concept and long buying cycles, with a solution that allows IT operators to sign up and begin monitoring cloud environments in a matter of minutes.

Podcast: Break Things on Purpose | Carmen Saenz, Senior DevOps Engineer at Apex Clearing

This week, Ana sits down with Carmen Saenz, Senior DevOps Engineer at Apex Clearing and PhD student at DePaul University in Chicago, to talk about her history in engineering. She brings to the table some anecdotes about her own time engineering chaos. Carmen goes into detail about the early days of chaos engineering and her work there, going from on-prem to the cloud, how she is always learning, her passion for teaching and more.

Indexes Matter-How Poor Index Management Can Ruin Query Performance

Ideally, database queries use the fewest possible resources: time, memory, bandwidth, etc. Lower resource consumption maps to better query performance. To find relevant data in a table, a database query relies on lookup operations, and a table index can help a query efficiently find the table values it needs. With an efficient, well-designed table index, a database query can find the table data it needs, avoiding the need to "scan"—or search through—all the table data.
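As a small illustration of the scan-versus-lookup difference described above, the sketch below uses Python's built-in `sqlite3` module to print the query plan for the same query before and after an index is created. The `orders` table, its columns, and the index name `idx_orders_customer` are made up for this example; other database engines expose similar information through their own `EXPLAIN` syntax.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(10_000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def query_plan(connection, sql):
    """Return SQLite's human-readable plan for `sql` as a single string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


# Without an index, SQLite has to scan every row of the table.
plan_without_index = query_plan(conn, QUERY)

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_with_index = query_plan(conn, QUERY)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the table while the second reports a search using the new index.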

What Is End-User Monitoring and Why It's Critical for Your Business

End-user experience monitoring is a practice designed to track user behavior or actions while interacting with a website or web application. The data gathered by end-user experience monitoring helps measure the impact of website and device performance on the end user’s journey. A meaningful end-user experience can help improve the enterprise’s operational efficiency, troubleshooting processes, employee productivity, and overall business value.

New Bucket Schema Option Can Protect You From Unwanted Schema Changes

One of the best things about getting started with InfluxDB over traditional relational databases is the fact that you don’t need to pre-define your schema in order to write data. This means you can create a bucket and write data in seconds, which can be pretty powerful to developers who care way more about the application they’re building than the mechanics of storing the data.

The evolution of Chaos Engineering and Litmus Chaos - Civo Online Meetup #12

Let's learn about Chaos Engineering! We'll be joined by Karthik Satchitanand, co-founder of Litmus Chaos to discuss why chaos testing is seen as a must for Cloud-Native practitioners in 2021, and how the introduction of LitmusChaos 2.0 evolves chaos engineering further. Civo's Saiyam Pathak will also be looking at chaos terminology and white paper run-through. Register now and don't forget to leave a question for the team - we'll answer the best ones on air.

How to Scale While Avoiding CI Pitfalls

Moving quickly while also maintaining high quality and availability is a balancing act that is hard to maintain but essential to the success of any software development lifecycle. Many teams today are using DevOps techniques to keep their development process fast without losing the quality and availability their users expect. CI/CD is at the center of this DevOps practice, serving as the essential bridge between your development and operations teams.

Hybrid Work Calls for Advanced Remote Management of Employee Devices

The most recent version of the Aternity Global Remote Work Productivity Tracker shows persistent remote work across all industries and an aggressive move to newer generation device platforms that are key to remote employee productivity. The integration between Aternity and Intel Endpoint Management Assistant enables IT to reduce device incident resolution times by half with one-click access to out-of-band remote management of Intel vPro platform devices, directly from the Aternity dashboard.

Datadog vs. New Relic vs. Scout

Application performance management is one of the essential steps that every business must complete to ensure that their products work as desired and give the best experience to the end-users. There are many tools for application management available in the market, but if you want to select the best one for your business, you would need to try out each tool one by one.

What Is DNS Blocking, and What Should You Know about DNS Security?

In the workplace, certain web pages can be a distraction for employee productivity—or worse, a disruption. If you’re a managed services provider (MSP), your customers may be interested in finding a way to control the types of websites their employees can access during the workday. One viable option for them to utilize is a DNS block to restrict access to certain web addresses on a given server. This article will help you understand what DNS block is, who uses it, and how it works.

When it comes to IT asset discovery, a snapshot won't suffice

How clear, complete, and colorful is your IT asset data? Is it like the picture below, where you see only a small portion of the bigger picture to answer important questions about your IT estate? Without color, you’re missing even more context. You need better IT asset discovery. Having a snapshot provides a limited view. It lacks context and is like managing your IT assets in silos. It’s cumbersome, inefficient, and ineffective.

What Will APM Look Like in the AIOps Era?

Historically, enterprise IT organizations have turned to application performance management (APM) systems to monitor and manage critical applications. However, throughout the world, enterprise organizations are suffering massive and systemic failures at an increasing rate. One of the main reasons these failures are increasing is that organizations aggressively seek to execute digital transformation initiatives.

What's new in Grafana Enterprise Metrics 1.5: Per-tenant usage metrics and a wildcard tenant for queries

We’re thrilled to announce the release of Grafana Enterprise Metrics (GEM) 1.5. While this release packs in a ton of enhancements and bug fixes, we’d like to dive into two particularly exciting features: per-tenant usage metrics and a wildcard tenant for queries.

Troubleshooting Cloud Services and Infrastructure with Log Analytics

Troubleshooting cloud services and infrastructure is an ongoing challenge for organizations of all sizes. As organizations adopt more cloud services and their cloud environments grow more complex, they naturally produce more telemetry data – including application, system and security logs that document all types of events. All cloud services and infrastructure components generate their own, distinct logs.

How to Determine Whether an Error is Really an Error

There is nothing worse than waking up to an angry customer complaining that your website is failing to accept their payment at checkout. This may be worrying for some since payments not being processed can be equivalent to losing money; however, with Tag Spotlight, this should be a relatively quick problem to dissect. The key question here is whether this is an issue that all our customers are facing or an isolated event.

Securing VMware Tanzu Mission Control with Access Policies

If you haven’t had a chance to check out VMware Tanzu Mission Control, you’re missing out on one of the greatest tools available to manage Kubernetes. However, it’s not just for managing one cluster; rather, it delivers “fleet-wide” management with a focus on policy. Policy management is powerful because it enables us to, for example, limit access to certain users or prohibit pulls from specific container registries.

The SRE Guide to Hyper-Resilient Hyperscale for Cloud-Native Applications

Enterprises are putting more focus on high availability for their cloud applications and services. Enterprise Observability is a foundational element of hyper-resiliency in the cloud. Learn why. SREs are paid to ensure that their DevOps procedures produce quality software and meet operational Service Level Objectives (SLOs) for cloud applications. It’s not easy; and as the popularity of containerized cloud-based microservices grows, the challenges increase. One solution is hyperscale.

Scanning Dependencies in your sources using JFrog CLI and Xray

Security vulnerabilities and license violations should be found as early as possible: the earlier in the SDLC, the better. As part of the “Shift Left” vision, JFrog CLI and Xray now allow scanning dependencies directly from sources, on-demand, using a simple command line. This functionality allows benefiting from the same JFrog Xray vulnerability and license scanning capabilities, even before deployment to JFrog Artifactory.

Interlink Software secures a place on Constellation Research's ShortList for Using Artificial Intelligence in IT Operations.

Interlink Software’s capabilities have been gaining a good deal of industry analyst recognition in recent months. We’re pleased to announce yet more recognition; Interlink has been included on Constellation Research ShortList™ for “Using Artificial Intelligence in IT Operations (AIOps)”, Q3 2021.

Microservices Without Observability Is Madness

As I said before, Speed is King. Business requirements for applications and architecture change all the time, driven by changes in customer needs, competition, and innovation and this only seems to be accelerating. Application developers must not be the blocker to business. We need business changes at the speed of life, not at the speed of software development.

Safety Experts Plan for Fall

Everbridge recently hosted a Safety Experts Plan for Fall webinar, with an expert panel comprised of Dr. Rashid Chotani (Chief Medical Director/Senior Scientist, IEM), Steven J. Healy (President and CEO, Margolis Healy), Marisa R. Randazzo, Ph.D. (CEO and Founder, SIGMA Threat Management Associates) and James Podlucky (Industry Solutions Manager, Everbridge). The panel was moderated by Dan Pascale, Executive VP, Margolis Healy.

How to Mitigate the Effects of Floods on Your Supply Chain

Floods may now be an unfortunate counterweight to the wildfires that have come to characterize summers worldwide. In 2021 alone, floods wreaked havoc in Western Europe, China’s Henan province, and Tennessee and North Carolina in the United States. Hundreds of lives were lost, property damage ran in the billions, and global supply chains were thrown into disarray.

Monitor Conviva with Datadog

Conviva is a platform that helps businesses gain real-time insight into the overall performance and playback quality of their streaming video content. With video streaming workflows, slow start-up times and playback errors can hinder user experience and ultimately drive customers away. With Conviva, you can view key Quality of Experience (QoE) metrics, including video playback failures, rebuffering ratios, and other business-critical data to help monitor and enhance your viewer experience.

Wind River and Intel Collaborate on Leading 5G vRAN Solution for FlexRAN

Wind River today announced that the companies are jointly developing a 5G vRAN solution that integrates Intel® FlexRAN reference software for systems fueled by 3rd Gen Intel® Xeon® Scalable processors with built-in AI acceleration, and also features Intel® Ethernet 800 Network Adapters and the Intel vRAN Dedicated Accelerator ACC100 in concert with Wind River Studio.

The More the Merrier: Multi-Arch Docker Manifests with Buildx and Artifactory

The cloud native promise to be able to “build once, deploy anywhere” is nearly fulfilled. With containerization and Docker, we can build our applications and services for any environment, and set configuration at runtime. Well,… almost. Operating systems and apps still need to be compiled to execute on specific architecture types. Your software that’s been compiled for an AMD64 processor can’t run on an ARM-based machine, nor can one built for Linux run on Windows.

Moving communications service providers beyond traditional telecom

Communications service providers (CSPs) have been at the forefront of the pandemic, helping customers and entire industries make critical transformations almost overnight. When education, healthcare delivery methods, and supply chains were interrupted, CSPs stepped up and provided digital work environments and automations that became lifelines. Now, CSPs are laying the foundation for even more.

Call Handling - Relieve the burden of your service desk and on-call staff

These days, I keep encountering inquiries from various customers on the topic of call handling. Due to the current transformation, triggered by the increased use of home offices, it is becoming more and more important to make on-call staff more accessible. Often the already overloaded service desk is used for this purpose. Of course, this leads to a) a deterioration in the quality of the service desk and b) delays between the receipt of the problem and the start of problem resolution.

Getting Started with C# and InfluxDB

This post was written by James Hickey. Scroll below for full bio and picture following this article. Time series databases (TSDBs) can transform the way you handle streams of data in real time or IoT applications. In this tutorial, you’ll learn how to set one up in a C# application. Relational databases have their place. They’re great at things like data normalization, avoiding duplication, indexing over specific data points (like columns), and handling atomic changes to the schema.

The 7 Undeniable Benefits of Implementing Automated Alerting

What is the ultimate alerting strategy to make sure your alerts are meaningful and not just noise? Production monitoring is critical for your application’s success, we know that. But how can you be sure that the right information is getting to the right people? Automating the monitoring process can only be effective when actionable information gets to the right person. The answer is automated alerting.

Discovery to Monitoring, Automatic & On Your Terms

So you have this great discovery and auditing tool called Open-AudIT and you also have an amazing monitoring tool called NMIS. How can you automatically take your discovered devices and have NMIS monitor them…and why would you want to? With version 4.2.0 of Open-AudIT, we have re-implemented Integrations in an extremely easy-to-use yet extremely configurable way.

What's new: Updates to Event Intelligence, mobile, and more!

As we near the end of the Summer season, we’re excited to announce a new set of updates and enhancements to the PagerDuty platform. These updates will help our users and customers: Make sure to view the latest PagerDuty Pulse or learn more from our community team and developer advocates who have launched new programs to help you learn more about our latest products and best practices.

What is Load Testing? Processes, Types, Best Practices, Tools, and More

Any software development project will almost certainly have gone through several tests by the time it is finished, especially in an Agile testing environment where testing and development occur simultaneously. But, no matter how many tests you've conducted, there's really only one way to tell if your software can handle the actual demands your army of end-users will be throwing at it once it's nearly finished. It's known as load testing.

MySQL queries - faster than light (almost)

At the moment I’m working on a tool for migrating Icinga 2 IDO history to Icinga DB. Sure, one could also run IDO and Icinga DB in parallel for one year and then switch to Icinga DB if they only care for the history of the past year. But the disadvantage is: one would have to wait one year. Nowadays (in our quickly changing world) that’s quite a long time.

SigNoz - Open-source alternative to DataDog

More and more companies are now shifting to a cloud-native & microservices based architecture. Having an application monitoring tool is critical in this world because you can’t just log into a machine and figure out what’s going wrong. We have spent years learning about application monitoring & observability. What are the key features an observability tool should have to enable fast resolution of issues?

What Are Spot Instances? And When Should You Use Them?

EC2 Spot Instances, according to Amazon, can potentially save you up to 90% of what you’d otherwise spend on On-Demand Instances. While it’s been proven that even Reserved EC2 instances are cheaper than their On-Demand counterparts, it turns out Spot Instances are additionally capable of pushing their discounts beyond the reach of the Reserved Instances.

How PagerDuty and Rundeck Drive Operational Maturity at Trimble

As presented at PagerDuty Summit 2021 by Ali Soheili and Andrea Valenti of Trimble. Trimble PPM is a division of Trimble that provides construction project management solutions. PPM continues to roll out new SaaS and cloud-based offerings to their customers. To support their digital transformation, they are also modernizing their IT operations. They're invested in both PagerDuty and Rundeck to automate their incident response processes. Learn how PPM is shortening their incident resolution times while also reducing escalations.

Catchpoint Latest Release: Hercules

All of us here at Catchpoint are passionate about continuously innovating and improving our product to make our customers’ lives better. Part of this process involves regular product releases, and this latest one, Hercules, is no exception. A big focus area for this release has been improving the usability, quality, and performance of the Catchpoint portal and agent.

Why Teams Performance Monitoring is More Than Just Measuring Uptime

When your organization’s very ability to share, collaborate and meet depends on Microsoft Teams performing, solely watching Teams service availability simply isn’t enough. Employees operate on the assumption that every bit of technology they rely on just works. Teams is the modern example of the most critical application to businesses today, with over 145 million daily active users.

Understand your services with Cloud Logging

What do you do when you know your service is having an issue? In this episode of Engineering for Reliability, we’ll show how you can use Cloud Logging to ingest, route, store, and view logs from your services and use them to fully understand application issues. Watch to learn how you can find issues faster, make your services more reliable, and keep your users happy.

The NetOps Expert - Episode 3: DX NetOps 21.2 Delivers Industry-Leading Network Monitoring Scale

In this episode of The NetOps Expert, Broadcom’s Robert Kettles and Jeremy Rossbach discuss the global pandemic, the evolution of networking, the impact on our current customers’ network operations, and how these events have shed new light on the need for advanced network monitoring scale.

Welcoming Scope Creep | An IT Journey to Monitoring Glory: Session 2

Now that your Network Management System is up and running, where do you go from here? “Scope creep” doesn’t (necessarily) have to be a bad phrase. Extending your monitoring out beyond the initial intention isn’t just encouraged, it’s commendable. Having all your business-critical information in one place speeds up troubleshooting and allows you to get in front of issues before they turn into problems.

VMware Tanzu Basic Edition: A Technical Overview in 5 Minutes

VMware Tanzu Basic edition delivers all necessary components required to have a production-ready Kubernetes cluster running on vSphere. This includes lifecycle management of Kubernetes clusters, networking integrations for all communication, and a container registry to store, secure, and manage all your container images. In this video, we give you a high-level look at everything to get you started.

Intel Endpoint Management Assistant and Aternity Integration

Together, Intel and Aternity are streamlining employee experience monitoring and remote device management with remote out-of-band management of Intel vPro® platform devices directly from Aternity that helps the Service Desk reduce incident resolution time by half. Intel and Aternity are collaborating to help you predict and prevent user frustration, verify optimal device configuration and performance, and drive successful business outcomes through rich cross-customer analytics and modern, built for business PCs.

Automate your LogDNA + PagerDuty Incident Workflow

LogDNA integrates with your PagerDuty instance to help trigger incidents based on log data coming in from your ingestion sources. This allows your teams to quickly understand when there are issues with your application, and where in the logs you can investigate to understand root cause. To help further accelerate your team’s ability to understand the state of your applications, we are introducing the ability to automatically resolve those PagerDuty Incidents directly from LogDNA.

Elastic and Cmd join forces to help you take command of your cloud workloads

We are excited to announce that Elastic is joining forces with Cmd to accelerate our efforts in Cloud security - specifically in cloud workload runtime security. By integrating the capabilities of Cmd's expertise and product into Elastic Security, we will enable customers to detect, prevent, and respond to attacks on their cloud workloads.

Happy birthday - 30 Years of Linux

Thirty years ago today, Linus Torvalds announced his free operating system to the world. As with many of the world’s greatest, Linux had humble beginnings as a very small pet project. The GNU Project was working on an ambitious free, public domain operating system, but that effort had been delayed, and enthusiasts were quick to adopt Linus’ new project.

How to Monitor Your AWS Workloads

AWS is a comprehensive platform with over 200 types of cloud services available globally. As organizations adopt these services, monitoring their performance can seem overwhelming. The majority of AWS workloads behind the scenes are dependent on a core set of services: EC2 (the compute service), EBS (block storage), and ELB (load balancing).

Build an Automation-First Security Culture

When you think about how to automate processes within the IT industry, your mind probably goes first to tools. After all, the past decade has witnessed an explosion of tools from across the industry that promise to make it easy to automate virtually every aspect of IT operations — from low-code development solutions that automate coding, to release automation tools for applications, to automated monitoring and security platforms.

Transform your Data Center with Confidence | Joint Webinar by CloudFabrix and Verge.io

Verge.io is partnering with CloudFabrix, a leader in artificial intelligence for IT operations, to chat about why software-defined everything is the way to go. This is a great opportunity to learn how to transform your current data center operations using the latest technology and intelligence. Here’s what we’ll cover: – How artificial intelligence and data center virtualization operating systems work together to change the thinking around traditional data centers.

ServiceNow Alternatives (FAQs 2021)

There’s no shortage of ServiceNow alternatives on the market for your business to explore. As service-focused organizations look for ways to optimize workflows, leveraging a sophisticated workflow automation solution is a strategic move. This could be the difference between good and exceptional customer experiences. ServiceNow isn’t the only answer to automation.

Is it All About the Brand? Yes and No!

Working for a company, you always want to think that your product or service is superior to the competition. As a Marketing executive, a major part of my job is to highlight why Ribbon is a leader in delivering IP Optical and Cloud & Edge solutions to the largest service providers, enterprises and critical infrastructure companies all over the world. But how do our customers perceive us? What do they really think of the Ribbon brand? How do we fare against our competitors in customers’ minds?

Challenges and Opportunities of Going Serverless in 2021

While we know the many benefits of going serverless – reduced costs via pay-per-use pricing models, less operational burden/overhead, instant scalability, increased automation – the challenges of going serverless are often not addressed as comprehensively. The understandable concerns over migrating can stop any architectural decisions and actions being made for fear of getting it wrong and not having the right resources.

Upgrading: Building for the Future

At the heart of RapidSpike is a development team who are passionate about pushing their abilities, learning new technologies and ultimately driving our software and product forward. We’ve researched and developed many cutting-edge features over the last 6 years but we’re shifting our focus this year. We’re taking a step back from new feature development in favour of upgrading, exploring and applying new technologies to existing functionality.

MTBF Is an Integral Part of Business Operations - Here's Why

In today’s fast-paced digital world, your customers expect your services to be available 24 hours a day, seven days a week. If your services are unreliable, these customers will likely take their business elsewhere — and spread the word. To retain their business, you must understand and optimize your service and system health to ensure your services are reliable. Gauging your service and system health requires much more than knowing whether they’re on or off.
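For readers unfamiliar with the metric in the headline, MTBF (mean time between failures) is commonly computed as total operating time divided by the number of failures in that period. The sketch below works through one example in Python; the 30-day window, the six hours of downtime, and the three failures are invented numbers, and the function name `mtbf_hours` is hypothetical.

```python
def mtbf_hours(operating_hours: float, failures: int) -> float:
    """Mean time between failures: operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("MTBF is undefined without at least one failure")
    return operating_hours / failures


# Invented example: a 30-day window (720 hours) with 3 failures
# that together caused 6 hours of downtime.
window_hours = 30 * 24
downtime_hours = 6
failure_count = 3

# Only the time the service was actually running counts as operating time.
operating_hours = window_hours - downtime_hours
example_mtbf = mtbf_hours(operating_hours, failure_count)

print(f"MTBF: {example_mtbf} hours")
```

A single number like this says how often failures happen, not how long they last, which is one reason it is usually read alongside other health metrics rather than on its own.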

New 'Pod Status and Logs' Dash Saves Time and Unifies Execution

Time is invaluable. Besides being one of those can’t-argue-with universal truths, this is also one of the guiding principles behind Komodor; the promise behind our ‘troubleshooting efficiently and independently’ motto. ‘Pods Status and Logs’ is the latest of our timesaving features that enables you to quickly drill down in the pods of an unhealthy service, all from the comfort of your Komodor dashboard.

How to Troubleshoot Kubernetes with Confidence - 2021 Cloud-Native Days Summit

We recently attended the 2021 Cloud-Native Days Summit, where our co-founding CEO Ben Ofiri gave a lightning round talk on How to Troubleshoot Kubernetes With Confidence. In case you missed it, here’s a recording and transcript for your convenience.

How Does a Digital Experience Score Optimize the Workplace

User experience is subjective. For example, asking tourists visiting New York City about their experiences gives different answers. Likewise, end-users who work remotely with different resources and disparate assets can have varied experiences with their business applications. How can IT teams gather this experience data and react faster to improve experience? The answer is Digital Experience Scores.

Basic SQL Server Query Tuning Secrets Every SQL Admin Should Know

The performance of your applications is a complex, multi-layered puzzle. Performance can be negatively impacted at the application layer or even by remote calls to networked services. However, the most common bottleneck for applications is the data storage layer. The most common data storage tier for applications is a relational database, whose performance can vary widely depending on query optimization.

Adding a Developer Portal to FluxCD

The idea of fully managing applications and infrastructure using a Git-based workflow, or GitOps, has been gaining a lot of traction recently. We are seeing more and more Shipa users adopting GitOps as the cloud-native deployment methodology. While it is no secret that ArgoCD and FluxCD are by far the most used tools today, we see FluxCD users trying to address the challenges below.

Fargate vs ECS - Comparing Amazon's container management services

Kubernetes and containerization of applications bring many benefits to software development, enabling speed, agility, and flexibility. The maturation of the Kubernetes ecosystem accelerated quickly in the last few years, leaving users with a multitude of choices when it comes to Kubernetes tooling and services. The major cloud providers (AWS, Azure, and Google Cloud) have introduced services specifically to help users run their Kubernetes applications more efficiently and effectively.

Rails + observIQ; Chapter 1: Log management at the core of Rails application development

Logging is useful in building, managing and debugging Rails applications. Most logging functionalities are built into the application, and it is fairly simple to find the logs. However, as your applications scale up in volume, it becomes difficult to trace the source of an issue. That’s when you want to implement a cloud-based log management system to get a unified view of all logs from your Rails application.

Observability and Cyber Resiliency - What Do You Need To Know?

Observability is one of the biggest trends in technology today. The ability to know everything, understand your system, and analyze the performance of disparate components in tandem is something that has been embraced by enterprises and start-ups alike. What additional considerations need to be made when factoring in cyber resiliency? A weekly review of the headlines reveals a slew of news covering data breaches, insider threats, or ransomware.

It's Time to Get Hip to the SBOM

The DevOps, IT security and IT governance communities will remember 2021 as the year when the Software Bill of Materials, or SBOM, graduated from a “nice to have” to a “must have.” Around for years, the SBOM has now become a critical DevSecOps piece, which everyone must thoroughly understand and incorporate into their SDLC (Software Development Lifecycle).

Deploy Puppet Enterprise agents with HashiCorp Terraform on Azure VMs

HashiCorp Terraform is an open source Infrastructure as Code (IaC) tool that is widely used to deploy cloud infrastructure in the public cloud, such as AWS and Azure, along with on-premises VMware vSphere environments. One of the challenges is developing a method for bootstrapping the instances with configuration management agents such as the Puppet Enterprise agent.

Six months in: How the SaaS that was built in 7 days is going

A few weeks before I sat down to write this article, I reshared my two month review of OnlineOrNot around the internet. Surprisingly, the article was quite popular. So I thought I'd clear up some confusion for the folks who only just read my two month review: I started OnlineOrNot on February 25, 2021, shipped the first version for people to use on March 2, 2021, and here I am in August writing the six month review.

Kubernetes monitoring with Sysdig

Kubernetes has multiple moving pieces that you need to monitor, such as the elements that make up the Control Plane. As your clusters grow, collecting metrics from all the Kubernetes sources becomes highly tedious. Comprehensive monitoring for Kubernetes reduces the operational complexity by providing the visibility you need to: Sysdig Monitor offers an out-of-the-box user experience for monitoring your Kubernetes environment, including pre-built dashboards and a comprehensive alerts library that you can use right away.

API Gateway with Gloo Edge Overview

Watch Kamesh Sampath (Field Engineer, Solo.io) discuss Gloo Edge, Gateway API and API management on Kubernetes. Gloo Edge is a cloud-native API Gateway and Ingress Controller built on Envoy Proxy to facilitate and secure application traffic at the edge. Here we cover the management and deployment of your API gateway with Gloo, and share how to set up and manage your APIs in less than 3 minutes!

How to Discover Devices and Connections With Engineer's Toolset

Learn how you can easily discover MAC addresses within your local network and match them to IP addresses with the MAC Address Discovery tool from SolarWinds Engineer's Toolset. SolarWinds® Engineer’s Toolset (ETS) helps you monitor and troubleshoot your network with the most trusted tools in network management. Version 11.0 now comes with an intuitive web console for 5 of the most popular tools - Response Time Monitor, Interface Monitor, CPU Monitor, Memory Monitor, and TraceRoute.

Message to Software Developers in the Payments / Financial Services Industry

OverOps CEO Rod Squires records a special message for software developers in the payments / financial services industry. OverOps instantly pinpoints at runtime why critical issues break backend Java and .NET applications, eliminating the detective-work of searching logs to reproduce critical errors.

Best practices to help retailers make the grade for the holiday season

It’s hard to believe we’re already talking about the return to school, but it’s set to be a big one. In fact, this year promises to be the biggest in the last five years. The National Retail Federation expects back-to-school spending to reach $37.1B, up from $33.9B last year. Back-to-college spending is also expected to rise, reaching $71B this year. This increase is buoyed by parents and students gearing up for their first in-person classes after a year of virtual learning.

AppDynamics vs. Splunk vs. Scout | Key Features Compared

Application Performance Monitoring is a crucial necessity for modern businesses. No application can survive without a proper monitoring system. There are way too many things that can go wrong, so you must put your best foot forward in terms of choosing a monitoring system that is effective and economical at the same time. This guide aims to help you decide between three top application performance monitoring tools in the market - AppDynamics, Splunk, and Scout.

Tame the Alert Storm

In the past, troubleshooting an IT service issue could be quite simple. For example, an application disruption could often be isolated to a physical server or small group of servers that neatly fit into the domain of a single team that managed the company’s servers. However, with the dynamic landscape in modern IT environments, this is very rarely the case. Over time, you accumulate IT systems, which usually means you deploy tools to manage them.

What Is Network Traffic Analysis? A Helpful Walkthrough

Network traffic analysis is the method of collecting, storing, and analyzing traffic across your network. Traffic data is collected in or near real time so you can have up-to-the-second information about what’s happening. This allows you to take action immediately if a problem arises. You can also store this data for historical analysis.

How we're supporting the success of our community and customers with our recent funding rounds

This morning, we announced that Grafana Labs has raised $220 million in Series C funding. As with our previous rounds in 2019 and 2020, this funding will enable us to focus on accelerating the development of our open source observability platform and supporting the success of our community and our customers.

The HAProxy APIs

The HAProxy load balancer provides a set of APIs for configuring it programmatically. Although many people enjoy the simplicity of configuring their HAProxy load balancer by directly editing its configuration file, /etc/haproxy/haproxy.cfg, others want a way to do it without logging into the server. Or, they want a way that integrates with bespoke software. For example, they want to add pools of servers to the load balancer programmatically as a part of their CI/CD deployment pipeline.

Deployment Choices for eG Enterprise

You have two choices when deploying eG Enterprise: Wherever you choose to locate your eG Manager, eG Enterprise does not and will never collect data from your systems. There is never a data feed going from eG Manager to any outside system unless specifically configured by the customer and we do not incorporate any dubious call-home technologies. Before installing eG Enterprise, you will also need to consider the factors discussed in the Where to locate the eG Manager?

Self-Compassion Instead of Self-Blame

The tech industry is competitive and not without challenges. People are always growing and improving by pushing their limits. Innovation comes in many forms. In order to foster a healthy culture while allowing people to flourish, organizations must carefully enact policies. Growth should be encouraged while discouraging competition and comparison. One of the core policies organizations implement to achieve these goals is blamelessness.

The Best Things Come in Content Packs: Synthetic Managing and Third-Party APM

We recently announced the new Splunk App for Content Packs, your single source for all the goodness that is content packs. This new app makes it easier than ever to get started with Splunk for IT use cases. Individual content packs come with prepackaged content and out-of-the-box searches and dashboards, helping streamline workflows and ensuring you get the most out of your usage with Splunk IT Service Intelligence (ITSI) and Splunk IT Essentials Work (ITE Work).

Making Your Service Desk the GOAT

Imagine every time you visited the service desk to submit a ticket, you were greeted by the boisterous drums from the medley of the Olympics—you’d feel like a gold medalist! Though a theme song may seem far-fetched, rolling out white glove treatment can elevate the service desk in the eyes of your users. Let’s explore how you can land a 10 out of 10 in service delivery by embracing more user-centric service management strategies.

What Is Network Traffic and How Do You Monitor It?

“Network traffic” is a term that describes the influx and outflux of network packets within an organization’s network. Understanding and monitoring this traffic is an important step in protecting an organization’s health. This blog post discusses what network traffic is, the different types, and how you can monitor it.

The Auvik Guide to Basic Switch Configuration

Though I’m a big believer in the importance of network configuration management, even I’ll admit that network switch configuration from a command-line interface (CLI) is still one of the most fundamental network engineering skills you can have. It’s also one of the harder things to pick up if you’re new to the field. And it’s not just knowing what commands to enter. It isn’t always obvious what should be part of a basic configuration.

Top 5 API Testing Tools

Today’s software testing trends show the growing demand for more efficient and automation-oriented API testing. Many of the current test automation solutions focus on the UI, while most API-level testing is still done manually. As a result, testers are in need of easy-to-use, intelligent automation tools for testing APIs — improving their productivity and efficiency, while also reducing time-to-market.

Australian banking outlook: Compliance culture depends on innovation

Banking institutions are risk engines. Customers know this. Bankers know this. Financial regulators know this. Although no one likes to be turned down for a loan, risk due diligence is actually a good thing. The global financial crisis of 2007-2008 proved that risky practices of the past were not good for the economy, for society, or for confidence in the sector as a whole. The challenge is finding the balance to meet responsible lending criteria while ensuring consistent customer satisfaction.

The Rise of Distributed Tracing with OpenTelemetry

In the past few years, distributed tracing has emerged in the global DevOps consciousness as an indispensable tool in the microservices arsenal. In April 2019, the open source observability community rose to the challenge, uniting the energies that were previously divided between OpenTracing (a vendor-agnostic API to help developers instrument tracing into their code base) and OpenCensus (an open-source project that emits metrics and traces in application code) into a new project called OpenTelemetry.

The Syslog Staying Power

Some classics never go out of style, like a good pair of boat shoes or cowboy boots, depending on where you live. In the logging world, syslog is this classic. For more than 30 years, the syslog protocol has been a standard for logging. When we talk to users about what type of logs they collect and how they send them to SolarWinds® Papertrail™, syslog always comes up. “Our application logs and server system logs are sent to Papertrail.

How to Choose a Database for Serverless Applications

People can easily be misled by the term “serverless.” A serverless application doesn’t mean that the application doesn’t need any server to run. When you’re developing an application with the serverless approach in mind, you have two options for databases: host it yourself or go with a serverless database. The first route entails you having to maintain the servers (on-premises or in the cloud), taking care of securing the database, backing it up, scaling, etc.

21 basic computer security tips

The Internet is a tool that, without a doubt, brings many benefits to the daily life of our society, like instant communication and easy access to information, among many others. But it also has negative aspects, and one of the big ones is cyberattacks. That is why today we give you 21 computer security tips for beginners!

An Introduction to Anomaly Detection

In the early 1900s, Sakichi Toyoda invented a loom that automatically stops when the thread breaks, limiting the need for someone to watch the machine constantly. This approach was later named “Jidoka” and became one of the two pillars of the TPS (Toyota Production System) with just-in-time production representing the second pillar.

Who is Driving Enterprise-wide Innovation, Transformation, Cloud Strategy and More?

Today’s decision-making is different than even a few years ago. More “data” is used, and the data inputs take several forms, including humans. A big part of today’s strategy and decision-making at enterprise-class organizations is the committee, made up of a company’s subject matter experts and relevant stakeholders for a critical company initiative.

Featured Post

Maintaining Internet Privacy for Your Business

Every time you use the internet, online service providers collect some data about you and your business. Internet privacy isn't just a concern of large corporations. According to statistics, up to 43% of cyber-attacks target small businesses. Furthermore, only 14% of these vulnerable organizations have effective mitigation measures in place. As digital data grows, cybersecurity should become your highest priority.

Best practices for collecting and managing serverless logs with Datadog

Logs are an essential part of an effective monitoring strategy, as they provide granular information about activity that occurs anywhere in your system. In serverless environments, however, you have no access to the infrastructure that supports your applications, so you must rely entirely on logs from individual AWS services when troubleshooting performance issues.

Introducing the Spike.sh Alert Reliability Engine

At Spike.sh, our mission is to help dev teams understand and resolve production issues faster. At the core of this is our Alert Reliability Engine, whose job is to make sure that a team member always gets an alert on their preferred channel. Currently, we support 7 channels - phone call, SMS, mobile push notifications, email, Slack, Microsoft Teams and Discord. We wanted to give you a peek into how we achieve high deliverability across these channels.

Transforming Employee Experience with Freshservice Virtual Agent- A Freshworks Story

Employee expectations of the workplace are rapidly evolving, and consumer-like experiences quickly become a benchmark to measure internal support teams. A research report from Harvard Business Review Analytics Services and Freshservice found that 82% of those surveyed say employee happiness is impacted by how well workplace technology performs.

ServiceNow a Leader in Gartner Magic Quadrant for CRM Customer Engagement Center

I’m thrilled to announce that Gartner has named ServiceNow a Leader in its 2021 Magic Quadrant for the CRM Customer Engagement Center for the second year in a row. We believe this is external validation of our strategy and vision to improve the end-to-end customer experience through our Customer Service Management solution. Our mission in Customer Workflows is to transform the customer service and support market for the better.

What Is Honeycomb's ROI? Forrester's Study on the Benefits of Observability

Register for the webinar and download the full study to see and apply Forrester’s financial model to determine the observability ROI for your organization. Many teams want to adopt observability and Honeycomb—but run into budget roadblocks because budget holders may not clearly understand the quantifiable benefits to their end users, their teams, and the bottom line.

12 SaaS Metrics Every SaaS Company Should Be Monitoring

The market for Software-as-a-Service (SaaS) continues to grow exponentially. But more SaaS companies are reporting lower margins these days, despite historically enjoying 60-90% margins on average. Thin margins limit a company's ability to expand in the future. They also weaken the company's market valuation and, over time, they can chip away at the company's gross margins. These scenarios are often not a result of companies failing to make enough money to reap healthy margins.

How we use the k6 load-testing tool for developing Grafana

On the last day of GrafanaCONline in June, our CEO Raj Dutt announced that Grafana Labs had acquired k6, the company behind the open source load-testing tool. In fact, our relationship with k6 had started more than two years earlier. At the beginning of 2019, we were working on replacing Grafana’s “remember me” cookie solution with a short-lived token solution for the Grafana 6.0 release.

Mobile Vitals - Four Metrics Every Mobile Developer Should Care About

Slow apps frustrate users, which leads to bad reviews or customers who swipe left to the competition. Unfortunately, seeing and solving performance issues can be difficult and time-consuming. Most developers use profilers within IDEs like Android Studio or Xcode to hunt for bottlenecks and automated performance tests to catch performance regressions in their code during development. However, testing an application before it ships is not enough.

8 Risks You Need To Mitigate During Cloud Migration

Migrating workloads to the cloud can be tricky. In fact, a study Virtana conducted earlier this year found that 72% of respondents had to move applications back on-premises after migrating them to the public cloud because they ran into a variety of problems. Clearly, organizations need to address these showstoppers.

How to use Kibana time shifts, advanced formulas, and dynamic colors

Ad hoc analysis capabilities in Kibana enable you to visualize your time series data easily and intuitively. In this video, learn how to use time shifts, advanced formulas, and dynamic colors in Kibana to examine data over different time periods, author your own metrics to use in visualizations, and highlight important values in tables.

How Alert Notifications Make Incident Response More Effective

HR people have a saying: right person, right place, right time, meaning that the right resources can make all the difference when it counts. The same goes for incident management and response, where very often the wrong person, place, or time can contribute to a mounting catastrophe. As systems grow, the right person really can make the difference during an outage simply due to command or knowledge of the system.

Securing Serverless Applications with Critical Logging

We’ve seen time and again how serverless architecture can benefit your application: graceful scaling, cost efficiency, and a fast production time are just some of the things you think of when talking about serverless. But what about serverless security? What do I need to do to ensure my application is not prone to attacks? One of the many companies that do serverless security, Protego, came up with an analogy I really like.

How MBTA modernized incident response to reduce alert fatigue and improve collaboration

Citizens utilize mobile and consumer-facing applications in everyday life, so it’s no surprise that they demand seamless access and high availability of government services online. Whether it’s making payments or applying for benefits, citizens and constituents alike expect these services to be available around the clock.

Transforming the Gaming Industry with AI Analytics

In 2020, the gaming market generated over 177 billion dollars, marking an astounding 23% growth from 2019. While it may be incredible how much revenue the industry generates, what’s more impressive is the massive amount of data generated by today’s games. There are more than 2 billion gamers globally, generating over 50 terabytes of data each day.

Node.js Security and Observability using Lightrun & Snyk

As developers, we spend a lot of time in our IDEs writing new code, refactoring code, adding tests, fixing bugs and more. And in recent years, IDEs have become powerful tools, helping us developers with anything from interacting with HTTP requests to generally boosting our productivity. So you have to ask — what if we could also prevent security issues in our code before we ship it?

Looking ahead to general availability of Collapsed Reply Threads

We appreciate all the incredible feedback the Mattermost community has provided about Collapsed Reply Threads since launching in beta in Mattermost Cloud and Self-Managed v5.37 and later. We are working as quickly as possible towards resolving known issues and then promoting this feature to be generally available.