May 2021

Costa Rican educational org realizes improved visibility into its IT infrastructure, enhanced IT health and performance

The Ministry of Public Education (MEP) serving Costa Rica is an educational organization with 80,000 employees. MEP is the technical and administrative body responsible for the accreditation, supervision, auditing, inspection, and control of private schools, beginning with preschool. The problems faced: MEP is a government agency supported by five data centers and 181 servers.

Understanding Azure Logic Apps Resource Types

In recent years, businesses have been innovating and migrating at a faster pace by using Microsoft Azure cloud-native technologies. Azure Integration Services, an industry-leading Integration Platform as a Service (iPaaS), provides multiple offerings like Logic Apps, Service Bus, API Management, Event Grid, Azure Functions & Data Factory to meet application integration scenarios. Azure Logic Apps is gaining strong interest, with more than 40,000 customers using it.

Comparing The Private Connectivity Offerings Of AWS, Google Cloud & Microsoft Azure

AWS, Google Cloud and Microsoft Azure accounted for an estimated 58% of total cloud spend in Q1 2021. Businesses are considering ways to improve their connectivity to these three leading hyperscale providers - and are increasingly turning to private connectivity. In this blog, we take a look at the private connectivity offerings of AWS, Google Cloud and Microsoft Azure.

Packet Loss Testing and Reducing Guide + Recommended Tools

If you’ve ever encountered a slow file download or a frozen/lagging video, you’ve experienced packet loss. Under certain circumstances, these might be minor inconveniences, but packet loss on a larger scale can be financially detrimental to businesses. Fortunately, there are steps you can take to diagnose and reduce packet loss.

Top 3 NLP Use Cases for ITSM

What is NLP? Natural Language Processing (NLP) is a specialized subdomain of machine learning concerned with interactions between humans and machines through spoken or written human language. NLP helps process huge volumes of text that would otherwise take a human a significant amount of time to comprehend and process. As a result, many organizations use NLP to gain useful insights from their text and free-form data.

Statuspal vs

If you’re wasting precious time and resources responding to customer emails and phone calls whenever an incident arises - now sounds like a good time to set up your very own status page! A Statuspal status page is a professional and effective way to communicate incidents and maintenance updates internally for your team or publicly to your customers. You would be right in thinking that there are many options available and one of the largest operators in the market is

Microsoft Silverlight End-of-Support - Ten Steps to Take Now!

The planned date of October 12, 2021 for Microsoft Silverlight End-of-Support is rapidly approaching. This isn’t new news. It was originally discussed in a Microsoft blog on moving to HTML 5 premium media in July 2015. Currently, the only browsers that continue to run Silverlight are Internet Explorer (IE) 10 and 11.

The 7 SRE Principles [And How to Put Them Into Practice]

Whether you're just adopting SRE or optimizing your current processes, we can help. We’ll explain the 7 key principles of SRE and how to put them into practice. So, what are the SRE principles? The fundamental SRE principles are: SRE is a method that operates through principles. Instead of prescribing specific solutions, it guides you with best practices. These SRE principles help organizations decide what's best for them. Once you understand the principles, you can apply them in many areas.

Feedback - From Slack to Discord - 13 months later

This post is our third one sharing our real-world experience using Discord for more than one year. I think it will be useful to any company interested in the pros and cons of using Discord over Slack. At Qovery, we are a remote-first software company. When we decided to move from Slack to Discord 13 months ago, we were only 3 developers on the team.

Incident response: how to keep tech problems from becoming people problems

When one of your IT services is on fire there’s no time to waste. Especially if that fire is blocking your users from getting stuff done. Rapid resolution tends to eclipse all else during an incident, often causing your team to ignore or forget pieces of the incident response process – like keeping people in the loop.

Announcing support for Oracle Arm-based Ampere A1 instances

Arm processors have long been at the center of mobile computing, powering billions of smartphones, tablets, smartwatches, and other IoT devices. Today, these processors are beginning to see broader adoption in the cloud as they promise better performance, higher energy efficiency, and lower costs than their x86-based predecessors. Just this week, Oracle announced its new Oracle Cloud Infrastructure Ampere A1 Compute platform, built on the Ampere Altra Arm processor.

Five worthy reads: Distributed cloud is the future of cloud computing

Five worthy reads is a regular column on five noteworthy items we’ve discovered while researching trending and timeless topics. Distributed cloud allows organizations to bring cloud computing closer to their location. This week we look at why it’s the future of cloud computing.

Serverless Stonks checker for Wall Street Bets: week 3 activity report insights

A few weeks ago we posted the “How we built a serverless Stonks checker API for Wall Street Bets” article. And ever since, we’ve seen quite a lot of volume in the Stonks checker app. In this follow-up article, we will show you some interesting findings around the API. Over the past three weeks, we have seen a good amount of usage of the API we set up. You can see that there was a nice spike soon after the story broke.

How to run ECS Anywhere workloads using Ubuntu on any infrastructure

ECS Anywhere allows you to use Amazon Web Services’ container service outside of the AWS cloud, and Canonical is proud to be a launch partner for this service. Using Ubuntu as the base OS for your ECS clusters on-prem or elsewhere will allow you to benefit from Ubuntu’s world-leading hardware support, professional services, and vast ecosystem, in turn allowing your ECS clusters to run with optimal performance everywhere you need it.

Triggering automated tests with Jira and Xray - Demo Den - May 2021

Sérgio Freire, Head of Solution Architecture and Testing Advocacy for Xray, shows how to trigger test automation from Xray Cloud using GitLab and report the results back to Xray in a Test Plan. You can easily replicate this process for other frameworks including Jenkins, Robot Framework and Bamboo.

How important is middleware monitoring for organizations?

As an organization grows, its infrastructure and IT landscape expand with it. The large number of dependencies and tasks running at any moment needs careful monitoring. However, the bigger the organization, the more complex and difficult it gets to monitor the transitions and communication. Without smooth transactions and a well-functioning operations team, day-to-day business would face many hurdles.

Working from home: Remembering an unforgettable year

March 11, 2020. Does that date ring a bell? For many, it won’t soon be forgotten. It’s the day when many companies across the globe closed their doors to keep workers safe from the pandemic. ServiceNow employees were told, “Be safe and go home.” This marked the official beginning of our work-from-home experience, one that turned our work lives—and many personal lives—upside down.

AWS Graviton: What Is It And How Does It Compare To Graviton2?

When Amazon Web Services (AWS) launched its new Arm-based processors, some circles believed it was a gamechanger for the public cloud markets. To begin with, it was the first time Arm architecture would roll out for enterprise-grade utility, and at a colossal scale. Arm processors had only run on smaller, less demanding devices such as iPhones. So why adopt it for much more demanding workloads in cloud services?

The Confident Commit | Episode 3: Taming infrastructure with HashiCorp's Armon Dadgar

CircleCI CTO and host of The Confident Commit podcast Rob Zuber is joined by HashiCorp co-founder and CTO Armon Dadgar for a conversation about the inspiration of HashiCorp, infrastructure challenges and opportunities, and the future of security. Listen along for the inside story of HashiCorp's origins and early days, as well as keen insights for managing infrastructure and ways to better deliver software to infrastructure environments from two of tech's top leaders.

What Happens When I Execute a Query?

To many developers and system administrators—and even to some database administrators—database engines are a black box. They’re complex pieces of software that, in some cases, even have their own operating systems—the database engine manages its own memory, reads and writes to disks, and handles numerous other system functions. In this post, you’ll learn about a specific feature of database engines—query optimization.

How to alert on high cardinality data with Grafana Loki

Amnon is a Software Engineer at ScyllaDB. Amnon has 15 years of experience in software development of large-scale systems. Previously he worked at Convergin, which was acquired by Oracle. Amnon holds a BA and MSc in Computer Science from the Technion-Machon Technologi Le' Israel and an MBA from Tel Aviv University. Many products that report internal metrics live in the gap between reporting too little and reporting too much.

Easily Automate Across Your AWS Environments with Splunk Phantom

When running Splunk Phantom with AWS services, it can be tricky to make sure Splunk Phantom has the right access. When you’re managing multiple AWS accounts, the effort to configure Splunk Phantom’s access to every account can feel insurmountable. Fortunately, Amazon has the Security Token Service to solve this problem with temporary credentials, so we’ve integrated it with Splunk Phantom!

Deliver Real-Time Alerts From Facility Management Systems

Facility managers, including service technicians, are expected to operate their facilities safely to meet the expectations of customers. They focus on the smooth functioning and maintenance of many components that fall within the scope of their facility. Typical components include roads, pavements, HVAC and plumbing systems. As a facility manager, staying on top of these siloed and geographically dispersed systems can be challenging.

Synthetic Monitoring of Amazon WorkSpaces

Amazon WorkSpaces enables you to provision virtual, cloud-based Microsoft Windows or Amazon Linux desktops for users. WorkSpaces eliminates the need to procure and deploy hardware or install complex software. You can quickly add or remove users as your needs change. Users can access their virtual desktops from multiple devices or web browsers.

Announcing Harvester Beta Availability

It has been five months since we announced project Harvester, open source hyperconverged infrastructure (HCI) software built using Kubernetes. Since then, we’ve received a lot of feedback from the early adopters. This feedback has encouraged us and helped in shaping Harvester’s roadmap. Today, I am excited to announce the Harvester v0.2.0 release, along with the Beta availability of the project!

Polystream Stopped Implementing Technology and Implemented A Mindset - Customer Stories

Polystream is changing the way we think about video game streaming, and with xMatters, they know that incidents won't keep them from achieving their goals. In this customer chat, join Tracey McGarrigan, Chief Marketing Officer at Polystream, Cheryl Razzell, VP of Engineering, and xMatters' own Laura Meadows, VP EMEA Region, as they discuss Polystream's ongoing ambitions and how xMatters helps their growth. Plus, make sure you don't miss why Tracey would describe xMatters as Polystream's comfort blanket!

Introducing ManageEngine Academy, a thought leadership content hub for IT leaders

ManageEngine, which started out small a couple of decades ago, now solves the IT management problems of millions of customers worldwide by providing complete, simple solutions. The story of our growth is one that we’ll always be proud of. But this story is built on years of learning, unlearning, and refining our processes. The stories of our internal struggles have made our success possible, and made it taste a lot sweeter.

Announcing support for Amazon ECS Anywhere

Amazon Elastic Container Service (ECS) is a managed compute platform for containers that was designed to be simple to configure, with opinionated defaults to help users get started quickly. ECS customers can run containerized workloads on either Amazon EC2 instances or the serverless Fargate platform without having to maintain a control plane—and can easily integrate ECS with other AWS resources, like Network Load Balancers, to architect their infrastructure.

Ways AI is Driving More Efficient Application Performance Monitoring

In the digital age, the speed and performance of apps and websites have a huge impact on the customer experience. To ensure a high level of quality, teams rely on Application Performance Monitoring (APM), the process of tracking the performance and availability of software systems. Let’s look at what Application Performance Monitoring is, how AI and machine learning are being applied to stay ahead of the competition, and several real-world use cases.

Resolving Issues Caused By the May 6th Neustar UltraDNS Outage - A True Partnership Experience

At Catchpoint, our award-winning support team aims to be a partner, not just a gateway to the tool. Earlier this month when UltraDNS, a major DNS provider, went down, they found themselves faced with nine support tickets within one hour. Our customers were experiencing outages on their websites and online services. They needed urgent help from Catchpoint in understanding what was causing the disruption, so they could quickly resolve the situation.

What is server testing, and why should you do it?

Whether you are running a website, a SaaS app, or something else, you need to ensure that your digital properties deliver the best possible performance. Factors such as server speed or storage capacity impact performance, which is why server testing is so important. Server testing will give you a clear idea of your app or site's performance and what you can do to make it run even better. This article will take a closer look at server tests.

Four KPIs for Measuring Hybrid Worker Productivity

Beginning in September, Google employees will split their time at work, spending at least three days a week in the office and the remainder of the work week working from home. Microsoft and Ford have also signaled their intention to implement hybrid working models. Other, smaller businesses are sure to follow these examples. Spoiler alert: the future of work is hybrid.

Use to Instrument Kubernetes with OpenTelemetry & Helm

is always looking to improve the user experience when it comes to Kubernetes and monitoring your K8s architecture. We’ve taken another step with that, adding OpenTelemetry instrumentation with Helm charts. We have made Helm charts available before, previously with editions suitable for Metricbeat and for Prometheus operators.

Here's What Software Errors Could be Costing your Business

Software errors are annoying – they are troublesome for IT departments and affect many company processes. A software error is essentially a mismatch between what is expected of the program and the produced output. Sometimes these software errors could have negligible impact, while on other occasions, they could wreak absolute havoc, especially for industries like banking, healthcare, airlines, and stock markets.

How to Leverage IT Automation and Cloud To Put Customers First

In the face of unexpected crises or disruptions, maintaining business continuity has become more important than ever. Last year, businesses around the world had to shift to a remote workforce model overnight. Were their IT departments prepared for this massive shift?

From Ticket-Time to Real-Time: Changing the Status Quo of Operations Work

2020 Was...Rough. Keeping a Digital Business running has never been an easy job, especially over the last year. 2020 forced many businesses to accelerate their digital transformation initiatives faster than anyone imagined! Customers are demanding more capacity and reliability, the business is releasing more new services - faster than ever before, and companies are learning to use new remote working models, straining systems and people.

The four best features to look out for in SQL Monitor

I’m a Data Architect and I’ve been working with data and databases for years at companies like LA Fitness, Dell and now Kingston Technology in Fountain Valley, California. Over all of that time, I’ve used SQL Monitor. I loved it from the beginning and the latest updates to the global overview dashboard and other features have stepped it up another few notches.

Signed Pipelines Build Trust in your Software Supply Chain

Trust isn’t given, it’s earned. As the Russian proverb advises, Доверяй, но проверяй — or as U.S. President Ronald Reagan liked to repeat, “Trust, but verify.” We designed JFrog Pipelines to securely support a large number of teams, applications, users and thousands of pipelines.

Improve Business Productivity

In the past year, Microsoft Teams has become one of the top videoconferencing and telecommunications platforms that have helped keep businesses productive throughout this global pandemic. Pivoting to remote/hybrid work environments in the long term is more easily achieved with a service like Microsoft Teams, which makes ensuring and maintaining optimal service on your end that much more important. Behold the key to help improve business productivity and optimize your Teams performance.

4 ways to digitally transform the customer experience

In an age when customers voice their dissatisfaction by ranting on social media or quietly taking their business to a competitor, it’s more important than ever to keep them engaged with—and coming back to—your brand. At ServiceNow, we’re committed to helping you workflow a better customer experience. Learn practical, attainable ways to engage your customers through these four webinars.

Understanding Load Balancing Essentials

In this post we’ll review some of the essential ideas in Load Balancing to help you understand how to get the best configuration for your application. Load balancing is an essential part of any application deployment to provide high availability, performance and security. We’ll focus on understanding and selecting scheduling and persistence algorithms and using the new LoadMaster Network Telemetry feature to validate the results.

What is Real User Monitoring?

Choosing the appropriate tools and approaches to utilize for application performance management can quickly become confusing. That's why it's important to remember that the ultimate goal of monitoring is to figure out two things: And there may be no better beginning point than incorporating real-user monitoring (RUM) using a performance monitoring solution to get as close as feasible to meet both objectives.

Why choose the Connect license for SquaredUp SCOM dashboards?

Every enterprise has its arsenal of tech tools to tackle an array of different challenges. If you’re using SquaredUp for SCOM, you already have the best dashboard for SCOM data – but what if you could track all the metrics from all your tools on a single dashboard? Typically, to get the full picture of your infrastructure, you need the right toolset to connect to Web API, a SQL database, or through other means.

End-User Monitoring for IT Operations Monitoring

I’ll be the first to admit one of my weaknesses is public speaking. I spend hours before a training session, online seminar, or live event rehearsing exactly what I want to say and how I want to say it. But all my time spent practicing an engaging presentation only mildly prepares me for the moment I’m in front of others and it’s my time to speak.

Turn your home office into a NOC room with Philips Hue and Grafana

I recently got a couple of Philips Hue Play lights to spice up my home office setup, and after a bit of tinkering with the APIs, I decided it would be a fun project to create my own personal NOC room, using them to visualize the status of some system I’m monitoring.

Analyze your logs easier with log field analytics

We know that developers or operators troubleshooting applications and systems have a lot of data to sort through while getting to the root cause of issues. Often there are fields like error response codes that are critical for finding answers and resolving those issues. Today, we’re proud to announce log field analytics in Cloud Logging, a new way to search, filter and understand the structure of your logs so you can find answers faster and easier than ever before.

Top 10 PromQL examples for monitoring Kubernetes

In this article, you will find 10 practical Prometheus query examples for monitoring your Kubernetes cluster. So you are just getting started with Prometheus, and are figuring out how to write PromQL queries. At Sysdig, we’ve got you covered! A while ago, we created a PromQL getting started guide. Now we’ll jump in skipping the theory, directly with some PromQL examples.

How Can Companies Benefit from Observability? | Splunk's Spiros Xanthos & influencer Jo Peterson

Observability – what is it? Until now, the tools IT and DevOps teams have relied on to monitor and manage applications have often been disconnected. With a massive shift to cloud infrastructure, organizations are now wrestling with operational complexity. Leadership must look to solutions that break down silos and offer real-time insights and visibility to decrease time troubleshooting.

Why Does My Database Need Indexes?

Have you ever deployed a new application that ran fine at first, then slowed to a crawl as more and more data was added? Or tried to run a report that took minutes or even hours to come back? Database performance is a frequent bottleneck for many applications, and in this post you’ll learn about a critical aspect of database performance: indexes.

Why Elasticsearch is an indispensable component of the Adyen stack

At Adyen, we use Elasticsearch to power various parts of our payments platform. This includes payment search, monitoring, and log search. Let’s take a look at how we use Elastic for these different use cases and see how we capitalize on the power of Elasticsearch. We recently did a talk about some of our Elasticsearch adventures at an Elastic meetup. You can find a recording here.

Debunking 3 Website Availability Monitoring Myths

Some myths in life are harmless, or even helpful. For example, Santa Claus has come in very, very handy for parents who want to nudge their kids from the naughty list to the nice one. And let’s give a round of applause to the Tooth Fairy, whose promise of nominal financial compensation has turned the prospect of losing a tooth from a meltdown trigger into a motivational factor.

Securing containers on Amazon ECS Anywhere

Amazon Elastic Container Service (ECS) Anywhere enables you to simply run containers in whatever location makes the most sense for your business – including on-premises. Security is a key concern for organizations shifting to the cloud. Sysdig has validated our Secure DevOps platform with ECS Anywhere, giving AWS customers the security and visibility needed to run containers confidently on the new deployment model.

Using LogDNA To Troubleshoot In Production

In 1947, a moth found its way to a relay of the Mark II computer in the Computation Laboratory where Grace Hopper was employed. Since that time, software engineers and operations specialists have been plagued by “bugs.” In the age of DevOps, we can catch many bugs before they escape into a production environment. Still, occasionally they do, and they can spawn all kinds of unexpected problems when they do.

Using LogDNA and your Logs to QA and Stage

An organization’s logging platform is a critical infrastructure component. Its purpose is to provide comprehensive and relevant information about the system, to specific parties, while it's running or when it's being built. For example, developers would require detailed and accurate logs when building and implementing services locally or in remote environments so that they can test new features.

Using LogDNA to Debug in Development

Developing scalable and reliable applications is a serious business. It requires precision, accuracy, effective teamwork, and convenient tooling. During the software construction phase, developers employ numerous techniques to debug and resolve issues within their programs. One of these techniques is to leverage monitoring and logging libraries to discover how the application behaves in edge cases or under load.

The painful simplicity of context propagation in Go

Context propagation is fundamental to distributed tracing and modern observability. We're going to deep dive into how Context management works in OpenTelemetry, using Go as an example. I love programming in Go, and I appreciate the dedication to simplicity and readability. But sometimes "we fear magic" can drift into "we fear cameras will steal our souls." Is the explicit way that Go handles context propagation actually *too* simple?

Reducing flaky test failures

Testing is vital because it helps you discover bugs before you release software, enabling you to deliver a high-quality product to your customers. Sometimes, though, tests are flaky and unreliable. Tests may be unreliable because of newly-written code or external factors. These flaky tests, also known as flappers, fail to produce accurate and consistent results. If your tests are flaky, they cannot help you find (and fix) all your bugs, which negatively impacts user experience.

Debugging Azure Functions Locally

Azure Functions are great for running bits of processing on a trigger without having to worry about hosting. Recently, I needed to debug an Azure Function—I needed to hunt down a particularly evasive bug that wasn’t showing up in the unit and integration tests. As it turns out, debugging an Azure Function isn’t as trivial as simply running the debugger in Visual Studio. Instead, it requires some setup to replicate the environment and configuration typically available in Azure.

3 Reasons Manufacturers Across Asia Pacific & Japan Are Turning to Modern Apps

Manufacturing is more important than ever as governments, businesses, and individuals rely on the industry to drive innovation and economic prosperity through employment and exports, producing both essential and non-essential products that enhance our daily lives.

FireHydrant May 2021 Product Updates: The summer of integrations

With 50% of the US adult population vaccinated, there’s a lot to look forward to this summer. Life no longer feels like it’s on hold, and we’re fully embracing that. Get your fire hoses ready, 'cause extinguishing incidents just got easier. We’re rolling out a summer full of new integrations, product releases, events, and more.

Why You Should Switch To A Modern Cloud-Based ITSM Solution.

Is my service desk investment paying off? Is my service desk delivering true business value? Are we incurring costs that are not under control? Are we able to leverage the service desk beyond IT use cases? If your current ITSM solution makes you question its true value and reliability, are you even using the right ITSM tool? We don’t think so! Let’s get one thing straight. We don’t advocate a one-size-fits-all approach.

Rubrik's Customer Trust Portal Puts Transparency First

At Rubrik, we’re committed to delivering the best data management capabilities seamlessly to our customers and partners. Customer Trust was one of the first functions we focused on building, because Rubrik’s customer-first philosophy is founded on the core pillars of transparency and security. Our goal is to deliver the same, seamless service Rubrik expects from its own suppliers and vendors in every customer and partner interaction.

Exoprise Delivers Resilient Digital Experience and Microsoft 365 Visibility to BCD Travel

Founded in 2005 and headquartered in the Netherlands, BCD Travel is a provider of global corporate travel management with offices in more than 100 countries. The company simplifies the complexity of business travel and drives savings for travel and procurement partners. The company's IT department has hundreds of employees worldwide with expertise in managing and supporting infrastructure across North America, Europe, and Asia.

Maintaining Legal Policies While Employees Are Working at Home

Working remotely is a growing trend in the current work environment, where employees can sign in from anywhere. While some companies allow remote work only occasionally, others have completely adopted this working model, especially following the recent coronavirus pandemic. Regardless, businesses should develop legal work-from-home policies that streamline this new working method.

Secure, Simple and Scalable Video Conferencing with Jitsi

The situation with COVID-19 affects not only the private lives of people and their families but also businesses, ruling out face-to-face communication. Many individuals and companies are forced to adapt to working and communicating remotely. As a result, video conferencing is in high demand. One of the key players in this market is Zoom video conferencing software. However, there are multiple claims regarding Zoom security, confidentiality and data privacy.

The future of Prometheus remote write

At PromCon last month, Tom Wilkie, Grafana Labs VP of Product, described the origin and purpose of Prometheus remote write and previewed exciting developments on the road map. “We covered our efforts to standardize remote write, document how it works and why it works that way, and then test implementations,” Wilkie said. “In the next release or two of Prometheus, we’ll improve how we send metadata via remote write and start sending exemplars.

The top three insights from the 2021 State of Database DevOps report

Last year was a year of unprecedented challenges for everyone in every part of the world and every industry, and it was also a year of big changes in the IT sector. The pandemic underscored the role of the IT department as an enabler and a critical part of the transition to remote working. While digitalization was well underway before 2020, no one could have predicted the acceleration the pandemic brought on.

No going back: COVID-19 is catalyst for digital transformation

As the world begins to emerge from the COVID-19 pandemic, more and more business leaders are focused on preparing for future crises. This is driving a workflow revolution as companies strive to stay both agile and resilient. Against that backdrop, ServiceNow Chief Innovation Officer Dave Wright facilitated a panel of experts from the public sector, healthcare, and technology industries.

Interlink recognized as a Representative Vendor in Gartner Market Guide for AIOps Platforms 2021

Gartner Market Guide for AIOps Platforms, Pankaj Prasad, Padraig Byrne, Josh Chessman, April 6, 2021. Gartner Market Guides are used extensively by end-users building out their vendor shortlists for I&O leaders focused on Infrastructure, Operations and Cloud Management initiatives.

Best Practices to Simplify the Management of Multi-Tenant EKS, AKS, or GKE Clusters

Without a strategy in place, it will introduce a handful of challenges. Platform teams will be unable to do the following: As you’re defining policies for multi-tenant AKS, EKS, or GKE clusters, consider these tips: To help you get started on the right track, we created this cheatsheet for multi-tenancy success.

Announcing the Industry's First Private Distribution Network

Today, at our DevOps user conference swampUP, we were thrilled to announce a new groundbreaking innovation from JFrog: The industry’s first Private Distribution Network! Private Distribution Network (PDN) enables enterprises to easily set up and manage a secure, massively-scalable, hybrid distribution network for software updates.

What's New from JFrog: Binary Lifecycle Management at Scale

JFrog’s annual swampUP DevOps conference always brings new, exciting features to further our vision of accelerating releases through liquid software. This year was no exception, as JFrog CTO Yoav Landman and CPO Dror Bereznitsky revealed innovations for the JFrog DevOps Platform that enable end-to-end binary lifecycle management. Enterprise DevOps and large-scale modern application delivery require robust management of binaries, which are the building blocks of applications.

opEvents Prevents Event Storms During A Snowstorm

I dropped into a quarterly business review that one of the Account Managers was doing with one of our customers last week. I like to do this from time to time to hear it for myself directly from the customer. It helps me understand the customers and gives me an opportunity to discuss our platform post sale and integration.

Cloud Economics 101: Here's What You Need To Know

Businesses are increasingly interested in the economics of cloud computing. For instance, what are the financial implications of moving to the cloud versus staying on-premises? And what’s the best strategy for optimizing cloud consumption to get the best value from cloud resources? This article takes a look at some of the key concepts of cloud economics, and how your business can leverage cloud cost intelligence to maximize the value of your investment.

Adding IaC security scans to your CI pipeline with Indeni

With CircleCI, there are many different CI/CD flows that can be automated. One such flow is the use of Infrastructure-as-Code (IaC) to build cloud environments. For example, you can use CircleCI to automate the process of building Terraform plans and applying them, in order to create massive production setups in AWS, Azure, GCP, and other cloud environments.

Four things to consider when evaluating incident management platforms

When you’re feeling the stress and pain around incidents, making the decision to find an incident management tool is a no-brainer. But how do you choose the one that will work for you, your team, and your business? You might be asking yourself: Where do I start? What do I need to know? What questions do I ask? What are the options? How can I be sure we’re choosing the right tool?

How to Structure Your IT Service Desk to Support Process Improvements

Buying a modern service desk tool won’t solve all your problems in and of itself. Although these tools are designed to ensure general best practices are met, service desk tools must also be configured to address your organization’s specific essential protocols. In a recent Info-Tech study that surveyed 623 organizations worldwide, the most frequently adopted service management processes are.

The Industry's First Private Distribution Network

Private Distribution Network (PDN) enables enterprises to easily set up and manage a secure, massively scalable, hybrid distribution network for software updates. This new innovative technology accelerates software distribution 40X to speed up deployments and concurrent downloads across large-scale environments spanning hybrid infrastructure, edges, and IoT devices. PDN provides two integrated network utilization and acceleration technologies - HTTP-based, secure P2P, and CDN - that can be rolled out across large-scale mixed-infrastructure and multi-tiered, customizable network topologies, and are managed as-a-service with usage-based pricing.

6 Common Pitfalls of AWS Lambda with Kinesis Trigger

This article was written for the Dashbird blog by Maciej Radzikowski, who builds serverless AWS solutions and shares his knowledge. Kinesis Data Streams are the solution for real-time streaming and analytics at scale. As we learned last November, AWS themselves use it internally to keep, well, AWS working. Kinesis works very well with AWS Lambda.

5 Challenges in Chatbot Optimization and Maintenance

Chatbots can be like double-edged swords. They can either boost your customer service or turn customers away. Hence, you must make sure that you research and prepare properly before committing to them. This way, you will know how to optimize and maintain your chatbots to ensure their effectiveness. There are many chatbot benefits for business. In fact, 78% of businesses have started integrating such technology into their customer service in the past months.

Integrating AppSignal With Microsoft Teams

We’re constantly looking for interesting integrations for our performance incidents, exception incidents, anomaly detection and uptime monitoring notifications, and our latest addition is an integration for Microsoft Teams. Microsoft Teams is a hub for team collaboration in Microsoft 365. It integrates people, content, and tools your team needs to be more engaged and effective.

Can you prevent death, taxes and outages? (spoiler alert: not exactly)

An MSP may have no control over some outages, such as Internet or cloud service outages. However, it is important that its customers’ expectations are managed. When there is an outage, how an MSP communicates with its customers can have an impact on customer retention. Customers understand there will be outages. What they need to know is what the impact is and when systems will be back to normal.

Using AWS Timestream for System Health Monitoring

Amazon Web Services (AWS) introduced a preview of Timestream in November 2018 before releasing the full version in October 2020. AWS Timestream is a time series database that can process trillions of events daily. It is faster and less costly than relational databases offered by AWS for processing time-series information. In this article, we will look at what Timestream can do compared to some other AWS databases, and how to use Timestream to help monitor the health of your system.

How to achieve acceptance testing through abstraction

Beaker is a Puppet testing harness focused on acceptance testing via interactions between multiple (virtual) machines. It provides platform abstraction between different Systems Under Test (SUTs), and it can also be used as a virtual machine provisioner: setting up machines, running any commands on those machines, and then exiting. Recently, Vox Pupuli, a collective of Puppet community authors, has taken over responsibility for the care and feeding of Beaker for its continued widespread community use.

Argo Rollouts, the Kubernetes Progressive Delivery Controller, Reaches 1.0 Milestone

Argo Rollouts, part of the Argo project, recently released its 1.0 version. You can see the changelog and more details on the GitHub release page. If you are not familiar with Argo Rollouts, it is a Kubernetes Controller that deploys applications on your cluster. It replaces the default rolling-update strategy of Kubernetes with more advanced deployment methods such as blue/green and canary deployments.

The Anatomy of Three Incidents | Randy Shoup on 99 Percent Visible

The best response to a system outage is not "What did you do?", but "What did we learn?" This session walks through three system-wide outages at Google, at Stitch Fix, and at WeWork. In all cases, many things went right and a few went wrong, and after the blameless postmortems, we ended up learning a lot and making substantial improvements in our systems.

Be ready for anything in a world of digital everything

PagerDuty is a digital operations management platform that empowers the right action, when seconds matter. With over 500 integrations and powerful automation capabilities, we make it easy to stay on top of urgent, mission-critical work and keep your digital services always on. For the developers and IT teams working in real-time operations, PagerDuty makes sure you can focus on what matters most. And stay ready for what’s next.

Monitoring and Tuning Open-Source Databases

By continuously running a well-built general-purpose database performance monitoring facility, organizations can gain constant visibility into the availability and responsiveness of their databases and database management systems (DBMSs). When such a tool is equipped with analytics to compare historical metrics against current values, administrators can immediately understand how current values and behaviors stack up against prior averages and typical baselines.

Do You Know Where Your Cloud Is? Understanding Shadow IT

The public cloud has greatly increased the flexibility of businesses everywhere. Need another petabyte of storage? You’re but a few mouse clicks or a couple lines of code away from allocating all those disks with effectively no lead time. At the same time, it makes it easy for business units, a functional organization, or a disgruntled vice president with a corporate card—who may be frustrated with your IT for various reasons.

King & Wood Mallesons CISO relies on Elastic to "spot and identify" security threats

King & Wood Mallesons (KWM) is among the world’s most innovative law firms and is represented by 2,400 lawyers in 28 locations across the globe. The international law firm, based in Australia, helps clients flourish in Asian markets by helping them understand and navigate local challenges and by delivering solutions that provide clients with a competitive advantage.

How to use Cloud Logging to detect security breaches

If your system's security has been breached, what can you do to stop this attack and not make the situation worse? In this episode of Cloud Security Basics, we show how you can use Cloud Operations Suite to check for security breaches. Watch to learn some best practices when dealing with and handling malicious attacks!

Why the role of the CIO is constantly changing and challenging

Back in the day, the role of the CIO was relatively clear: the focus was on deploying, managing, and maintaining IT systems across the organization. The CIO’s responsibilities started to blur when end-users became more tech-savvy, around the millennium. The reasoning was that ‘they can now get their own technology and don’t need IT to do it for them’. This even led to the much-repeated “death of the CIO” meme.

Crazy Like a Fox: Redis as Your Primary Database

Redis is fast. It’s fast because the data is all in memory. Persistence options are limited. Because of this, many people say, “Redis is for transient data only!” However, sometimes the need for speed and ease of operations can outweigh the durability downsides! In this talk, we look at a real SaaS business using Redis as its (only) datastore. You’ll learn why we decided to go all-in on Redis and the challenges we faced. You’ll learn how we operationalized the setup, handle backups and restores, and how we’ll scale out. Are we making a terrible mistake? You be the judge!

Automate your IT routine with OpManager's Workflow feature

Performing day-to-day IT tasks can be demanding—not because all tasks are challenging to carry out, but because of the repetitive nature of many tasks. A high number of mundane, repetitive tasks impacts productivity. Over time, these repetitive, no-brainer tasks can even eat away so much valuable time that it effectively halts your organization’s growth.

Geek Pride Day: SolarWinds Head Geeks reflect on what it means to be a geek

The Oxford English Dictionary defines a geek as a person who is boring, wears clothes that are not fashionable, and doesn't know how to behave in social situations. Talk about negative stereotyping! Being a geek is about so much more than (a lack of) fashion, being dull, and an alleged social ineptitude. Being a geek is something inherent, and as some of our Head Geeks will let you know, is saved for those who are truly passionate about the nichest of hobbies and interests.

Types of Cryptography Attacks

Cryptography is an essential act of hiding information in transit to ensure that only the receiver can view it. IT experts achieve this by encoding information before sending out and decoding it on the receiver's end. Using an algorithm, IT experts can encrypt information using either symmetric or asymmetric encryption. However, like any other computer system, attackers can launch attacks on cryptosystems.

Experience Elasticsearch from the Microsoft Azure portal

We are excited to share the latest development in our ongoing partnership with Microsoft. Available in public preview, you can now find, deploy, and manage Elasticsearch from within the Azure portal. Bring powerful enterprise search, observability, and security capabilities to your Azure environment with a user interface and tools that are already familiar to you.

Epsagon's support for AWS Lambda Extensions improves developer efficiency

Epsagon now supports AWS Lambda Extensions. Lambda Extensions enable users to integrate complex tools without complex installation. This greatly simplifies the integration of Epsagon in Lambda environments, thus reducing operational overheads. Our layers are publicly available for Python and Node.js Lambda runtimes. With just a few Environment Variables, the layer will automatically add tracing to your functions.

5 Strategies for Safeguarding your Kubernetes Security

Since Google first introduced Kubernetes, it’s become one of the most popular DevOps platforms on the market. Unfortunately, increasingly widespread usage has made Kubernetes a growing target for hackers. To illustrate the scale of the problem, a StackRox report found that over 90% of respondents had experienced some form of security breach in 2020. These breaches were due primarily to poorly implemented Kubernetes security.

Monitor Your InfluxDB Open Source Instances with InfluxDB Cloud

Everyone says the cloud is the future. Sure, but try telling that to someone who has terabytes of sensitive data stored in an on-prem InfluxDB Open Source (OSS) instance, and they will bring up a whole set of reasons why it doesn’t make sense for them to move into the cloud right now. There are also some use cases which make more sense for on-prem software deployments.

Hybrid Work Productivity: A Decision-Making Guide

The brisk pace of COVID-19 vaccination is changing how companies think about the future workplace. A recent Cushman & Wakefield survey of 40,000 employers revealed a murkier picture than three months ago, with the spectrum now including two flavors of hybrid work: “remote-first hybrid” and “office-first hybrid.” Some companies have set aggressive return-to-office deadlines.

Finding the Bug in the Haystack: Correlating Exceptions with Deployments

You’re called in. The system is misbehaving. It could be a key metric going crazy, or exceptions starting to fire. You’re troubleshooting, beating around the bush, just to realize that one of the team’s deployments was the one messing things up. Sound familiar? If you’re practicing continuous deployment, you probably experience that several times a week, if not more. Users report that 50% of their outages are due to infrastructure and code changes, namely deployments.

What's new in Sysdig - May 2021

Welcome to another monthly update on what’s new from Sysdig. Eid Mubarak! Our team continues to work hard to bring great new features to all of our customers, automatically and for free! Most importantly, of course, was our recent funding round! I won’t repeat all the details as you can read more about what it means here. However, we are super excited about all the new feature improvements we can fund and bring to our customers!

What do site reliability engineers do?

Are you considering adopting SRE? We will explain the roles and responsibilities of an SRE team within your organization, and how to start building one. So what does an SRE team do? An SRE team is responsible for building software that improves the resiliency of systems, implementing fixes, responding to incidents, and automating processes whenever possible. Site reliability engineering is a holistic practice that incorporates various types of work.

Overcoming Database DevOps Challenges: Part 1

As part of our research for the 2021 State of Database DevOps report, we asked 3,000+ recipients what they consider to be the greatest challenge when integrating database changes into a DevOps process. According to the respondents, these are the most important challenges facing database professionals when introducing DevOps practices to database development.

Icinga Module for JIRA v1.1.0

If your team is using Atlassian’s Jira and Icinga and you didn’t know about our integration yet: Our module for Jira is now at version 1.1.0 with a bunch of bugfixes and new features that were requested on the GitHub repository. Our friends from the internezzo ag helped out by sponsoring the development as well – a big THANK YOU to them!

Redfin Implements Circonus to Scale its Monitoring, Reduce Costs, and Improve Accuracy of StatsD Analysis

Over 90% of Redfin’s metric data will be represented in Circonus’ log linear OpenHistograms, which will reduce their metric footprint by 50-60%. We’re pleased to announce today that Redfin, the technology-powered real estate brokerage, has selected Circonus to replace its existing metrics platform.

What Is Container Orchestration?

Since Docker revolutionized the concept in 2013, containers have become a mainstay in application development. Their speed and resource efficiency make them ideal for a DevOps environment as they allow developers to run software faster and more reliably, no matter where it is deployed. With containerization, it’s possible to move and scale several applications across clouds and data centers. However, this scalability can eventually become an operational challenge.

New Executive Order Forces Federal Agencies to Rethink Log Management

Last week President Biden issued a widely publicized executive order to improve cybersecurity and protect federal government networks. The order comes in the wake of several prominent attacks against public-sector and private-sector infrastructure in recent months, including last week’s Colonial Pipeline ransomware attack that disrupted fuel supplies and triggered gasoline shortages in the Southeast.

Is SquaredUp Dashboard Server an easy alternative to Grafana?

Grafana is free and powerful - a mainstay in DevOps and IT dashboarding. It’s an open-source visualization platform that lets you visualize data in real-time from almost any database. SquaredUp Dashboard Server is, at first glance, quite similar! You can dashboard just about any data to get real-time visualizations. All for free. So… The answer is… Yes, if you want enterprise scale dashboards hosted on Windows Server that are super-fast to set up (and usable by anyone).

Kubernetes automation with Relay

Kubernetes — a popular open source container orchestration system — enables you to easily deploy, monitor, and scale cloud-native application workloads in both private and public cloud environments. In other words, Kubernetes does the hard work of managing containerized applications, giving you more time to spend building them.

Adventures in Observability with ClickHouse

How in-house ClickHouse deployment enabled Instana to build better monitoring for users. Instana is an Application Performance Monitoring solution that provides complete visibility into complex distributed systems, including ClickHouse servers. In this webinar we welcome Marcel Birkner and Yoann Buch of Instana, who describe how the Instana team uses ClickHouse to power their solution, lessons they have learned, and how they integrated them back into Instana’s own ClickHouse monitoring features.

Version EVERYTHING: the Journey of Altissia into GitOps

In this fireside chat, Gregory Schiano, CTO of Altissia and Michele Mancioppi, Technical Product Manager at Instana, discuss how Altissia applies GitOps to the entire development process. Learn how Altissia leverages Git to manage their CI/CD, infrastructure deployment, application deployment, release management and APM configuration management. And why they automate as much as possible along the way.

Building a Culture of Observability in Your DevOps Team

As production environments become more distributed and ephemeral, it becomes increasingly difficult for DevOps and SecOps teams to understand their systems’ availability and performance. Despite the proliferation of monitoring tools on the market today, obtaining real-time visibility has never been more challenging.

Stop Worrying on Serverless and Learn to Love Pipeline Feedback

AWS Lambda is one of the defining technologies of the cloud-native shift in software development of the last few years. AWS Lambda empowers developers to quickly push code and immediately change the behavior of their applications. With that change, problems often come that need to be detected and solved just as quickly. Instana is the premier observability tool for cloud-native applications and has best-in-class AWS Lambda support. With Instana, detecting issues creeping into your Lambdas with new deployments happens in real-time with virtually no setup.

From Distributed Tracing to Logs and Back: How Decisiv Troubleshoots Issues

Observability in today’s DevOps world builds on top of three pillars: logs, metrics and distributed tracing. Decisiv uses Instana and Humio to detect and troubleshoot issues in their production system in a matter of minutes. In this fireside chat, Hunter Madison, Senior SRE at Decisiv, Michele Mancioppi, Product Manager at Instana and James Mountifield, Solution Architect Lead at Humio, will have a candid, 360-degree chat on observability.

How to deploy and manage Elastic on Microsoft Azure

We recently announced that users can find, deploy, and manage Elasticsearch from within the Azure portal. This new integration provides a simplified onboarding experience, all with the Azure portal and tooling you already know, so you can easily deploy Elastic without having to sign up for an external service or configure billing information.

How one mobile company is using Grafana Enterprise for billing system observability and beyond

Calling or texting with a mobile phone may seem like a simple process, but behind the scenes, network providers are engaged in a constant exchange of transactions to pay each other for connecting their customers. If telecom companies don’t stay on top of the data and billing, they could be surprised with their own big bills at the end of each month. Cosmote, the largest mobile network in Greece, handles the challenge by using Grafana Enterprise.

End-User Monitoring: Best Practices and Tools

Poor application performance, besides being a sign of potential problems, is a strong predictor of unhappy users—and unhappy users are likely to become former customers. So software organizations are always searching for ways to improve the performance of their applications. One of the most effective ways to do so is gaining visibility into your app’s behavior—which is something that can be achieved through monitoring.

Observability won't replace monitoring

Recorded for the online CTO Summit on Tuesday, May 18th, 2021, on improving observability. Lightstep’s observability platform is the easiest way for developers and SREs to monitor health and respond to changes in cloud-native applications. Powered by cutting-edge distributed tracing and a groundbreaking metrics database, and built by the team that launched observability at Google, Lightstep’s Change Intelligence provides actionable insights to help teams answer the question “What caused that change?”

What are Prometheus Functions?

Prometheus is a platform for real-time systems and event monitoring and alerting. The Prometheus project is free, open-source, and available on GitHub. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular frameworks such as Kubernetes. The core of the project is the Prometheus server, which acts as the system’s “brain” by collecting various metrics and storing them in a time-series database.

What is Prometheus Pushgateway?

Prometheus is free and open-source software for real-time systems and event monitoring and alerting. Originally developed at SoundCloud, Prometheus became a project of the Cloud Native Computing Foundation in 2016, alongside other popular frameworks such as Kubernetes. To start using Prometheus, you’ll need a solid understanding of all of the tool’s functionality.

Using Audit Logs For Security and Compliance

Developers, network specialists, system administrators, and even IT helpdesk staff use audit logs in their jobs. Audit logs are an integral part of maintaining security and compliance. They can even be used as a diagnostic tool for error resolution. With cybersecurity threats looming more than ever before, audit logs have gained even more importance in monitoring. Before we get to how you can use audit logs for security and compliance, let’s take a moment to really understand what they are and what they can do.

Elastic 7.13.0 released: Search and store more data on Elastic

We are pleased to announce the general availability (GA) of Elastic 7.13. This release brings a broad set of new capabilities to our Elastic Enterprise Search, Observability, and Security solutions, which are built into the Elastic Stack — Elasticsearch and Kibana. This release enables customers to search petabytes of data in minutes cost-effectively by leveraging searchable snapshots and the new frozen tier.

Top 15 Kubernetes Resources

While Kubernetes is a very powerful and comprehensive application, it can also be very complicated and confusing to new users. Thankfully, the community is great at pulling together to try to tame the Kubernetes beasts, and as more users join the platform, more handy tools to help you manage your cluster are developed. Kubernetes Resources range from everyday helper tools to development tools to troubleshooting tools, and in this article we’ll discuss fifteen of the best ones.

What Is APM?

Suppose your website’s sales volume per hour suddenly drops. Something’s wrong. You also notice a fluctuation in the time it takes for a customer to add the last item to their cart and finish checkout. In this time, they enter payment details, log in to a payment portal, and finalize the purchase. This takes, on average, four minutes. However, this number has suddenly spiked three-fold to 12 minutes. Something’s definitely wrong.

Blameless Runbook Documentation is Now Generally Available!

At Blameless, our mission is to provide teams with the tools they need to operationalize SRE and embrace a culture of resilience. We help teams automate toil and adopt best practices across integrated incident management, comprehensive retrospectives, service level objectives, reliability insights, and more. We are very excited to announce that Blameless Runbook Documentation is now generally available for all customers.

Is SquaredUp Dashboard Server the effortless alternative to Grafana?

Grafana is free and powerful - a mainstay in DevOps and IT dashboarding. It’s an open-source visualization platform that lets you visualize data in real-time from a variety of data sources. SquaredUp Dashboard Server is, at first glance, quite similar! You can dashboard just about any data to get real-time visualizations. All for free.

Do you already know what Active Directory is and how to use it with Pandora FMS?

As you may already know, in this blog, we’re so into answering the big questions. After answering in previous episodes what the meaning of our existence is or explaining everything you need to know about Office 365 Monitoring, in today’s episode we are going to discuss what Active Directory is. I hope you are very comfortable sitting in your respective gamer chairs or in your two-seater sofas, because here we go!

3 Steps to Optimize Collaboration Solutions and Drive Adoption Today

It’s hard to imagine our work lives without collaboration tools. Whether you attend Zoom meetings or brainstorm projects (and send the occasional humorous GIF) on Slack, these solutions have become foundational elements of the workplace – even more so in recent times, when most workplaces became more digital than ever before.

The What and The Why of TLS Inspection

Connecting to nearly any web page today, you’re more likely to see a URL that begins with “https://” instead of “http://”. Ever wondered what the “S” is for? It stands for “secure”, but more importantly, it identifies that the connection is taking place over a secure channel using the Transport Layer Security (TLS) protocol. But what is TLS, and beyond that, what’s TLS inspection?

New Monitoring Capabilities for IBM Middleware Added to IBM Observability by Instana

Instana provides businesses with advanced application performance monitoring and observability capabilities, manages the performance of complex applications and software no matter where they reside and accelerates the efficiency of IT operations teams, development teams and DevOps teams. IBM Observability by Instana can be integrated with the IBM Cloud Paks (e.g., IBM Cloud Pak® for Integration and IBM Cloud Pak® for Watson AIOps).

Aternity Digital Experience Index (DXI) for Continuous Service Improvement

Only Aternity DXI enables you to tailor your digital experience goals based on industry benchmarks, instantly associate performance gaps to lost productivity or revenue, and drill into the worst performing areas for root cause analysis and rapid remediation.

Setting Up Your Service Desk - The People, Process, and Technology

The service desk acts as the primary support mechanism in organizations, managing customer contacts for assistance and access to services. Their primary purpose includes the following: The service desk can manage their daily challenges in many ways. Still, the design and architecture of delivering services and a set of robust communication channels, powered by sufficient automation, collectively help the service desk excel and provide an excellent customer/consumer experience.

Monitoring Cloud Environments at Scale with Prometheus and Thanos

At Mattermost, our monitoring solution is continuously evolving to meet our scaling infrastructure needs. Our previous architecture used Prometheus federation and was perfect for our small/medium infrastructure size, but was not able to scale in the way we needed. This post will explain how we used Thanos and the Prometheus operator to scale our monitoring infrastructure and meet our long-term storage needs.

Building a complete network security checklist

Understanding what to audit in a network can be chaotic and confusing. Building a complete network security checklist is crucial for organizations with computers connected to the internet or to each other. Think of it like an antivirus scan you might run on your computer to find Trojans or malware, except you’re scanning your entire network to find anything that may cripple it.

Introducing Datadog's Lambda extension

AWS Lambda extensions enable you to seamlessly integrate third-party tooling with your Lambda environment so you can run custom code or monitoring agents alongside your functions. We’ve partnered with AWS to create a Lambda extension that offers a more cost-effective, simplified process for collecting data from your functions.

How to Improve IT Service Desk Performance

IT Service Desks are developed as an efficient mechanism of delivering quick help whenever your employees need it. But as businesses grow, so does the request volume and the pressure on the technicians to attend every call and resolve a ticket. And with the onset of the work-from-home necessity induced by the COVID-19 pandemic, IT teams are struggling to efficiently support a remote workforce with minimum resources.

How to Manage Network Configurations + Best Software to Automate Configuration Management

Businesses both small and large require agility in how they handle device firmware and network configurations. For large networks especially, manual monitoring and change implementation can be inefficient. As such, more and more IT departments turn to automated network configuration management tools with bulk change capabilities.

The Importance of Log Management - Guide & Best Practices

Log management encompasses the processes of managing this trove of computer-generated event log data, including: There are two ways that IT teams typically approach event log management. Using a log management tool, you can filter and discard events you don’t need, only gathering relevant information – eliminating noise and redundancy at the point of ingestion.

How to Manage Your Monitoring with Subaccounts

The need to implement 360° monitoring of a multi-service infrastructure is almost a universal truth among growing companies. With an expanding pool of clients and services to monitor, segmentation is the key to smooth operation. Monitoring with subaccounts is a prime management solution. The trick is to simplify your account structure without limiting your visibility. To evaluate, we’re going to dive to a cellular level. Size matters.

Visualize HAProxy Metrics with InfluxDB

HAProxy generates over a hundred metrics to give you a nearly real-time view of the state of your load balancers and the services they proxy, but to get the most from this data, you need a way to visualize it. InfluxData’s InfluxDB suite of applications takes the many discrete data points that make up HAProxy metrics and turns them into time-series data, which is then collected and graphed, giving you insight into the workings of your systems and services.

June 2021 Civo Roadmap Update

In October 2020 we released the community-driven roadmap for 2021. It's time to revisit and see all the things we have completed from the list! I am very proud to say that at Civo we have taken the community suggestions and implemented most of them during the launch on May 4th 2021. Let's dive into each of the features listed in the original blog post and see where we are with the 2021 Civo Roadmap.

Understanding The Move To Intelligent Networking

More CIOs are seeing the value of network automation, which can improve network efficiency and cost, as well as help them manage increasingly complex IT environments. But like the adoption of any new technology, network automation presents a number of challenges and considerations for CIOs. In our recent webinar, PCCW Global’s CTO Paul Gampe and VP of Development and Operations Jay Turner shared some tips and insights into how to begin the move to intelligent networking…

Kristina Robinson | Understand and Visualize Your Data with InfluxDB Cloud | InfluxDays EMEA 2021

Learn how you as a developer can use our InfluxDB Cloud web interface to ingest, explore, analyze, and understand your data. We highlight new capabilities and show you some tips and tricks to get the most out of the InfluxDB Cloud Platform.

How to Consolidate OSS Data into a Cloud Account

In this post, we will describe a simple way to share data from multiple InfluxDB 2.0 OSS instances with a central cloud account. This is something that community members have asked for when they have OSS running at different locations, but then they want to be able to visualize some of the data or even alert on the data in a central place. Please note that while the method presented here is simple and fast to set up, it has many limitations which may make it inappropriate for your product use case.

ITSM Buyers' Guide: 7 Use Cases to Define Your ITSM Goals

Attempting an upgrade or switch to a new ITSM tool is obstacle-ridden for IT directors. From having to address fears surrounding the cost of switching vendors to assessing service management maturity, building a case around why and how an ITSM can advance the business can be a harrowing feat. Thankfully, Info-Tech pulled together this selection guide.

New plugins connect almost all of Redis for monitoring and visualization in Grafana

Mikhail Volkov is building observability and monitoring solutions at Volkov Labs and leading Redis plugins for Grafana. Since the Redis project first got underway in 2009, the open source in-memory data store has been embraced by thousands of companies of all types and sizes. Reportedly, well over 5,000 companies use Redis, including Uber, Airbnb, Twitter, Instagram, and Slack.

Why Midsized SecOps Teams Should Consider Security Log Analytics Instead of Security and Information Event Management

If Ben Franklin lived today, he would add cyber threats to his shortlist of life’s certainties. For decades, bad guys have inflicted malware, theft, espionage, and other forms of digital pain on citizens of the modern world. They seek money, celebrity, and political secrets, and often get them. In 2020, hackers halted trading on the New Zealand stock exchange with a distributed denial of service (DDoS) attack.

Detecting and Mitigating CVE-2021-25737: EndpointSlice validation enables host network hijack

The CVE-2021-25737 low-severity vulnerability has been found in the Kubernetes kube-apiserver, where an authorized user could redirect pod traffic to private networks on a Node. The kube-apiserver affected are: By exploiting the vulnerability, adversaries could redirect pod traffic even though Kubernetes already prevents creation of Endpoint IPs in the localhost or link-local range.

No-code AWS Lambda Monitoring

Auto-instrumenting AWS Lambda Monitoring didn’t originate through a focus group or business plan. It started as a hackathon project that addressed the tedium of removing manual code instrumentation. Developer environments often include hundreds of AWS Lambda functions. And our existing instrumentation required initialization code to be manually placed on every single function.

Advanced Link Analysis, Part 3 - Visualizing Trillion Events, One Insight at a Time

This is Part 3 of the Advanced Link Analysis series, which showcases the interactive visualization of advanced link analysis with Splunk partner, SigBay. The biggest challenge for any data analytics solution is how it can handle huge amounts of data for demanding business users. This also puts pressure on data visualization tools. This is because a data visualization tool is expected to represent reasonably large amounts of data in an intelligent, understandable and interactive manner.

Benefits and challenges of using monorepo development practices

In a single, monolithic repository, also known as a monorepo, you keep all your application and microservice code in the same source code repository (usually Git). Typically, teams split the code of various app components into subfolders and use Git workflow for new features or bug fixes. This approach is natural for most applications or systems developed using a monolithic architecture. Code in such a monorepo typically has a single build pipeline that produces the application executable.

7 DevOps Best Practices You Should Be Following Now

Software engineering teams are always looking for ways to improve the software development process and boost efficiency. One common strategy is to use DevOps, an engineering practice that merges development and operations. In traditional engineering organizations, development and operations teams are often siloed, a scenario that can lead to friction between these two important arms.

What's New in AWS Lambda Extensions

Last year, AWS announced Lambda Extensions to improve the customer experience of integrating tools with their Lambda functions. This new concept allowed customers to plug their favorite tools for monitoring, logging, or security, without worrying about maintenance and overhead. Thundra, being a player in the serverless ecosystem, was one of the AWS Lambda Ready Partners and early adopters of Lambda extensions.

Easily Debug Your AWS Lambda Functions With Honeycomb

With the Honeycomb extension for AWS Lambda, you no longer need to make your Lambda functions Honeycomb-aware. Today, AWS announced the general availability of AWS Lambda Extensions, which make it easy for us to send logs from your Lambda functions directly to Honeycomb. In October, we announced Honeycomb’s extension for AWS Lambda as part of a preview launch. Today, we’re pleased to announce everyone can now use this extension to easily debug their AWS Lambda functions with Honeycomb.

New Customer Service Ops Guide: Introducing Full-Case Ownership

In the world of digital transformation, keeping the focus on the customer experience is paramount. Systems are complex and increasingly distributed, which makes it difficult to stay on top of things when something goes wrong. Customer service teams are the gateway to the customer, and more often than not they are the first line of defense when something goes wrong. The role of customer service teams is critical to maintaining and exceeding customer expectations.

IT Operations Monitoring Looks Ahead

To drive a competitive edge today, organizations are quickly prioritizing their digital transformation and digital experience. For IT operations, this means continuous technical innovations with specific, monumental impact on the way things work. IT complexity in organizations will continue to increase, requiring a transformative approach to log management.

How to Monitor CPU Memory and Disk Usage in Java

In this post, we will discuss some of the primary commands, tools, and techniques that can help you monitor CPU, memory, and disk usage in Java. The Java tools observe Java bytecode constructs and processes. Java profilers follow all system commands and processor usage. This lets you inspect the call structure at whatever point you prefer.

Securing the new AWS App Runner service

In its mission to simplify building and running cloud-native applications for users, Amazon has announced the GA of AWS App Runner, a new purpose-built container application service. With security top of mind for most organizations shifting to the cloud, Sysdig has collaborated with AWS to enable threat detection for the new platform.

High Performance Images: 2021 Guide

Images engage users, drive clicks, and generally make everything better–except performance. Images are giant blobs of bytes that are usually the slowest part of your website. This 2021 guide has everything you need to know for fast images on the web. Images are big. Really big. The bytes required for an image dwarf most sites’ CSS and JavaScript assets. Slow images will damage your Core Web Vitals, impacting your SEO and costing you traffic.

Is Distributed Tracing Really a Big Deal?

Microservice architectures are everywhere these days. Even internal enterprise applications—which have typically been structured as self-contained monoliths—are now being designed using a microservices architecture. There are definite advantages to a microservices architecture. Breaking an application into discrete, independent chunks—basically mini apps—gives you enormous flexibility. But this flexibility dramatically increases complexity, especially when things go wrong.

What Is the Database Server Doing?

One of the most common questions database professionals are asked by their systems and virtual machine (VM) administrators is “Why does the database server need so much memory?” You’ll get a more detailed answer to that question later in this post, but it’s important to understand a database engine is almost like a server within a server.

An IT Service Desk Can Help Government Agencies Run More Efficiently

IT is ultimately about providing services to end users, and the IT help desk plays a critical role in this effort. As a result of the COVID-19 pandemic, there was a major uptick in the number of inquiries fielded through the IT help desk in 2020—and this trend will extend into 2021 and beyond.

5 Steps to Starting DevOps with a JFrog Free Subscription

The JFrog Free subscription is a SaaS cloud offering of the JFrog DevOps Platform that provides software developers, DevOps Engineers, System Administrators and students a sandbox environment to explore solutions to common DevOps challenges. Here are examples of common DevOps challenges, where having a free subscription to the JFrog Platform helps.

The Role of the DBA Is Changing

For good or for ill, technology is constantly shifting and with it, the roles of those who manage that technology also shift. This is no different for a DBA than it is for a developer, an admin, or analyst. As new technology, like the adoption of the cloud, changes the role, people start to question whether or not there’s even a need for a DBA. The shortest possible answer to that question, in my opinion, is “Yes”.

Why Logging Matters Throughout the Software Development Life Cycle (SDLC)

There are multiple phases in the software development process that need to be completed before the software can be released into production. Those phases, which are typically iterative, are part of what we call the software development life cycle, or SDLC. During this cycle, developers and software analysts also aim to satisfy nonfunctional requirements like reliability, maintainability, and performance.

What Does Digital Ops Mean? A Discussion With The Experts.

OpsRamp recently conducted a survey on the State of Digital Operations Management in 2021 to understand IT investments in 2021, factors hindering organizational innovation and the steps IT leaders are taking to unleash creativity and growth across the organization. We discussed the survey on a webinar featuring OpsRamp Chief Revenue Officer Sheen Khoury and Isaac Sacolick, president of digital transformation consultancy StarCIO. Here are the key highlights of the conversation.

Accelerating Monitored AWS Lambda Functions with Instana Lambda Extensions

We are pleased to announce the general availability of the Instana Lambda Extension. Our extension offers modification-free, low latency tracing of Lambda serverless functions backed by our real time Enterprise Observability platform. This work improves upon our existing AWS Lambda tracing to greatly decrease the latency for short lived functions.

Introducing the First Integration of Instana's Enterprise Observability Platform with IBM Watson AIOps

It’s just been a few months since IBM acquired Instana, but we’ve been working hard to make the world of AIOps and Observability a better place. Recently, the two sides of the equation became a little bit closer with the release of some new integration points. Instana and IBM have integrated Instana’s industry leading Enterprise Observability Platform for cloud-native microservice applications with IBM Watson AIOps.

Discover Everbridge Digital Wayfinding for Higher Education

Creating a positive visitor experience is a key component of the administrative health of a school. Despite advances in technology, campus visits have remained mostly formulaic. Digital Wayfinding takes mobile mapping technology the public is used to and applies it to your school, creating an easy-to-use, attractive, interactive tool for your visitors.

Introducing the New Rollbar Integration for GitHub Enterprise Server

We’re excited to launch our new integration with GitHub that supports GitHub Enterprise Server customers. This allows companies using GitHub Enterprise on their own domains to access key features in Rollbar that help developers fix errors faster. GitHub Enterprise offers a fully integrated development platform for organizations to accelerate software innovation and secure delivery. With Rollbar, GitHub Enterprise Server customers can now access.

FlashDrive and Chia cryptocurrency

Chia cryptocurrency is based on Proof of Space and distributes tokens according to a mechanism called plotting. In recent weeks, we've seen a lot of new accounts trying to launch and operate Chia miners from FlashDrive's infrastructure. Most of those accounts were created with fake or stolen credit cards for the sole purpose of getting Chia coins for free.

Use the improved infrastructure list to track your hosts' health

Datadog’s infrastructure list provides a central, high-level view of every host in your environment and pulls together metadata and relevant metrics from across Datadog to help you get the full picture of each one. You can easily filter and sort the list using any host tags, letting you quickly view the status of the parts of your infrastructure you need.

15 Ways to Use the HTTP(S) Check Effectively

Checks can test anytime, from anywhere, to catch the downtime incidents you need caught. With worldwide probes, or through private locations that monitor your internal network, we reliably detect outages and monitor performance across your websites, applications, servers and infrastructure. Read on to explore 15 use cases for the HTTP(S) check type. HTTP(S) checks validate if a server is up or down, while reducing the possibility of false positives.

How Website Monitoring Can Help Improve the End-User Experience

Making your customers happy is essential in any industry, but it’s imperative for online businesses because the competition is only a few clicks away. If you want your customers to be satisfied, providing them a great user experience is essential. This is easier said than done, however. There are many approaches for organizations wanting to improve their user experience, and picking the right one can become overwhelming. We’re here to help.

New Integration: Declare FireHydrant Incidents from Checkly Alerts

Streamlining your incident management process is what we do best, and one of the ways we do that is by acting as the connective tissue across all of your applications. We’ve partnered with Checkly to bring you a new integration that empowers you to detect problems and resolve incidents faster.

Data Warehouse Vs. Data Lake (Vs. Data Mart): A Full Breakdown

Big data analytics help organizations use data to explore both new and improvement opportunities. Whichever cloud data platform you choose, there are two data storage technologies you will want to understand. Data warehouses and data lakes are the two dominant data solutions commonly used for defining how an organization stores, queries, analyzes, and reports on big data. This post will define what a data warehouse and data lake are, how they work, and their differences.

ITIL 4 High-velocity IT: How High Velocity Organizations Enable Resilience & Anti-fragility

During this period of global crisis and disruption, organizations depend on their IT departments and suppliers to deploy digital collaboration and management platforms at increasing speed while at the same time requiring stable operations, increased availability, and security.

How to install OverOps' Java agent & collector on a Mac; or run collector in a Docker container

Daniel Bechtel, Director Global Support at OverOps, demonstrates how to install the OverOps Java agent and collector on a Mac and how to run the collector in a Docker container. The Java agent works on 64-bit hardware only and supports Java 8 through Java 11. This article walks you through installing OverOps for a Java application on a laptop or local machine running macOS.

InfluxDB OSS and Enterprise Roadmap Update from InfluxDays EMEA

Since the initial release of InfluxDB OSS 2.0 in November 2020, more than 10% of the community has successfully upgraded, and the pace of the upgrades continues at a steady rate. We have released a number of maintenance releases to address defects, expand platform coverage, and enhance the update experience based on feedback.

Designing a Parquet Catalog for InfluxDB IOx

One of the things we needed to either adopt or build for InfluxDB IOx is a database catalog. If you haven’t heard us talk about it yet, InfluxDB IOx (pronounced eye-ox) is the new in-memory columnar database that uses object storage for persistence. We’re building it as the future core of InfluxDB. A database catalog usually contains the definitions of a database’s structure like schema and indexes.

A bittersweet anniversary for eG Innovations

For me, it’s been 20 years since eG entered the US marketplace. At that time, I was one of their first customers, and I have remained close to them ever since. For a company to survive two decades in an unbelievably complex and competitive performance monitoring landscape is no small feat, and I believe we are still a ‘gem’ in an often confused and fragmented marketplace.

The 5 most shocking websites to go down in May

As May draws to a close, it’s that time again to share with you the most surprising websites that went down this month. Now, I don’t like to “out” anyone but I do like to use these to emphasise that website downtime can affect any website, big or small. And ultimately, the impact is the same – potential customers gone elsewhere, higher bounce rate, lower SEO, and worst of all, lost revenue.

What Does Modern Infrastructure Include and How Do You Monitor It?

Understanding modern software applications isn’t just a question of what; it’s also a question of why. Why do we choose to use a particular technology? How does that technology serve the overall business needs? And when you have a problem, how do you figure out what’s wrong? If you’re in the position of trying to understand a modern software application for the first time, these questions can seem unanswerable.

Key Multi-tenancy Challenges in the Public Cloud and How to Solve for Them

Nobody wants to deal with annoying neighbors. Whether it’s the neighbor who always knows everyone’s business or the one who turns up their music late at night, both types of neighbors can have a negative impact on your living environment and daily life. Obnoxious neighbors aren't exclusive to your physical living space; they also show up in the public cloud, where there are multiple Kubernetes clusters (EKS, AKS, or GKE) and multiple users (or tenants) who need cluster access.

Tips for Application Troubleshooting

Application troubleshooting is easier when you know that protocols are in place. For instance, knowing the core features of the application and how it functions is a baseline. You’ll also need to broaden your coverage to include requirements such as Quality of Service (QoS). Does the application need real-time performance or does it need to move a lot of data? Are there sub-applications running on the endpoints?

Continuous deployment for Android libraries to Maven Central with Gradle

This article will take you through setting up CI/CD integration for building, testing, and publishing libraries to Maven Central using Gradle. With jCenter shutting down, Maven Central is once again the primary destination for all Android and Java libraries. Library publishers will need to port their libraries over to Maven Central to keep their libraries available after jCenter shuts down. This article focuses on CI/CD integration.

Avoid These 4 Common Mistakes When Setting and Measuring Latency SLOs

Setting and measuring latency Service Level Objectives (SLOs) is a critical responsibility for engineers monitoring the performance and health of their applications and systems. SLOs are an agreement on an acceptable level of availability and performance and are key to helping engineers properly balance risk and innovation.

7 Best IT Monitoring Tools and Software of 2021

Monitoring tools, also known as observability solutions, are designed to track the status of critical IT applications, networks, infrastructures, websites and more. The best IT monitoring tools quickly detect problems in resources and alert the right respondents to resolve the critical issues. Response teams use observability solutions to gain real-time insights into resource availability, stability and performance.

Single Sign-On Now Available on OnPage Enterprise-Level Accounts

Single sign-on (SSO) services provide a unified view into applications, logins and devices through a secure identity cloud. SSO allows users to access SaaS-based applications through one simple login process. We, at OnPage, are excited to announce that we’ve extended our integration catalog to include SSO services like Okta and OneLogin. Through a single sign-on process, OnPage enterprise-level users can access the OnPage dashboard from their Okta and OneLogin accounts.

The 30th Anniversary of RSA Would Have Been One Heck of a Party

There is no doubt that a virtual RSA is not the same as catching up with colleagues and partners over great food, and of course meeting up at the W Bar. The good news is that we all have adjusted, or are adjusting, to working remotely, and we didn’t have to travel to hear what the industry luminaries think, or what our peers are saying they can do to keep the world safe.

Optimize Monitoring Strategy for End User Productivity

Optimizing the digital experience monitoring (DEM) strategy should be a priority for businesses. According to Forrester, the five principles for optimizing end-user experience management are that it be holistic, workflow-centric, feedback-driven, automated, and quantified. Exoprise offers businesses a 360-degree DEM solution for cloud, network, and workspace digital transformation. The better-together monitoring strategy combines real user and synthetic monitoring to deliver actionable insights to IT for SaaS applications, whether consumed from home or the office. With complete coverage for all Office 365 cloud productivity applications and crowd-sourced benchmarks, Exoprise leads the way to ensure high productivity for the remote workforce.

Benchmark Network Capacity and SaaS Application Performance

Exoprise CloudReady effectively benchmarks SaaS application and network capacity performance through the power of crowd intelligence. This unique approach covers a variety of useful metrics for IT administrators such as Network RTT, Audio Jitter, SharePoint Health, Server Latency, Login Times, etc. Combining application monitoring and end-to-end network diagnostics with the power of crowd-sourced data analytics provides complete visibility into business-critical cloud services as well as insights into the health of the Internet. Reduce MTTR and accelerate troubleshooting during outages by instantly finding bottlenecks in the service delivery chain.

Best Practices to Improve End User Experience Management

Businesses need best practices and implementation strategies to improve the end-user experience for their employees. By combining synthetics and real user monitoring, IT can deliver a seamless Microsoft 365 and SaaS application experience. As work anywhere becomes a dominant reality, employee productivity and technology empowerment will be critical goals to measure. Ultimately, business leaders will need to determine key processes and workflows that need effective monitoring. Dedicating specific staff resources to digital workplace experience will impact end-user productivity.

Extend your Mitel Offering to Improve Service Delivery and User Experience

Winning and retaining customers in a subscription-based business model requires solid and reliable service delivery. For Mitel partners, it has never been more critical for your UC customers to be able to rely on state-of-the-art service quality to maintain their business continuity. Mitel Performance Analytics (MPA) already enables hundreds of partners to ensure exceptional service delivery quality for their customers.  

Not All Metrics Are Good Metrics

The old saying goes like this: “If you don’t measure it, you can’t fix or improve it.” This reflects the obvious notion you can’t measure what you don’t monitor. But this isn’t where the story ends—it’s vitally important to choose, carefully and deliberately, what one measures and monitors. It’s also important to understand metrics in the overall context of an organization’s environment and goals.

US Executive Order on Cybersecurity: What it Means for DevOps

The United States Government equates cybersecurity with national security. That’s the crux of the recent Executive Order that will mandate that not only must software applications be vetted, but there will be upcoming regulations on providing all of the components that make up the software. As section 1 notes: “prevention, detection, assessment, and remediation of cyber incidents is a top priority and essential to national and economic security.”