Operations | Monitoring | ITSM | DevOps | Cloud

March 2022

Revamp your backup strategy and redefine data security this World Backup Day!

Did you know that March 31 is World Backup Day? Who knew data backup was important enough to get its own day? I didn’t until recently, which may be the case for you as well, but that’s okay. It is never too late to learn about the importance of security practices like backing up data and establishing a routine that includes said practices.

A guide to Linux for embedded applications

An embedded device is a hardware and software system that performs a dedicated function within a larger computer system. It is typically resource-constrained and comprises a processing engine. The software of an embedded Linux system runs on top of the Linux kernel, the fundamental core of the OS with complete control over everything occurring in the system. It follows that an embedded Linux system is simply an embedded system running on the Linux kernel.

Building Your Security Analytics Use Cases

It’s time again for another meeting with senior leadership. You know that they will ask you the hard questions, like “how do you know that your detection and response times are ‘good enough’?” You think you’re doing a good job securing the organization. You haven’t had a security incident yet. At the same time, you also know that you have no way to prove your approach to security is working. You’re reading your threat intelligence feeds.

Announcing Prepress v2

Today I am very excited to announce a major new version of Cloud 66 Prepress, our No Ops tool for deploying static sites. Last year we released Prepress to help our customers deploy static sites to AWS S3. With support for Jekyll, Hugo, and Gatsby, Prepress is everything you need to deploy your static site to your account on all major cloud providers. Today, we are taking Prepress to a whole new level.

OpenStack Yoga on Ubuntu LTS delivers highly performant infrastructure for telcos and researchers with SmartNICs and DPUs

March 31, London – Canonical today announced the general availability of OpenStack Yoga on Ubuntu 22.04 Long Term Support (LTS) Beta and Ubuntu 20.04 LTS. This new version of OpenStack sets a foundation for next-generation, highly performant infrastructure as needed by telco NFV, media streaming, traffic analysis and HPC services, using SmartNIC cards and integrating them with the Neutron Open Virtual Network (OVN) driver.

Why 87% of AI/ML Projects Never Make It Into Production-And How to Fix It

Going from prototype to production is perilous when it comes to artificial intelligence (AI) and machine learning (ML). Many organizations struggle to move from a prototype on a single machine to a scalable, production-grade deployment. In fact, research has found that the vast majority (87%) of AI projects never make it into production. And for the few models that are ever deployed, it takes 90 days or more to get there.

Networking, security & observability with Cilium

Raymond de Jong, Senior Solutions Architect at Isovalent, will be our guest as we explore Cilium - a BPF powered open-source Cloud Native Networking solution, providing security, observability, scalability, and superior performance. Civo's own Kunal Kushwaha will also look at using Cilium for network policy security on Civo Kubernetes.

Your First Dagger Kubernetes Deployment with Shipa

The DevOps and Platform Engineering space certainly is one that evolves fast. As new development paradigms get consumed, supporting the development pipeline is crucial. With a public release of v0.2.x on March 30th, 2022, Dagger, from the creators of Docker, offers another approach to portability and consistency in CI/CD pipelines. What the Docker container has done for applications, Dagger hopes to achieve for CI/CD pipelines.

[UPDATED] We're changing the way development environment URLs are generated

UPDATE 6 April 2022: There are times when we have to revisit our plan, and today is one of those times. Last week, we released a fix to solve some inconsistencies in how our development URLs are generated. Even though this change was not impacting any production environments, many customers reported that it was breaking their test integrations with third-party services, or that they were reaching a provisioning limit with our Let’s Encrypt certificates.

How to automate verification of deployments with Argo Rollouts and Elastic Observability

Shipping complex applications at high velocity leads to increased failures. Longer pipelines, scattered microservices, and more code inherently lead to greater complexity, where small mistakes may cost you big time.

New StackPod Episode: Implementing an SRE Practice with Yousef Sedky of Axiom/Hyke

For our latest StackPod episode, we invited Hyke’s DevOps team lead and AWS Cloud architect: Yousef Sedky. Axiom Telecom is one of the largest telephone retailers in the United Arab Emirates and Saudi Arabia and Hyke, its sister company, is a distribution platform for mobile products.

Getting started with DNS attacks

Whenever an online service goes down, you're likely to hear three words: "it was DNS!" Blaming DNS might be a running joke among network admins and engineers, but it's one rooted in experience. DNS problems are known for causing massive, Internet-wide outages such as the 2021 Akamai outage that temporarily made the websites for Delta Air Lines, American Express, Airbnb, and others unreachable.

Difference between Continuous Integration, Continuous Deployment and Continuous Delivery

Continuous integration is a DevOps practice in which developers continuously integrate code changes into a central repository. It most often refers to the build or integration stage of the software release process. A continuous integration service automatically builds the new code changes and runs unit tests on them to find errors quickly.

Spark Performance Monitoring using Graphite and Grafana

In this article, we will explore what Apache Spark is, what key metrics you need to track to keep it running, and how to set up a metrics tracking process. We will also cover monitoring tools such as Graphite and Grafana, which make the process of monitoring metrics very easy, as well as how using MetricFire can make running your monitoring exponentially easier. Check out MetricFire for free or book a demo with our team and learn more about all the benefits of using MetricFire solutions.

The 9 Best AWS Management Tools You Can Use Right Now

Amazon Web Services (AWS) provides over 200 fully featured services that aim to make the cloud affordable and cost-efficient for the companies that use the popular cloud provider. Yet, the average AWS customer wastes 30% to 35% of their cloud budget on unnecessary costs. But why? Many organizations running on AWS report challenges managing their infrastructure, and some engineers feel the native tools simply do not cut it for managing their enterprise applications.

Overcoming Kubernetes Infrastructure Challenges at the Edge of the Network

In response to the explosive growth of Internet of Things (IoT) devices, organizations are embracing edge computing systems to better access and understand the enormous amount of data produced by these devices. As the name suggests, edge computing moves some storage and computing resources out of the central data center and closer to where the data is generated at the edge of the network, whether that’s a factory floor, retail store, or automated car.

What's New: Updates to On-Call Management, Incident Response, Event Intelligence, Process Automation, and More!

We’re excited to announce a new set of updates and enhancements to PagerDuty’s Digital Operations Platform. Recent updates from the product team span On-Call Management and Incident Response, Process Automation, and PagerDuty Community & Advocacy Events. New capabilities enable users and customers to resolve incidents faster, and more.

mooving to... Remote Work | Interview with Tech Expert Martha Sharpe

Remote work is becoming more and more commonplace, but the challenges of working remotely haven’t gotten any easier. Join us as software engineer, author, and adventurer Martha Sharpe discusses how she successfully navigated these challenges while working from the road in an RV.

How to set up Prometheus monitoring for your services

When you run applications in production, you need to monitor the infrastructure they run on - and collect important signals about application health like error rates and latency. In this episode of Engineering for Reliability with Google Cloud, Yuri will demonstrate how to instrument your service to expose application-specific telemetry with Prometheus and how to configure Google's managed service for Prometheus to collect those metrics.
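As a rough illustration of what "exposing application-specific telemetry" can look like, the sketch below keeps request counts and latency totals in memory and renders them in the Prometheus text exposition format. It is a minimal stand-in written with the standard library only, not the episode's code or an official client library; the `Metrics` class and the metric names `app_requests_total` and `app_request_seconds_total` are assumptions made for this example.

```python
from collections import defaultdict


class Metrics:
    """Tiny in-memory registry (hypothetical; a real service would use a client library)."""

    def __init__(self):
        self.requests = defaultdict(int)        # (path, status) -> request count
        self.latency_total = defaultdict(float)  # path -> total seconds spent

    def observe(self, path, status, seconds):
        """Record one handled request."""
        self.requests[(path, status)] += 1
        self.latency_total[path] += seconds

    def render(self):
        """Render all series in the Prometheus text exposition format."""
        lines = [
            "# HELP app_requests_total Requests handled, by path and status.",
            "# TYPE app_requests_total counter",
        ]
        for (path, status), count in sorted(self.requests.items()):
            lines.append(
                f'app_requests_total{{path="{path}",status="{status}"}} {count}'
            )
        lines += [
            "# HELP app_request_seconds_total Total time spent handling requests.",
            "# TYPE app_request_seconds_total counter",
        ]
        for path, total in sorted(self.latency_total.items()):
            lines.append(f'app_request_seconds_total{{path="{path}"}} {total:.3f}')
        return "\n".join(lines) + "\n"


metrics = Metrics()
metrics.observe("/checkout", 200, 0.120)
metrics.observe("/checkout", 500, 0.480)
print(metrics.render())
```

A service would typically return the rendered text from a `/metrics` HTTP endpoint for a collector to scrape; error rate and average latency can then be derived from these counters on the query side.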

Learn the Basics of Kubernetes Persistence Management Functionality

In this webinar, Oleg Chunikhin, CTO at Kublr, walks you through the basics of Kubernetes (K8s) persistence management functionality and how it can be used to simplify managing persistent applications across different environments, whether in the cloud or on-premises. Oleg will use a demo environment with clusters in different clouds to show K8s persistence in action.

AWS Migration Checklist For Startups

Suppose you are going to adopt AWS as your cloud provider. Whether you are migrating from another cloud provider or setting up your application’s infrastructure in the cloud for the first time, this article will be immensely beneficial for you. AWS is an industry leader in cloud innovation technologies and carries the largest market share among cloud providers.

Getting started with Juju and Charmed Operators: three awesome videos

Getting started with software can be confusing – depending on the complexity of the software, of course. Despite the extensive documentation available for Charmed Operator SDK and Juju, some people simply prefer to start with video material. So, let’s take the opportunity to look at the tutorials and presentations available on the Internet.

CloudZero Achieves SOC 1 Compliance: Here's Why We Did It

For most companies it’s difficult to organize cloud spend because it relies on manual effort, like tagging. At CloudZero, we’re dedicated to helping customers make sense of their cloud investment without manual and repetitive work. Our code-driven approach to cost allocation makes it easy for customers to organize spend even if they have poor tagging, shared resources, or containerized infrastructure. Quite simply, we organize cloud spend better than anyone else in the world.

Scheduling load tests and persisting output with k6

In this k6 series I have covered HTTP request testing with k6 and performance testing with k6. I designed these tutorials to introduce you to k6 and to show you how to use k6 for performance testing of microservices. As the third tutorial in the k6 series, this will cover how you can store your k6 test results locally and also how to schedule your load tests using CircleCI’s scheduled pipelines feature.

Infrastructure As Apps: The GitOps Future of Infra-as-code

Infrastructure-as-apps takes infrastructure-as-code to its logical endpoint by bringing in the principles of GitOps management. The term is something I coined in 2021 to describe an existing movement to bring infrastructure into the same lifecycle control as applications under GitOps. Examples of infra-as-apps tools include Argo CD, Crossplane, Cluster API, Cello, and even SchemaHero for databases, and the list is always growing. Infra-as-apps brings a number of benefits. Read on to understand why.

How Lightspeed optimized iOS test runs with parallelism and caching

At Lightspeed, we maintain multiple large iOS projects as well as their modularized dependencies. The last year of acquisitions brought together many different approaches to CI/CD at our company. I recently led the initiative to bring these projects and practices into alignment. In this post, I will explain the goals we had for our continuous integration pipeline and the implementations we used to achieve them.

What Does VMware Know about Developer Experience?

Ben Hale and Rita Manachi co-wrote this post. With the release of VMware Tanzu Application Platform, VMware is working to address a developer experience crisis that has been fueled by a rich and complex cloud native ecosystem and further complicated by the proliferation of hybrid and multi-cloud environments. There’s understandable skepticism about VMware knowing what makes for a good developer experience given its leadership in infrastructure.

VMware Tanzu Community Edition Taps in Cartographer for Building Secure Adaptable Cloud Native Supply Chains

The latest update to the VMware Tanzu Community Edition further streamlines the path to production with the addition of Cartographer, an open source project to build and manage modern secure software supply chains.

How to Migrate from BIND to AWS Route53 Safely in 3 Commands

Migrating your DNS to a cloud provider like Amazon’s Route53 service can be a daunting task. Thankfully, with dns-tools you can test your DNS records before and after the migration to ensure that everything made it across in one piece. There are three steps we follow when migrating to Route 53. Follow along below and in just 10 minutes you’ll know if everything will migrate smoothly for you.
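The before-and-after check described above boils down to comparing two sets of records. The sketch below illustrates that idea in plain Python; it is not how dns-tools works internally, and the `diff_records` and `normalize` helpers, as well as the `(name, type, value)` record shape, are assumptions made for this example. In practice the two lists would come from the old zone data and from queries against the new provider.

```python
def normalize(record):
    """Put a (name, type, value) record into a comparable canonical form."""
    name, rtype, value = record
    # DNS names are case-insensitive; a trailing dot only marks a fully qualified name.
    return (name.lower().rstrip("."), rtype.upper(), value.strip())


def diff_records(before, after):
    """Compare two collections of (name, type, value) DNS records.

    Returns (missing, unexpected): records lost in the migration and
    records that appear only after it.
    """
    before_set = {normalize(r) for r in before}
    after_set = {normalize(r) for r in after}
    return sorted(before_set - after_set), sorted(after_set - before_set)


# Hypothetical data using documentation-reserved names and addresses.
old_zone = [
    ("www.Example.com.", "a", "192.0.2.10"),
    ("example.com", "MX", "10 mail.example.com"),
]
new_zone = [("www.example.com", "A", "192.0.2.10")]

missing, unexpected = diff_records(old_zone, new_zone)
print("missing after migration:", missing)
print("unexpected after migration:", unexpected)
```

An empty `missing` list is the signal that nothing was dropped; anything in `unexpected` is worth reviewing before switching name servers.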

How to Run Java Inside Docker: Best Practices for Building Containerized Web Applications [Tutorial]

Containers are no longer a thing of the future – they are all around us. Companies use them to run everything – from the simplest scripts to large applications. You create a container and run the same thing locally, in the test environment, in QA, and finally in production. A container is a stateless box built with minimal requirements and, unlike a virtual machine, without the need to virtualize the whole operating system.

Interacting With Your First Shipa API Call with Postman

The beauty of Shipa is that no matter how the surrounding ecosystem changes (for example, your Continuous Delivery or Infrastructure-as-Code stacks), the Shipa API stays the same. If you are curious about interacting with this mystical API, there are a lot of surrounding integrations that do that for you. If you want to interact with the API directly, though, you can send HTTP requests to the Shipa API itself to create any sort of integration you require.

Open source security coverage and compliance with Ubuntu Pro on public clouds

For businesses utilising public clouds, choosing an open source platform offers considerable advantages. Open source solutions can help reduce costs, provide access to the most leading-edge enterprise-grade features, and eliminate risks such as vendor lock-in, lack of support, or long-term security maintenance.

How to Manage Cloud Adoption Without Impacting your Users

All companies are going through some form of cloud adoption - whether cloud migration for the first time, hybrid cloud adoption, or extending cloud-native with a newer microservice architecture. But, according to a recent survey by Aptum, only 39% of companies are completely satisfied with their current rate of digital transformation.

Config best practices: concurrency and parallelism

When was the last time you updated your CI/CD workflow? A year ago? Never? You are not alone, my friends. Reconfiguring workflows can be one of the most daunting tasks for DevOps practitioners. But with new opportunities to benefit from CircleCI plans, there’s one simple and effective place to start: understanding concurrency and parallelism. Using concurrency and parallelism can cut your build times significantly. But you need to know what they are and how to find them in your config file.
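One common form of parallelism in CI is splitting a single job's test files across several identical executors, so that each one runs only its share. The sketch below shows the idea with a simple round-robin split; it is a generic illustration rather than CircleCI's own test-splitting logic, and the `shard` function and its parameters are names invented for this example.

```python
def shard(test_files, node_index, node_total):
    """Return the subset of test files that one parallel node should run.

    Files are sorted first so every node computes the same assignment,
    then dealt out round-robin by position.
    """
    if not 0 <= node_index < node_total:
        raise ValueError("node_index must be in [0, node_total)")
    return sorted(test_files)[node_index::node_total]


files = ["test_d.py", "test_a.py", "test_c.py", "test_b.py", "test_e.py"]
for index in range(2):
    print(f"node {index}:", shard(files, index, 2))
```

Round-robin by name is the simplest possible policy; splitting by historical test timings usually balances the nodes better, at the cost of needing that timing data.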

SRE vs. Platform Engineering: The Key Differences, Explained

Site Reliability Engineering (SRE) teams and Platform Engineering teams share similar goals -- like maximizing automation and reducing toil -- and similar methodologies. But they have different priorities, and use somewhat different tools to achieve them. What are SREs, what are platform engineers and how is each role similar and different? This article explains.

The Top 6 Challenges Enterprises Face Deploying Kubernetes in Hybrid Cloud Environments

In a relatively short amount of time, Kubernetes has evolved from an internal container orchestration tool at Google to the most important cloud-native technology across the world. Its rise in popularity has made Kubernetes the preferred way to build new software experiences and modernize existing applications at scale in the cloud.

Five Considerations for Choosing Self-Managed Automation vs. SaaS Automation

Sometimes heritage is better than new. Some people favor Coca-Cola Classic over New Coke, and heirloom tomatoes over regular tomatoes. Some Luddites might say the same thing about cloud computing. “I won’t put my (app/data) in the cloud! It will be more (secure | reliable | cheaper) if I run it myself in my own data center.”

How to Gather Insights From Your Network Traffic Pattern Analysis

What’s your network doing right now? Where is traffic flowing to, and where’s it coming from? Are there bottlenecks you don’t know about? Where’s the next problem going to be? Network traffic pattern analysis answers these questions and more. It’s a way for you to examine how your clients use your networks. You may think you know how heavily your clients utilize each segment and VLAN and where the weak points are. But do you?

Git Integration for Jira Cloud Release: Atlassian Data Residency & New User Interface

Git Integration for Jira Cloud has seen some major updates in the last quarter, including new data residency support for the EU and US regions and a BRAND NEW interface for Jira administrators: giving you more control over managing your integrations and repositories directly from Jira. Let’s dive into the new features and improvements for Git Integration for Jira Cloud.

HugOps During Downtime: Building Empathetic Teams

While DevOps focuses on software, HugOps focuses on the people behind the software. HugOps is a way to show empathy and appreciation for the real people who are involved in building, shipping, and running software. It’s a way to acknowledge and celebrate those – the Site Reliability Engineers (SREs), SysAdmins, Engineers, and Support Staff – who are working tirelessly behind the scenes to keep the services that we rely on running smoothly.

Business Activity Monitoring - Achieving end-to-end tracking on business process flow

Serverless360 is a Cloud management platform engineered for Microsoft Azure that brings enterprise-grade monitoring, tracing, remediation & governance under one roof. Everything you need to empower your Azure operations teams with more meaningful features and deliver effortless support. Achieve end-to-end tracking on business process flow across Azure resources and hybrid integrations. Get visibility on the integration solution that the functional operations teams need. Improve operational efficiency with a unified view of business transactions.

Business Applications - Unified Observability for seamless Azure operations

Serverless360 is a cloud management platform engineered for Microsoft Azure that brings enterprise-grade monitoring, tracing, remediation & governance under one roof. Everything you need to empower your Azure operations teams with more meaningful features and deliver effortless support. Visualize your Business Applications, spot any issues, and fix them. Take advantage of Serverless360 to improve the overall operational efficiency of your Azure Team!

Azure Documenter - Autogenerate documentation to turn live Azure Subscriptions data into insights

Serverless360 is a Cloud management platform engineered for Microsoft Azure that brings enterprise-grade monitoring, tracing, remediation & governance under one roof. Everything you need to empower your Azure operations teams with more meaningful features and deliver effortless support. With the Azure Documenter feature, Microsoft Azure Subscription is made readable to derive deeper insights. Autogenerate documents in minutes and share with only targeted individuals without putting security at risk.

Getting Started with Gremlin Attacks

Gremlin provides a variety of ways to test the resilience of your systems, which we call "attacks". Running different attacks lets you uncover unexpected behaviors, validate resilience mechanisms, and improve the overall reliability of your systems and services. This ebook explains each of Gremlin's attacks in complete detail, including what each attack does, how it impacts your systems, and the technical and business objectives the attack helps solve.

The journey from mir-kiosk to Ubuntu Frame

We now deliver Ubuntu Frame, a display server for embedded devices that makes it easy for developers to deploy their graphics applications on Linux. Ubuntu Frame simplifies the development of embedded displays such as digital signage solutions, kiosks, IoT devices, robots, and more. In this blog, we will discuss the migration from mir-kiosk to Ubuntu Frame.

Get your Puppet fundamentals free with new instructor-led training

Want to land one of the 20,000 jobs calling for Puppet experience? Or perhaps you already know Puppet but want to advance your skills in order to scale your infrastructure even more quickly and securely? Starting this May, we’re making it easier than ever for you to learn Puppet with an all-new instructor-led curriculum and the opportunity to take all of our Fundamental Core instructor-led training for free.

ValidKube Update: Adding Polaris to Auto-Audit K8s YAMLs

A month and a half ago we released ValidKube, our first OS project that fused the capabilities of three other popular OS tools (kubeval, kubectl-neat and trivy) in a single easy-to-use microsite. Using the microsite, any user could ensure the security and hygiene of their K8s YAML, with just a few clicks of the button, pretty much on the fly. ValidKube was born out of a straightforward concept and we were happy to see its user-friendly approach resonate almost immediately.

5 Cloud Predictions For 2022: Say Goodbye to Legacy Cost Management

The past two years have seen erratic cloud spend thanks to the upheaval caused by the global pandemic. Looking ahead to 2022, business is beginning to normalize. Within that “new normal,” there is no question that cloud is more popular than ever, and Kubernetes is at an all-time high. Some companies are continuing to operate remotely, or find that they are thriving thanks to an investment in the cloud.

Everything you wanted to know about Securing the Software Supply Chain

You know you need to secure your software supply chain. Everyone’s telling you that these days - your executives, your vendors, even the United States government. Your organization has an initiative to do so, or maybe they’ve brought in an expert to help you achieve this goal. But hold on a minute - do we have a shared understanding of what a software supply chain is, and what exactly makes it secure?

Firmus Supercloud sets a new standard for sustainable cloud computing with Canonical's open infrastructure

March 29th, 2022—Canonical, the publisher of Ubuntu, announces that Firmus, the Australian cloud infrastructure provider that is revolutionising data centre technology, has built its ultra-efficient and sustainable public cloud on Canonical’s Charmed OpenStack and Charmed Kubernetes.

How Girls Who Code Accelerated Kubernetes Adoption During the COVID-19 Pandemic

Running Kubernetes in production at scale can be a huge challenge for today’s organizations. And few companies have the right platform, experience, and skills to get there themselves. This was the case with Girls Who Code, an international nonprofit organization working to close the gender gap in technology, who had to quickly change course and develop things that weren’t on their radar months ago because of the COVID-19 pandemic.

How to build a strong incident response process

When building an incident response process, it’s easy to get overwhelmed by all the moving parts. Less is more: focus first on building solid foundations that you can develop over time. Here are three things we think form a key part of a strong process. I’d recommend taking these one at a time, introducing incident response throughout your organisation. Just being honest: we’re a startup selling incident management software.

Kubernetes Cloud Deployments with Terraform

Kubernetes is a rich ecosystem, and the native YAML or JSON manifest files remain a popular way to deploy applications. YAML’s support for multi-document files makes it often possible to describe complex applications with a single file. The Kubernetes CLI also allows for many individual YAML or JSON files to be applied at once by referencing their parent directory, reducing most Kubernetes deployments to a single kubectl call.
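To make the multi-document point concrete, the sketch below splits one YAML string holding two manifests into its individual documents. It is a deliberately simplified illustration that uses only the standard library: it treats any line consisting solely of `---` as a separator and does not handle every case a real YAML parser would (for example, content on the same line as the marker). The `split_documents` helper and the sample manifest are assumptions made for this example.

```python
def split_documents(text):
    """Split a multi-document YAML string on '---' separator lines."""
    docs, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            # A separator closes the current document, if it has any content.
            if any(part.strip() for part in current):
                docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):
        docs.append("\n".join(current))
    return docs


# Hypothetical manifest: a Service and a Deployment in a single file.
manifest = """\
apiVersion: v1
kind: Service
metadata:
  name: web
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
"""

for number, doc in enumerate(split_documents(manifest), start=1):
    print(f"document {number}:")
    print(doc)
```

Each resulting document describes one Kubernetes resource, which is what lets a single file stand in for a whole application.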

3 Reasons Your CMDB Strategy Isn't Working

A well-maintained, accurate configuration management database (CMDB) is the cornerstone of successful IT operations, offering clarity about what’s in your environment and how it all works in tandem. Yet, statistics indicate that only 25% of organizations are actually receiving meaningful value from their CMDB investments. This is due in large part to ever-changing IT assets and the increasing complexity of infrastructure and applications.

Why ZTNA Solutions are Important Right Now

2021 marked the fifth consecutive year of record-breaking security attacks. Zero-Day attacks skyrocketed, with 66 exploits found to be in use, more than any other year on record and almost double 2020’s figure. Meanwhile, a staggering 66% of organizations have suffered at least one ransomware attack in the last year, with the average ransom payment soaring by 63% to $1.79 million (USD).

Sponsored Post

How important is Observability for SRE?

Observability is what defines a strong SRE team. In this blog, we have covered the importance of observability, and how SREs can leverage it to enhance their business. Observability is the practice of assessing a system's internal state by observing its external outputs. Through instrumentation, systems can provide telemetry such as metrics, traces, and logs that help organizations better understand, debug, maintain and evolve their platforms.

Best Practices For Deploying Web Applications

Deployment is an essential stage of any software development project. With the fast pace of modern agile software development, deployment frequency is increasing rapidly. Although releasing often looks very appealing because of its positive impact on customer satisfaction and engagement, it is also risky. What if the deployment goes wrong in production?

Top 5 CI/CD best practices

For engineering teams, CI/CD is the key to improving their development cycles. CircleCI is committed to helping our customers optimize their pipelines to streamline delivery to production. If your team values speeding your time to market, commit to trying these 5 best practices. These are CircleCI’s recommendations for cutting your development cycle times and improving your CI/CD processes in general.

Our Approach to Machine Learning

There is a lot of buzz in the world of machine learning (ML) and as a layperson it can be hard to keep up with it all. Therefore, we decided to write down some of our thoughts and musings on how we are approaching ML at Netdata. We’ll touch on the current state of applied ML in industry in general, and zoom in on ML in the monitoring industry.

Rundeck + Squadcast Integration: Simplifying Alert Routing

Rundeck is an automation tool that helps to make existing automation, scripts, and commands more secure, auditable, and easier to run. It is a software job scheduler and runbook automation system that automates routine processes across development and production environments. It brings together task scheduling, multi-node command execution, and workflow orchestration. It also logs everything that happens in the system. Squadcast is an end-to-end incident response tool.

The 7 Best Cloud Financial Planning Solutions For Managing Costs

Making better decisions doesn’t always require more data. Finding the right data and making sure the right people have it at the right time does. This is particularly important for companies that use the cloud. Using cloud financial planning solutions is an excellent way to automate, extend, and align goals with business outcomes in light of the cloud’s complexity. The solution can also help eliminate cumbersome, error-prone manual processes.

Visibility Anywhere: Key Takeaways from the NetOps Virtual Summit

What do big mountain ascents and modern network operations have in common? You’ll only succeed when you’re learning from experience. This was one among many compelling takeaways that attendees took from our recent NetOps Summit. Centered on the theme “visibility anywhere,” this event featured a number of compelling presentations, including a keynote from Jimmy Chin, the professional climber, photographer, and Academy Award-winning filmmaker.

Kubernetes Easy Button - Running Your JS Apps on Kubernetes with Shipa

Kubernetes is becoming a dominant platform for running workloads. As the Kubernetes ecosystem continues to advance, capturing a wider swath of workloads, eventually your code might be headed to Kubernetes. As a Tech Lead at Shipa responsible for front-end engineering, i.e., what you see on the screen, my job crosses JavaScript frameworks and Kubernetes on a daily basis.

Unit Test vs Integration Test | Major Difference between Unit Testing and Integration Test

Developing quality software is considered incomplete without writing tests. Not only do tests assure quality, but they also help developers greatly when refactoring or rewriting a piece of code. When it comes to testing, having well-planned and thorough testing throughout the software development cycle is very important. The most commonly used types of tests today are unit tests and integration tests.

Monitor your AWS Lambda functions' ephemeral storage usage

AWS Lambda is AWS’s solution for highly portable, serverless computing. With Lambda functions, you can deploy and run business logic code without managing the underlying servers. Today, AWS announced that Lambda customers can now provision up to 10 GB of ephemeral storage for each of their functions, making them well-suited for new, data-intensive workloads—including machine learning inference, large media file processing, financial analysis, and more.

SREcon 2022 Americas Wrap Up

Hi everyone! We had a fantastic time at SREcon 2022 Americas last week, and I thought I’d share our stories and experiences. As the SRE community grows and evolves, these chances for collaboration become more and more important… and fun! Although I only attended virtually, I could still feel an exciting atmosphere as great minds came together.

SolarWinds Orion + Squadcast: Alert Routing Made Easy

SolarWinds Orion is a scalable infrastructure monitoring and management platform. It is designed to simplify IT administration for on-premises, hybrid, and software as a service (SaaS) environments, in a single pane of glass. SolarWinds Orion ensures you do not have to struggle with numerous incompatible point monitoring products, as it consolidates the full suite of monitoring capabilities into one platform with cross-stack integrated functionality. Squadcast is an end-to-end incident response tool.

Why DevOps is all about creating one team

A Contract Solutions Architect for a manufacturing software development company, Tonie Huizer has 20+ years’ experience working with Azure, SQL and other Microsoft technologies. A Microsoft Certified Azure Developer Associate, he has the knowledge to design, build, test, and maintain cloud applications and services on Microsoft Azure.

New StackPod Episode: Best Practices for AWS Observability With Russell Foster of StackState

We’re excited to share that we are celebrating our tenth podcast episode! For this episode, we invited Russell Foster. As a DevOps engineer at StackState, Russell is responsible for making sure our SaaS product runs smoothly on AWS. Over the years, Russell has worked at both startups and more mature companies, where his responsibilities ranged from keeping things up and running in cloud environments to making sure hybrid and on-premise environments remain stable and reliable.

How Many Servers Do I Need for Your Monitoring Solution?

What kind of questions do you ask when you look at infrastructure monitoring tools? You probably start with some of the more important ones: All of the above are good questions and definitely should be asked. After all, if you acquire a solution that only monitors half of your network devices, has an incredibly complicated and difficult to use UI or doesn’t move you from reactive to proactive, the benefits are limited.

Simplify, Secure, and Optimize your Multi-cloud Container Infrastructure with VMware Tanzu for Kubernetes Operations

Ning Ge and Keith Miracle co-wrote this post. Amidst many social and economic disruptions that have arisen in the last few years, enterprises have been forced to quicken the pace of their digital transformation initiatives, adding and consuming cloud-based capacity and capability just to stay competitive, relevant, and, for some, in business.

Continuous Availability: How It's Changed, and Why It's Critical

Remember when Slack went down in early January? The three-hour outage, set off by AWS capacity issues, cost the company an untold amount of money. And the effects rippled across the enterprise. The outage devalued the company’s stock and seemed to send all 142,000 of its customers to Twitter to gripe. This high-profile outage is just the most recent of many outages highlighting the critical nature of continuous availability. And there’s only one answer to the problem.

Shift Left Reliability Meetup March - Reliability patterns for serverless applications

Serverless technologies offer a great foundation for building resilient applications that can withstand much turbulence in the production environment. For instance, AWS Lambda automatically deploys your code to three availability zones and replaces faulty virtual machines on the fly. Despite this, there are many other types of failures that can still affect our application. Perhaps there is an outage with a third-party service we depend on, or maybe a sudden surge in throughput has pushed us over the throughput limit and caused some user requests to be throttled.
Sponsored Post

Orchestration vs Automation: Which Does Your Business Need?

Digital transformation is accelerating rapidly to include virtually all enterprise functions. Organizations of all sizes, across all industries, are leveraging digital technology to enhance customer service and improve work efficiency. Integrating automation into core business functions has become a must to stay aligned with the ongoing digital revolution. The growing migration to the cloud has resulted in the distribution of company data and applications across multiple locations. This means that many complex business processes must leverage IT resources from the cloud and on-premises. This is where automation and orchestration can greatly improve the performance and efficiency of these complex tasks.

Practical Tips & Tricks For Securing Your CI/CD Pipelines

Many enterprises still struggle to get security right. To protect their business, it is critical they focus on security during the entire infrastructure and application lifecycle, including continuous integration and deployment. In this workshop, we will cover security mechanisms you can employ in your CI/CD pipelines to tighten security while enabling developers to push their code, quickly and safely.

Improve Observability in Your CI/CD Pipeline

The most basic component of automated software development is a CI/CD pipeline. While the term "pipeline" has been used to describe a wide range of computer science concepts, we use it at CircleCI and throughout the DevOps industry to refer to the vast range of behaviors and activities that are involved in continuous integration (CI).

Flyway Desktop: Working with PostgreSQL

Flyway Desktop now supports PostgreSQL. Now you can take advantage of a hybrid development environment that allows you to control your database objects in a state-based manner in development, but then take full advantage of a migrations approach for deployments. This makes using Flyway for automated deployments even easier to manage and implement.

Kubernetes Master Class: Creating RKE2 Cluster Templates

Rancher 2.6 introduces a new Cluster-API based provisioning mechanism for RKE2 and K3s clusters. This also brings a completely new cluster templating system, which is based on Helm charts and is much more flexible compared to the old RKE1 cluster templates. In this master class, you will learn how the Cluster API works, how you can leverage it in Helm Charts, how to do versioning and how to create a nice UI wizard for them.

What Are the Differences Between Elastic Beanstalk, EKS, ECS, EC2, Lambda, and Fargate?

Life before containerization was a sore spot for developers. The satisfaction of writing code was constantly overshadowed by the frustration of attempting to force code into production. For many, deployments meant hours of reconfiguring libraries and dependencies for each environment. It was a tedious process prone to error, and it led to a lot of rework. Today, developers can deploy code using new technology such as cloud computing, containers, and container orchestration.

AWS Cost Allocation Tags Explained: When Should You Use Them?

Knowing only how much you spent on your AWS bill each month isn’t enough to provide the cost visibility you need to make cost-aware engineering and business decisions. That's why AWS introduced tags. Tags enable AWS users to label their resources to track cost and usage in the vast AWS infrastructure. The goal of tagging is to help you understand the who, what, and why behind changes in your cloud spend.

DevOps State of Mind Ep. 9: Recruiting for a DevOps Culture

Liesse Jones: Today we're joined by Anna-Marie Gutierrez-Lee, affectionately known as AMG, who's the Director of Talent Acquisition at LogDNA. She's passionate about mentoring recruiting teams and connecting talent to their dream careers, while fostering a genuine and positive candidate experience. Today, we're going to talk about how to recruit for a DevOps culture and why it's so important to bring more underrepresented talent into tech.

Rancher Desktop Now Includes The Rancher Dashboard

With the 1.2.0 release of Rancher Desktop, there are two new features available as a Feature Preview. Rancher, the multi-cluster Kubernetes manager, includes a dashboard which enables you to see and interact with resources in a Kubernetes cluster. Rancher Desktop now includes this dashboard. The dashboard will enable you to view and interact with resources in your local cluster provided by Rancher Desktop.

Don't Forget About Kubernetes Jobs - Shipa Jobs Support

When I was making my first switch from a product engineering team to being a field-facing software engineer, one of my first projects was an integration project for a federal agency. The very first piece of enterprise software I was exposed to, aside from my productivity and development suite, was BMC’s Control-M, about 15 years ago. A lot of batch files to extract and transform data had to be run in order and on a daily basis; Control-M at the time was a job runner.

How to Model Your GitOps Environments and Promote Releases between Them

Two of the most important questions that people ask themselves on day 2 after adopting GitOps are: In the previous article of the series, I focused on what NOT to do and explained why using Git branches for different environments is a bad idea. I also hinted that the “environment-per-folder” approach is a better idea. This article has proved hugely popular and several people wanted to see all the details about the suggested structure for environments when folders are used.

Benefits of scheduling continuous integration pipelines

Scheduling is an integral part of software development practices. Tools for scheduling jobs help development teams save time by scheduling recurring tasks — like modifying a database or sending out periodic emails — for execution at specified times. There are many to choose from, including cron for Linux, scheduled tasks for Windows, launchd for macOS, Jobber, and anacron.

Logic App Best practices, Tips and Tricks: #7 Learn from failures

Welcome once again to another Logic App Best practices, Tips, and Tricks. In my previous blog posts, I talked about some of the most essential best practices you should have while working with the Azure Logic App: And some tips and tricks: Today I’m going to speak about another critical Best practice, Tips, and Tricks that is often overlooked: learning from failures.

GitKraken Client Tutorial: How to Use the Git GUI & CLI

GitKraken Client is the most popular Git client in the world and the only one that comes equipped with both a graphical user interface (GUI) and command line interface (CLI). In this tutorial video you’ll learn how to easily and safely leverage the full power of Git. Use chapters to quickly skip ahead.

Implementing Service Reliability In The World Of Remote Teams

In this new era that we are moving into, what does successful reliability look like for modern teams, and what are the requirements that will enable us to bring better reliability to our applications and systems? With new ways of working, we explore how organizations should implement better service reliability and the different challenges teams are facing.

Five Phases Of Effective Reliability Within Organizations

Reliability is important to everybody in a business. There’s a common misconception that it’s just important to engineers. We must change this mindset and think of reliability as a team sport that everyone needs to be part of. As an organization, there are five key phases to implementing effective reliability across teams.

A practical approach to Active Directory Domain Services, Part 2: Active Directory and the Domain Name System

For readers who have returned to this blog after understanding the basics of Active Directory (AD) in part 1 of this series, welcome back! For all new readers: Hello! Get ready to jump into the world of AD. It would be good to take a quick peek at what was covered in part 1 before you continue. Be sure to read through part 1 as it will be your guide to: Part 2 of this series aims to introduce the interrelation of AD with the Domain Name System (DNS).

How to Scale your AWS Infrastructure - Part 2

Welcome to the second post in a series of “How to Scale your AWS Infrastructure”. In the first post, we talked about horizontal scaling, autoscaling, CI/CD, infrastructure automation, containerization, etc. In this post, we will continue the discussion around databases, loose coupling, caching, CDN, etc. Let’s start the discussion with database scaling.

Podcast: Break Things on Purpose | Chris Martello: Day of Darkness

Dad jokes lead the way in this episode as we interview Chris Martello, manager of application performance at Cengage. Chris is a wearer of many testing hats, but his passion is chaos and breaking things on purpose. With his background as a middle school science teacher, chaos engineering was a natural fit for Chris when he made the jump to tech.

PagerDuty Runbook Automation Joins the PagerDuty Process Automation Portfolio

Spring is blooming here at PagerDuty, and so is our automation product line. We’re thrilled to share some exciting product announcements. First, we’ve officially rebranded our automation product line, Rundeck®, as PagerDuty® Process Automation. Fundamentally, everyone who buys Rundeck becomes a PagerDuty customer, so we decided to make it less confusing.

What's a fair compensation for being on-call?

For the vast majority of organisations, it’s necessary to have some form of round the clock cover to support the business. Whilst it’s most commonly a concern for engineering, it’s increasingly common to have folks from various disciplines available out-of-hours. Irrespective of role, compensating people fairly is an important factor of running a healthy and effective on-call system.

Join the Smart Cloud-Native Revolution

We are in the midst of a digital revolution that started with the PC, Internet, and mobile phone and has continued to accelerate. In this current wave, the cloud, Kubernetes, artificial intelligence (AI), and intelligent automation are combining to create the next major disruption, which we call smart cloud-native. Smart cloud-native is a powerful force that is transforming data centers, workforces, customer experiences, and the way enterprises do business.

Zenoss Core Sunset

Last week, we announced that we are sunsetting Zenoss Community Edition, which was previously called Zenoss Core. (The title of this blog post refers to it as Zenoss Core, as that was the name for most of those 15 years and is the name by which most people know it.) Zenoss Community Edition was a free, on-prem monitoring tool the company made available for over 15 years, which had been downloaded millions of times. Zenoss Community Edition version 1.0 was released Nov. 15, 2006.

Decoding the robust Azure architectures with fail-proof monitoring

Have you just begun your cloud journey to Azure by moving away from on-prem? If so, it’s always better to opt for the right set of patterns and strategies for developing and monitoring your applications on the Azure cloud. In this webinar, Tord Glad Nordahl, Microsoft Azure MVP, exclusively exposed the secret sauce for building and monitoring innovative cloud-native applications. Major topics covered,

Using Telegraf to send syslog metrics to Graphite

When you own and operate software, it generates various types of logs from disparate sources such as databases, servers, and applications. The metrics from these important digital assets are what companies monitor continuously. When they show signs of unreliability, companies need to take swift action to fix the cause and prevent it from growing into a larger problem. The key to success in this activity is having a good Syslog application and metrics software where you can clearly see metrics.

The Ugly Truth About (Most) Cloud Rightsizing Recommendations

Rightsizing is about finding the optimal cloud configuration options to ensure that you get the performance you need—within any given constraints you are operating under—at the lowest possible cost. This is a simple proposition, but deceptively so. For one thing, business requirements are constantly changing, meaning that your workloads must adapt to support them, which in turn changes their operating parameters.

CI/CD Benchmarks for High Performing Teams in 2022

Software delivery has never been more critical to the success of business in every industry. It’s also never been more complex. With the growing challenges of complexity, how can engineering teams succeed? CircleCI examined 55 million data points from more than 44,000 organizations and 160,000 projects to help guide team development and software delivery decisions. Benchmarks from the report show that the highest performing teams prioritize being in a state of deploy-readiness, deploy more often and recover faster.

Foolproof Cloud Monitoring: 6 Ways to Utilize the Tools at Your Disposal

The cloud offers unparalleled flexibility. However, that flexibility comes at a cost. The amount of moving pieces increases. The environment becomes more heterogeneous. So, if you want to stay on top of things, you need a more comprehensive view of your cloud infrastructure. After all, you don’t want your customers to realize that something has gone awry before your people do. In this post, I’m going to talk about cloud monitoring.

Build an automated invoice generator application

As a software engineer and technical content creator, I work with a lot of companies on many different contracts. To get paid for my work, most companies require that I send an invoice. Sometimes they want one daily, at the end of the week, or even when the project has been completed. Sending an invoice to my clients is crucial because it determines when and if I will get paid on time. If this sounds like a repetitive task that can eat deep into my productive hours, you are right.

How to Kustomize your Codefresh/Argo Runtime

The Codefresh Software Delivery Platform (CSDP) brings together the complete open source Argo toolset (Workflows, Events, CD, and Rollouts) into a single platform for enhanced efficiency and visibility of software deployments at massive scale. If you’re a new CSDP user, one of the first things you’ll do is install the CSDP runtime in one of your Kubernetes clusters.

6 Metrics to Watch for on Your Kubernetes Cluster

Kubernetes. Nowadays it seems companies in the industry are divided into two pools: those that already use it heavily for their production workloads and those that are migrating their workloads into it. The issue with Kubernetes is that it is not a single system the way Redis, RabbitMQ, or PostgreSQL are. It is a combination of several control plane components (for example etcd, API server) that run our workloads on the user (data) plane over a fleet of VMs.

The Six Trends Overwhelming IT Ops-and What to Do About Them

IT Operations is experiencing lightning-fast change right now. From the emergence of cloud computing to the explosion of data—not to mention ever-present cyber threats—every day is a new day for IT Ops. At BigPanda, we’re laser-focused on making life easier for IT Ops teams, which means we’re staying on top of all this change to help IT Ops keep up.

Centrally Manage, Secure, and Monitor Kubernetes using VMware Tanzu for Kubernetes Operations

Kubernetes has become the de facto platform for running containerized workloads. Kubernetes brings a set of APIs for managing applications that can work with multiple infrastructure/cloud providers. Whether you want to deploy a containerized application on vSphere, AWS, or Azure, as long as Kubernetes is deployed in these environments, the API being used to request a container deployment stays the same. This helps application development teams tremendously.

Logic App Best practices, Tips and Tricks: #6 Error handling... configure run after settings

In my previous blog posts, I talked about some of the most essential best practices you should have while working with the Azure Logic App: And some tips and tricks: Today I’m going to speak about another critical Best practice, Tips, and Tricks: implementing Error handling inside Logic Apps.

5 takeaways from the CNCF Annual Survey 2021

The CNCF Annual Survey 2021 is in and makes for some very encouraging reading for the future of Kubernetes and its place in the tech landscape. The 2021 survey was the biggest yet, with some 3,829 developers, engineers, architects, and C-level execs in the cloud native space taking part. Here are some of our key takeaways…

Tutorial: How to Connect Jupyter Notebooks to Ocean for Apache Spark

Jupyter Notebook is a web-based interactive computational environment for creating notebook documents. It supports programming languages – such as Python, Scala, R – and is largely used for data engineering, data analysis, machine learning, and further interactive, exploratory computing. Think of notebooks like a developer console or terminal, but with an intuitive UI that allows for efficient iteration, debugging or exploration.

Unified Serverless Observability With OpenTelemetry and StackState v4.6

StackState has always believed in the importance of open source and open standards, and we’ve demonstrated our commitment through ongoing support of open technologies. From the beginning, StackState supported StatsD and OpenMetrics. Even our agent is open source, designed to help organizations easily onboard our platform and to give them an extensible open way to observe their services. StackState is now proud to announce our next big open source step.

The Best Engineering OKRs: 5 Real Examples That Get Results

As a framework for goal-setting, the Objectives and Key Results (OKRs) methodology is an incredibly useful tool when implemented properly. It can help your engineering team better plan and stay aligned toward common objectives through the duration of the development process. On an individual level, OKRs can also encourage each team member to make personal progress toward their own goals or common team goals that would benefit the company.

How to Scale your AWS Infrastructure - Part 1

When designing a solution, you should keep future needs in mind. If the number of users increases dramatically in a short period of time, the solution should be scalable enough to handle the new growth. Making systems scalable in the cloud is easier than scaling on-premises infrastructure. AWS provides excellent tools and services to make your applications as scalable as you want.

How ChatOps Helps IT Teams Work More Effectively

From setting up new hires with everything they need to get to work to troubleshooting technical difficulties, IT teams often field the same kinds of requests over and over. And while each request might feel like a small task, collectively they can add up to a huge time sink in the long run.

Schedule database backups for MongoDB in a Node.js application

Database backup protects your data by creating a copy of your database locally, or remotely on a backup server. This operation is often performed manually by database administrators. Like every other human-dependent activity, it is susceptible to errors and requires lots of time. Regularly scheduled backups go a long way to safeguarding your customers’ details in the case of operating system failure or security breach.

Monitoring as code for modern DevOps teams

Software engineering teams that adopt “as-code” practices, like using configuration files and automated workflows instead of manual configuration and tools, gain major improvements in velocity. But even companies that enjoy the success of as-code practices for development and delivery lag behind in applying them to operational concerns like monitoring and observability.

Observability vs Visibility - what's the difference?

Observability is a new term that’s slowly entered the mainstream over the last two years. Today it’s used in the context of monitoring, but it’s much more than that. And it also goes way beyond visibility. So, in this blog, we set out to explore observability vs visibility and find out, what’s the difference? In a recent podcast, our friends at Riverbed neatly explained that seeing and observing are two different things, and can be compared to hearing vs listening.

Honeycomb + Squadcast Integration: Routing Incident Alerts Made Easy

Honeycomb is an application monitoring tool that helps DevOps and SRE teams to operate more efficiently by offering rich observability solutions and intuitive team collaboration. It helps understand complex relationships within your distributed systems and troubleshoot issues accordingly. Squadcast is an end-to-end incident response tool. Built with an SRE mindset, it streamlines all the incident response activities.

AWS Budgets Vs. AWS Cost Explorer: The Ultimate Comparison Guide

AWS currently offers over 200 services. Some of those make up the AWS Cost Management suite. This group comprises AWS Cost and Usage Report (CUR), AWS Budgets, AWS Cost Explorer, AWS Cost Categories, and AWS Cost Anomaly Detection. Budgets and Cost Explorer are an excellent pair of complementary tools in this group. They have similarities that cause users to wonder if they need both and, if so, why. Here is a brief overview of the differences between AWS Budgets and Cost Explorer.

Logic App Best practices, Tips and Tricks: #5 Delete comments

Are you surprised? Are you wondering where the first four tips are? I started this series of blog posts on my blog, and you can see and read the previous Best practices, Tips, and Tricks here: And I will be sharing some of them here and others on my blog. So stay tuned for both blogs. Of course, the most recurring task is adding comments to our triggers and actions, but it is always good to know how to delete them. Some of you may be thinking that is a trivial task, as simple as adding a comment.

Docker's genius shift: How one decision set the course for success ft. Justin Cormack

Docker CTO Justin Cormack joins Rob Zuber to discuss how Docker moved from very few paid users among millions of others to their current level of success. In this episode, Cormack shares the paths Docker took to transform its massive user base into a sustainable business model. Hint: sometimes product-market fit might be right in front of you. Have a topic you want to discuss? Reach out to us on Twitter @circleci!

DevOps vs SRE - Reducing Technical Debt and Increasing Efficiency and Resiliency

One more blog topic stemming from the weekly office hours that we hold with the field team here at Shipa. In our last office hours, I was asked a question about “what is the difference between DevOps Engineers and SREs?”. Both professions are emerging disciplines and cultures that continue to evolve and play an important role in technology organizations. I’ve been fortunate to have written and spoken about this before, though here I am taking a fresh look at what the two domains try to accomplish.

Observability versus monitoring in software development

To supervise the behavior of distributed applications and track the origin of service failures and downtime, developers often use traditional monitoring technologies and tools. However, this approach can fall short in its ability to measure the overall health of modern cloud-native architectures, which can span multiple hosting environments and encompass hundreds of microservices.

What Is Automated Discovery and Dependency Mapping (DDM) and Why Do You Need It?

In a perfect world, your Configuration Management Database (CMDB) acts as the single source of truth for all your IT device inventory and the relationships between those devices. However, maintaining accuracy is easier said than done. That’s because the traditional method for provisioning and maintaining a CMDB is complex, unwieldy, and outdated the second it's updated. To keep up with the needs of a modern CMDB, an automated discovery and dependency mapping (DDM) solution is a must.

Quick-Start Guide to Using VMware Tanzu Mission Control and vSphere with Tanzu Services

Explosive growth of web traffic and services is forcing organizations to modernize and optimize their infrastructures. Kubernetes is core to the strategy and modernization story, but it’s only one piece. As VMware engages with its customers, significant complexities and resource needs arise that are not always apparent in the planning stages of Kubernetes deployments. The complexity of even a single deployment can introduce delays and slow projects to a crawl.

Platform.sh commits to helping its customers reduce carbon emissions from cloud activities

Platform.sh, a unified, secure, enterprise-grade platform for building, running and scaling web applications, has worked with Greenly to calculate its carbon emissions to provide a clear picture to its customers.

Why just one person can't buy things that work well

It's too difficult for product teams to find the right vendors. Vendors obscure details, promise everything or downright lie, have special pricing for those who know how to ask, and there are just too many of them! This problem is getting worse because of the "Cambrian explosion" in cloud tooling, a blossoming in the number of solutions and niche specializations.

Salesforce Cloud + Squadcast Integration: Routing Detailed Incident Alerts

Salesforce Cloud is one of the leading cloud-based customer relationship management (CRM) solutions. It provides a shared view of your customers and their relationship with the business. With Salesforce Cloud, users can automate service processes and streamline workflows. Squadcast is an end-to-end incident response tool. Built with an SRE mindset, it streamlines all the incident response activities. Squadcast aligns your teams towards a common organizational goal of better reliability.

A B2B sales stack from Seed to Series A

I joined incident.io recently to lead Sales, after having set up my own company. In both startups, one of the first questions I’ve landed on was: “What sales tools should we use as we scale?”. In this post, I’ll walk through our sales stack, and by extension, what I think most B2B SaaS startups can get away with using when they have fewer than ~100 employees.

Closing the Gap: Deploying Automation the Right Way

Automation in the enterprise is nothing new. Engineers have been working with automation tools and frameworks for decades. From configuration management tools, to continuous integration and delivery pipelines to cloud formation, you name it—automation is part of the fabric of nearly any technology use case in the business landscape. If the previous statement is true, then why does automation still seem to pair with so much manual work?

Meet VirtualMetric's VMware Monitoring

With VirtualMetric VMware Monitoring, you can see all your VMware clusters, vCenters, servers, datastores, virtual machines, resource pools, etc. You also get 360-degree observability over your VMware environment. Get detailed statistics on processors, memory, storage usage, and network information. VMware Monitoring is a powerful and easy-to-use server monitoring and management tool.

Getting started with Packet Loss attacks

Imagine this: you're in the middle of an important presentation when all of a sudden your video feed starts to stutter. You hear other people speaking, but their words are choppy. A message comes through Slack from one of your co-workers: "I think your connection cut out." You scramble to try different solutions—restarting your videoconferencing application, checking your Internet connection, switching to your phone—but ultimately, your presentation gets cut short.

Introducing Bitbucket's redesigned Branch page

We are excited to announce that improvements to the Branch page will be available in Bitbucket Cloud in the coming weeks! Comparing two branches can be a critical step before creating a pull request. We recognize that it can be a cumbersome experience to see the Branch page displayed differently than the Pull request page.

Continuous integration for Angular applications

Automated testing is the foundation of your continuous integration practice. Automated testing clarifies the status of build processes for your team’s applications, ensures that tests run on every commit or pull request, and guarantees that you can make quick bug fixes before deploying to the production environment. In this tutorial, I will show you how to automate the testing of an Angular application.

Galileo Enhancements for Brocade Data Collection


Why Right-Sizing in the Cloud is Everything

Are you migrating to Azure or another public Cloud? A client of ours did, and they didn’t use Galileo Cloud Compass. Do you know what happened? They didn’t experience the savings they expected, and their Azure invoices were a lot more than expected—about 60% more. There are a couple of reasons for this. One is that they didn’t have an accurate way to price their workload for the Cloud. Just think about understanding the pricing model for each of the major Cloud vendors!

Wind River Studio Addresses Challenges of Managing Secure Linux-Based Intelligent Systems

Wind River eases costly management challenges typical of embedded Linux platforms for the full lifecycle. New managed services help teams achieve and maintain Linux platform stability, quality, and security, freeing resources to continue innovation and feature development. Wind River experts assess, recommend, and implement solutions aligned with functional, architecture, and performance requirements.

Should Your Startup Use AWS Managed Services?

Let’s face it. Gaining a competitive advantage in the target market is expensive. Even if you have a good idea and its execution plan in mind, operations related to management, storage, networking, service provisioning, security, and application management will cost you a fortune. To say the least, a cutting-edge IT infrastructure, a reliable team, and a strategy for rapid product releases or expansion/scaling are a must for your product’s success.

10 Best Snowflake Monitoring Tools (Updated 2022)

The Snowflake data cloud offers powerful data warehousing, analytics, and processing tools. The platform can handle many data workloads on one platform, helping organizations turn data into actionable insight across teams, departments, and regions. Snowflake’s architecture is also unique. Compute and storage are completely independent and both are highly elastic. Yet, Snowflake's per-second billing and highly elastic compute model demand frequent usage and cost monitoring.

How to Use Pub Repositories in Artifactory

If you’re one of the growing number of client app developers embracing the Dart programming language and Flutter and AngularDart toolkits, we’ve got some exciting news for you! JFrog can now welcome Dart developers to the empowerment of Artifactory’s robust binaries management and the ways that it contributes to continuous integration.

Automate the deployment of Angular apps to Firebase

Developers use JavaScript frameworks like Angular, React, and Vue.js to build every kind of single page application, from simple to complex. By separating JavaScript and CSS, frameworks let dev teams structure applications in modular chunks of code that carry out a single function. That is great, but once your application is ready for deployment to production, you will need a command to compile and bundle the separate files into a single one.

Cloudsmith Supports OpenSSF's Efforts to Secure OSS

As part of our mission to make it simple to secure software at scale through Continuous Packaging, Cloudsmith is excited to announce that we have become an Open Source Security Foundation (OpenSSF) member. OpenSSF is a cross-industry forum for a collaborative effort to improve security in open source software (OSS). One software pipeline's output is another's dependency: we are all splashing around in each other's supply chains.

The Anatomy of a Rollback Deployment Workflow

Your new release tested fine on staging, but it’s not playing nicely with applications and services in the wild. Your monitoring application notices something going wrong and raises the alarm. But often raising the alarm isn’t enough – to solve complex issues, you might need to roll back to the last good deployment while you figure out the root cause and get multiple people working together on the solution.

Go 1.18 released on Platform.sh

As of yesterday, the team behind Go has released a new version, 1.18, with some significant changes to the language. Those of you who want to start using these new features are in luck: you can do it right away on Platform.sh. If you’re already using Go on Platform.sh, you can upgrade by changing the number in the type key of your app configuration in your .platform.app.yaml file. If you’re not yet using Go for your project, now’s a great time to give Go a try.

Learn How Tanzu Observability Helps OpenShift Users Manage the Grafana Licensing Change

Grafana Labs recently announced that they are relicensing their core projects from Apache 2.0 to Affero General Public License (AGPL) version 3. This is great news for the open source community, since the new license is still Open Source Initiative–approved and adheres to an additional clause in which network access of any AGPL-licensed software counts as a type of distribution.

A practical approach to Active Directory Domain Services, Part 1: A beginner's guide to Active Directory

Active Directory Domain Services (AD DS) is the traditional, on-premises domain service offered by Microsoft. It is the core component and a server role in Active Directory (AD), the specialized, proprietary directory service in Windows operating system environments. Consider an enterprise or a complex business setup with many connected network resources. To ensure the effective management of these resources, IT administrators use AD and its components, including AD DS.

Pulumi or Terraform for applications? Maybe, both?

Cloud-native is an evolving architecture. Existing vendors will keep on evolving their offerings and different teams inside your organization should be able to use the tool that will support them better in delivering their desired outcome fast. By implementing a standard application layer, you enable teams to adopt what works best for them while the DevOps team can focus on adopting the infrastructure components they believe to be best to support their organization.

Ocean for Apache Spark goes GA on AWS

When Apache Spark introduced native support for Kubernetes, it was a game changer for big data. Speed, scale, and flexibility are now at the fingertips of data teams, if they can master Kubernetes. It’s an uphill climb for even experienced DevOps teams. At Spot by NetApp, we’ve seen first-hand the challenges that companies are facing as they navigate the complexities of operating large-scale Kubernetes applications.

What does Pinal Dave think of SQL Monitor?

Last week we had the pleasure of speaking to SQL Authority’s Pinal Dave to show him some of our favorite SQL Monitor features! Pinal has used – and been a fan of – SQL Monitor since it launched in 2008 (fun fact: it was named SQL Response back then). There are, however, some newer features that Pinal isn’t too familiar with, and we were delighted to introduce those to him.

Discover 2022 DevOps trends with CircleCI data report

If you’re like many of our customers, the phrase software supply chain entered your lexicon this year. You’ve begun to feel the complexities and vulnerabilities of that supply chain. You’ve connected the dots between more reliable software delivery and business success. You’re recognizing the gains developer efficiency can have on profitability.

Deploy a Go app from repo to AWS

We are going to deploy a Go application directly from your repo to AWS with Cloud 66. Any application using any language on any framework can be deployed with Cloud 66 as long as it has a Dockerfile. Note: Rails applications are exceptions as we deploy them natively. If your application does not have a Dockerfile, we will suggest one for you based on your code. However, we would recommend reviewing what we have suggested and making sure the Dockerfile meets your requirements.

What are Linux containers?

Over the last decade, containers have become an essential part of running infrastructure more efficiently. Containers enable productivity, automation, and cost-effective deployments. But there are different types of containers to consider, and this blog explains what Linux containers are, and how they differ from application containers.

FireHydrant is now on Microsoft Teams

Engineering teams can now manage incidents in Microsoft Teams. You’ll have the consistent process and automation of FireHydrant right in the messaging tool you use every day. Effectively run through the entire incident response lifecycle: declare and manage incidents, collaborate with stakeholders, and resolve incidents faster when you integrate FireHydrant with Microsoft Teams.

Honeycomb Terraform Provider Now Officially Supported by Honeycomb

Previously announced as a community-led project, the Terraform provider for Honeycomb is now officially maintained by Honeycomb in partnership with HashiCorp. We recognize how valuable supporting configuration as code is for our customers, and this change in ownership affirms our commitment to ensuring your ability to quickly make the most of Honeycomb’s Management API.

The Dual Approach in Scaling: Chaos Engineering and Performance Engineering

Most enterprises are all too familiar with the struggles and complexities of scaling their environments and applications. Whether these applications live on premises, in a cloud environment, or somewhere between in a hybrid state, an age-old question engineering teams ponder is, “Can my application and environment scale?”

Severity Levels (What They Are & Why They Matter)

Wondering about severity levels? We explain what incident severity levels are, how to classify them, and how they will affect your incident management process. What are severity levels? Incident severity levels are the measure of the impact an incident will have on a system. In general, a lower number severity level, such as SEV-1, denotes a higher impact on the system.
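
The convention described above, where a lower SEV number means a higher impact, can be sketched in a few lines of Python. This is only an illustration: the `Incident` class, the `by_impact` helper, and the sample incidents are hypothetical and not taken from the original article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    title: str
    severity: int  # 1 means SEV-1, the highest impact under this convention


def by_impact(incidents):
    """Order incidents from highest impact (lowest SEV number) to lowest."""
    return sorted(incidents, key=lambda incident: incident.severity)


# Hypothetical sample data, for illustration only.
incidents = [
    Incident("Slow dashboard load", 3),
    Incident("Checkout outage", 1),
    Incident("Delayed email notifications", 2),
]

for incident in by_impact(incidents):
    print(f"SEV-{incident.severity}: {incident.title}")
```

Sorting by the raw number works here precisely because the scale is inverted: the smallest number is the most urgent incident.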

Contributor's Box (Level 1) - Unboxing the Codefresh Open Source Maintainer's

As we work diligently on transforming Codefresh into an Open Source company, we created THE MAINTAINER'S CLUB. The Maintainer's Club is a set of incentives and onramps to becoming more active in the open source community, specifically the Argo Project. There are three levels: 1) Contributor, 2) Member, and 3) Maintainer. In this video, Dan Garfield, Co-Founder and Chief Open Source Officer, unboxes the Level 1, or Contributor, Box. Check it out!

To NuGet and Beyond: NuGet Ecosystem & Upstream Support at Cloudsmith

Calling all .NET / C# / PowerShell devs! We heard you! While Cloudsmith has supported NuGet packages for a while now, we’ve now got more robust support for the NuGet ecosystem. Whether it’s a V2/V3 NuGet package you created in Visual Studio, a Chocolatey package, a PowerShell Module, or a dependent package from NuGet.org, they can all be hosted in the SAME Cloudsmith repository! This one-hour webinar event discusses and demos the latest NuGet ecosystem and upstream support now available at Cloudsmith.

Managed and Unmanaged Clusters in VMware Tanzu Community Edition: What You Need to Know

With VMware Tanzu Community Edition you can create managed and unmanaged Kubernetes clusters. What’s the difference? Why might you be better served by one or the other? What are typical use cases for each? In this engaging chalk talk–style video, Steve Pousty and Whitney Lee answer these questions and more.

A guide to Microsoft Azure Regions

The global footprint of Microsoft Azure is made up of physical infrastructure of over 200 data centres and connective network components, arranged into regions, and linked by a large interconnected network. Each of the Azure data centres provides high availability, low latency, scalable cloud services close to users to improve reliability and speed. In this blog, we look at the Azure Regions, and explain the benefits of using a direct connection to access Azure infrastructure.

What is a YAML? - A Box of DevOps?

I recently returned from a birthday trip to Napa Valley and got to spend some time with the Shipa Team in Palo Alto during the trip. Grabbing a coffee on my trek back to San Francisco, I overheard someone talking about YAML at the coffee shop and I had to hold back my laugh. You usually do not hear folks talking about YAML out in the public but this is San Francisco. For many engineers, YAML is a way of life.

How To Conduct A Cloud Cost Analysis: A Step-By-Step Framework

Business leaders and team members face countless decisions every day, some of which are certain to have an impact on the future of a company. Perhaps the most impactful to a SaaS company’s bottom line are financial decisions related to the cloud. Engineers and team leads need to know which cloud architecture choices are worthwhile and which should be scrapped in favor of a more cost effective model.

Introducing StackState 4.6: Harnessing the Power of Topology + Telemetry + Traces + Time

Companies depend on observability insights to provide reliable online services to their customers. To support their efforts, StackState is proud to announce a new version of our unique topology-powered observability software, StackState v4.6, available now. This new version brings powerful new capabilities to DevOps and SRE teams who need to maintain a deep understanding of how their stack is behaving to meet their SLOs.

Securing The Software Supply Chain Linux Foundation Webinar

From the history of supply chain security threats to security development and deployment we've covered everything you’ve always wanted to know about the software supply chain but were afraid to ask. Dan Lorenc, Founder/CEO, Chainguard, Paddy Carey, Senior Staff Engineer, Cloudsmith, Adil Leghari, Solutions Architect Manager, Cloudsmith and Dan McKinney, Developer Relations, Cloudsmith, gathered for a fireside chat to cover your most burning questions.

Gaps in Kubernetes Adoption Data

The Cloud Native Computing Foundation (CNCF) recently released its annual survey on the state of Kubernetes and containers. The report highlighted the tremendous and continued growth in Kubernetes adoption, as well as some challenges that still persist. Both of these takeaways mirrored the corresponding data points from our 2021 Kubernetes in the Enterprise: Annual Report. However, as we dug into the data, we found gaps, or contradictions, between the two reports.

Introducing Epinio 0.6: Smaller, Faster *and* More Capable!

With our latest releases of Epinio, we’ve focused on making both the setup and developer experience much more streamlined. We’ve looked at where users are having issues and removed many of the roadblocks. This reduced footprint also allows for more customizability and easier long-term maintenance. If you are not familiar with Epinio, it is an application development engine for Kubernetes that lets you go from code to URL in a single step.

FireHydrant is now free for small teams

We envision a world where all software is reliable, and today we’re making that vision more of a reality for small teams. Available today, our new Free Tier helps smaller teams wrangle their reliability challenges with our enterprise-grade Incident Management, Service Catalog, and communications products. Our new package also has every feature that makes FireHydrant great with generous limitations.

What is CICD Pipeline? Explanation of CICD Pipeline along with Examples.

Continuous Integration (CI) is a software development practice in which developers frequently merge their code changes into a central repository. The main goals of continuous integration are to find and resolve bugs more quickly, improve software quality, and reduce the time taken to validate and release new software updates. Continuous Delivery (CD) builds on top of Continuous Integration and includes the practice of automating the entire software release process and builds.

Reel in your activities: announcing cancellable activities and crons

Feature announcement: You can now cancel Platform.sh activities through the CLI and management console. Last year we released activity scripts, custom scripts that you can upload to your projects to run in response to any project or environment activity. In March we announced parallel activities, a queue that allows two simultaneous processes across your environments. Today we’re announcing another change to your activities—you can now cancel them.

Deployment Frequency Explained

While metrics have always been fundamental to improvement in the business world, the growing prominence of DevOps in recent years has elevated their importance in the context of software development. To build a continuous improvement culture, you need a set of metrics that allows you to establish a baseline and shows where the improvement opportunities lie. Arguably the most popular set is the DORA metrics. In this post, we will focus on Deployment Frequency, one of the four DORA metrics.

Rolling out Roles

We’ve been pretty lucky at incident.io to be able to avoid dealing with more complex authentication issues for quite a while, because we piggy-back on Slack to know who you are and which organisation you work in. Whole companies have been built around doing authentication and user profiles really well, so it was pretty neat to be able to avoid doing most of that work for so long!

When to hire an Incident Commander

What comes to mind when you hear the term 'incident commander'? You are not alone if you think about fancy, tri-cornered hats, well-polished shoes, and a uniform weighed down by medals. The roles of incident commander, incident manager, or technical escalation manager have been typical in large organizations but are gaining popularity in smaller companies. For the purposes of this article, we will use the term 'incident commander,' but any of the above titles could work.

How to Implement Global View and High Availability for Prometheus

Ensuring that systems run reliably is a critical function of a site reliability engineer. A big part of that is collecting metrics, creating alerts, and graphing data. It’s of the utmost importance to gather system metrics from several locations and services, and to correlate them to understand system functionality as well as to support troubleshooting.

Shifting Left for DevSecOps Success

Catch this session to see exactly what “shift left” security means and, more importantly, how this strategy affects a developer’s workflow. In this workshop, we walk attendees through the steps of setting up an end-to-end DevSecOps solution to automate build artifact storage, vulnerability detection, testing, and deployment. Lastly, attendees learn how to take advantage of JFrog’s IDE integration and JFrog Xray to increase their confidence in the security of their applications, all within a freely available DevSecOps environment!

Platform Engineering teams are the developer's cloud provider

Organizations rely more than ever on their engineering teams to get in front of their customers. Quickly delivering the latest functionality to end users in a reliable way can make or break a company these days. This need raises the pressure on engineering to deliver a scalable platform, roll out application updates faster, and manage applications efficiently once in production.

What Does AIOps Mean for SREs? It's Complicated.

If you’re an SRE, you might view AIOps with great excitement. By automating complex workflows and troubleshooting processes, AIOps could make your life as an SRE much easier. Alternatively, SREs may choose to view AIOps with disdain. They might think of AIOps as just a fancy buzzword that doesn’t live up to its promises, and that can become a distraction from the SRE tools that really matter. Which perspective is right?

Predict the cost of IP ranges with new enhancements to the Resources tab

One of our most requested and popular features, IP ranges for the Docker executor, recently became available to all customers on a Performance or Scale plan. With IP ranges, you can route job traffic through an IP address that is verifiably associated with CircleCI. This enables your team to meet compliance requirements by limiting the connections that communicate with your infrastructure. With any new feature, you want to know how much it’s going to cost your team.

Running Serverless Applications on Kubernetes with Knative

Kubernetes provides a set of primitives to run resilient, distributed applications. It takes care of scaling and automatic failover for your application and it provides deployment patterns and APIs that allow you to automate resource management and provision new workloads.

Deploying Docker Containers on AWS: Elastic Beanstalk vs ECS vs EKS

Containerization packages a software component and its environment, dependencies, and configuration into an isolated unit called a container. That makes it possible to deploy an application consistently across different computing environments, whether on-premises or on the cloud. The concept of containerization is more than a decade old.

Civo Update - March 2022

In February we had our first online meetup of the year, 'Connecting and securing your microservices by using EnRoute.' Check it out on our YouTube channel if you missed it. Meanwhile, for Civo Shorts, David Flanagan of Pulumi explains why Civo is his service provider of choice for testing environments. Plus guides and tutorials on all things Cloud Native and Civo. Read on.

Serverless Architecture: Pros, Cons, and Examples

Serverless Computing, or simply serverless, is a hot topic in the current software market. More and more companies are shifting their operations from traditional server-oriented architecture to faster, more modular serverless architecture. The “Big Three” cloud vendors (AWS, GCP, and Microsoft Azure) have shown immense interest in offering the best serverless experience possible. But what exactly is serverless? And how does it work if there is no server at all?

Introducing DevOps to the US Government - Part 2

In the first post in this series, I talked about the challenges for the US Government sector when attempting to introduce DevOps. The sector lags behind others such as Financial Services on every measure, yet the technical obstacles like a disruption to workflows and a lack of appropriate skills are the same.

How to aggregate your Metrics using MetricFire

This article covers a popular topic: using aggregation rules for metrics. We will learn why it is important to use aggregations and what tools exist for working with them. We will also explore the benefits of using MetricFire's Hosted Graphite solution to store, process, analyze, and monitor your metrics.

Top 12 Kubernetes Risks

What’s putting your K8s workloads at risk? You probably didn’t immediately think of memory and CPU resources—yet, these pose significant threats to cost and performance in your public cloud Kubernetes and OpenShift deployments. Learn about the top 12 K8s risks and how you can visualize the spread of risk in your containers deployment. You'll also hear a methodology for drilling down to individual misconfigurations and resolving them.

Shifting Left for DevSecOps Success

Not long ago, developers built applications with little awareness about security and compliance. Checking for vulnerabilities, misconfigurations and policy violations wasn’t their job. After creating a fully-functional application, they’d throw it over the proverbial fence, and a security team would evaluate it at some point – or maybe never. Those days are gone – due to three main shifts.

Hot Storage vs. Cold Storage

When it comes to data storage, all data isn’t equal. After all, the data you use daily doesn’t need the same level of protection or ease of access as data kept for long-term backup. A large percentage of a business’ data remains unleveraged due to data management and security challenges, which highlights the need to implement a data storage strategy.

Building Digital Platforms for Adaptive Resilience: Looking Inside Gartner Predicts 2022

With digital service expansion putting pressure on IT, ‘Gartner Predicts 2022: Build Digital Platforms for Adaptive Resilience’ is a helpful guide for I&O leaders with their sights on 2025. If you looked up “tech trends” right now, how many search results would you expect to see? 100 million? 500 million? Think again. Between blog posts, research reports, and news articles, you’d actually find roughly 1.5 billion search results.

Enable FIPS on Google Cloud

Cyber attacks present an imminent threat to our digital assets. They come in a variety of forms, including computer viruses, denial-of-service (DoS) attacks, hacking, ransomware, and memcached attacks. In February 2022, White House deputy national security adviser for cyber and emerging technology Anne Neuberger claimed that Russian hackers had conducted a DDoS attack on Ukrainian banks and the Ministry of Defense before their military attacks.

CircleCI acquires test intelligence platform Ponicode

Today we are pleased to announce that CircleCI has acquired Ponicode, a Paris-based AI engine for analyzing source code, with the goal of helping developers produce better code in their local development environment. Ponicode caught our attention with their dedicated focus on helping developers handle their least favorite tasks (the toil surrounding writing code), such as authoring tests, commenting code, analyzing code quality, and more.

AWS Cost Categories Vs. Tags: How To Get True Cloud Cost Visibility

Although Amazon Web Services (AWS) announced cost categories a few years ago, many people still struggle to understand the difference between them and AWS tags. If this sounds familiar to you, you are not alone. This quick guide will walk you through AWS Cost Categories, how they work, and how they differ from tags in AWS.

11 DevOps Best Practices You Should Know to be More Productive

Software engineering teams are continually seeking methods to improve and speed up the software development process. DevOps, an engineering methodology that brings development and operations together, is one popular strategy. Development and operations teams are frequently isolated in traditional engineering companies, which can lead to conflict between these two vital arms.

CentOS 8 is end-of-life: Now what?

There were many reasons people came to use CentOS as an alternative Linux platform to Red Hat Enterprise Linux (RHEL). CentOS was originally built as a downstream release of RHEL, which was free to use without support. CentOS became the de facto standard for many organizations that did not want to use RHEL for production workload, since it’s basically the same thing, just rebranded.

What's the Best PUE Ratio for Data Centers?

Power Usage Effectiveness (PUE) is a ratio of the total amount of power used by a data center to the power delivered to IT equipment. The PUE metric was developed by The Green Grid to measure the overall energy efficiency of data centers, and it has been one of the most popular data center KPIs since its introduction in 2007.
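
As a rough illustration of the ratio described above, the sketch below computes PUE in Python. The function name and the sample power figures are hypothetical, chosen only to make the arithmetic concrete.

```python
def power_usage_effectiveness(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return PUE: total facility power divided by power delivered to IT equipment."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical figures: the facility draws 1,500 kW in total,
# of which 1,000 kW reaches the IT equipment.
print(power_usage_effectiveness(1500.0, 1000.0))  # 1.5
```

A value closer to 1.0 means that less of the facility's power goes to overhead such as cooling and power distribution.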

A Conversation With Aaron Bertrand

Back in the early days of Microsoft SQL Server, database administrators had few resources to draw upon and learn from. In the ’90s, we had SQL Server Professional, edited by Karen Watterson. We had CompuServe forums for SQL Server and Sybase. We had Microsoft documentation, which meant three printed books with no books online.

Introducing Codefresh Software Delivery Platform

Enterprises need a solution that can keep pace with innovation. The Codefresh Software Delivery Platform brings together Argo Workflows, Events, CD, and Rollouts into a unified enterprise-grade solution that equips developers for continuous delivery with confidence while leveraging GitOps best practices.

DirtyPipe (CVE-2022-0847) - the new DirtyCoW?

A few days ago, security researcher Max Kellermann published a vulnerability named DirtyPipe which was designated as CVE-2022-0847. This vulnerability affects the Linux kernel and if exploited, can allow a local attacker to gain root privileges. The vulnerability gained extensive media follow-up, since it affects all Linux-based systems with a 5.8 or later kernel, without any particular exploitation prerequisites.

Enhanced monitoring for your Azure Logic App

Implementing a business process can be challenging because you typically need to make various services work together. Think about everything your company uses to store and process data. How do you integrate all these products? Azure Logic Apps gives you pre-built components to connect to hundreds of services. You use a graphical design tool to put the pieces together in any combination you need, and Logic Apps will run your process automatically in the cloud.

Zero Trust Network Access (ZTNA) vs VPN: the core evolution

According to Gartner, by 2023, 60% of enterprises will phase out their VPN in favor of Zero Trust Network Access (ZTNA). In this blog, discover the four key advantages of ZTNA vs VPN. VPN (Virtual Private Network) has been the dominant solution securing remote access for users and has been considered a good solution for almost three decades. VPN benefits included keeping data secure, protecting online privacy, and reducing bandwidth throttling.

Podcast: Break Things on Purpose | Alex Solomon & Kolton Andrus: Break it to the Limit

Time for a crossover! Today, Page it to the Limit host Mandi Walls, DevOps Advocate at PagerDuty, joins Julie for a special episode. In this two-part episode, Julie and Mandi interview Kolton Andrus, co-founder of Gremlin, and Alex Solomon, co-founder of PagerDuty. Each of them shares the origins of their respective companies, how they build amazing cultures, and some of the fun anecdotes along the way.

Read: 2021 Gartner Market Guide for AIOps Platforms

We are excited to be named a Representative Vendor in the domain-agnostic AIOps platforms market in the 2021 Gartner Market Guide for AIOps Platforms. We believe that this validates our unique approach to delivering an observability solution that accelerates your success with: Read our summary of this Gartner research report, below. If you haven’t noticed, AIOps is taking off.

15 features that make life simpler for web development agencies

As developers, we know how much work great software requires. We know that you need to focus on multiple things at once to release awesome, custom-made applications for your clients. We know the last thing you want to spend time doing is fiddling with servers and stressing about downtime. That's why we've made it possible to deploy your code directly from your repo to any cloud in just a few clicks.

Data Center Capacity: How to Measure, How to Plan, and How Much is Left?

Data center capacity refers to key data center resources (i.e., power, space, cooling, and power/network port connections) that are available to meet the requirements of current and future IT demand. Accurately planning and managing data center capacity is essential for maintaining uptime and increasing efficiency. Failure to do so can be very expensive and detrimental to the business.

Implementing a Kubernetes Application Platform - BambooHR and Shipa

In this webinar, we talk with platform engineering leaders at BambooHR, a SaaS leader in the Human Resources space, about furthering their journey into Kubernetes. We are joined by Kelsey Hightower to help moderate and provide commentary on what he has seen in the space. As BambooHR kicks off their journey with Shipa, learn from the perspective of the team moving the needle in engineering efficiency and developer experience.

Elastic Observability 8.1: Visibility into AWS Lambda, CI/CD pipelines, and more

Technologies such as serverless computing frameworks and CI/CD automation tools help accelerate software development lifecycles (SDLC) to give development teams a competitive edge in the marketplace. Armed with these technologies, teams can deploy and innovate faster and more frequently by automating repetitive tasks and eliminating the need to manage or provision servers.

AppScope 1.0: Changing the Game for SREs and Devs

SREs and Devs are used to solving problems even when an awkward or inefficient way is the only way. In AppScope 1.0, SREs and Devs have a new alternative to standard methods that the AppScope team thinks will make that problem-solving a lot more fun. We in the AppScope team constantly hear firsthand about life in the SRE trenches. For this blog, we “interview” a fictional SRE/Dev whose thoughts and comments are a mash-up of things we’ve heard from real people we know.

Virtualization Management: What It Is, What It Does, and How It Can Streamline Your Dashboard

Let’s say that you’re a real estate investor with lots of buildings in your portfolio. But you choose not to employ a caretaker when you fully know that you aren’t always available to monitor every building. What do you think will become of some of them?

Visma Tech Talk with Kosli's Mike Long - DevOps: The Beginning of Infinity

(Kosli - formerly known as Merkely) In this video Mike talks to Tinuis Alexander Lystad from Visma about his latest talk, DevOps: The Beginning of Infinity, inspired by David Deutsch. Mike has explored the understandings of infinite knowledge creation and what that means for the future of DevOps.
Sponsored Post

The Best Kubernetes Monitoring Tools

In this article, you'll learn about the best Kubernetes performance monitoring tools that are currently on the market. Although there are a number of application performance monitoring solutions out there, this article covers the best options in terms of their key features, functionalities, ease of setup, and the support garnered from each of their respective communities.

A guide to AWS Regions & Zones

Amazon Web Services (AWS) continues to extend its availability globally, with the AWS Cloud spanning 84 Availability Zones within 26 geographic regions around the world. In this blog, we look at how Network-as-a-Service (NaaS) platforms such as Console Connect can enhance the performance of your AWS assets and applications through direct and on-demand connections.

Effectively Bridging the DevOps - R&D Gap without Sacrificing Reliability

DevOps culture revolutionized our industry. Continuous Delivery and Continuous Integration made six sigma reliability commonplace. 20 years ago we would kick the production servers and listen to the hard drive spin; that was observability. Today’s DevOps teams deploy monitoring tools that provide development teams with deep insight into the production environment. Before DevOps practices were commonplace, production used to fail. A lot.

IP Wave: Understanding the Network is the Key to Guaranteed Service Delivery - Part 3 in the IP Wave Series

There’s little doubt that a lot has changed when it comes to network design, specification and build. The seemingly simple goal of carrying traffic from point A to point B might remain, but these days it comes with a long list of options and alternatives to cover a wide range of network architectures, content providers and service types, all of which might have different requirements for bandwidth, latency, and availability.

Enabling Service Providers to Rapidly Create and Deliver Innovative New Services through Automation, Optimization and Openness: Part 2 in the IP Wave Series

I am excited about the upcoming Optical Fiber & Communications (OFC 2022) Conference in San Diego for several reasons. Not only will it be the first in-person conference that I have attended in more than two years due to the pandemic, but the thing that I am most excited about is the fact that Ribbon will be showcasing our new IP Wave offering at the show.

Engineering's Role In Setting A Winning SaaS Pricing Strategy

Companies that effectively implement SaaS pricing and packaging have the distinct advantage of being able to provide their customers with a high level of value at a reasonable price. And, they’re able to do it in a way that is profitable and worthwhile for the business. The problem is that packaging and pricing SaaS products is anything but simple. There are a lot of variables involved in deciding on a SaaS product’s pricing strategy.

Use Your Load Balancer to Monitor Application Health

HAProxy and HAProxy Enterprise collect a vast amount of information about the health of your applications being load balanced. That data, which uses the Prometheus text-based format for metrics, is published to a web page hosted by the load balancer, and since many application performance monitoring (APM) tools can integrate with Prometheus, it’s likely that you can visualize the data using the APM software you already have.
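
As a rough illustration of what consuming the Prometheus text-based format can look like, here is a minimal Python sketch that parses a few sample lines and lists the backends reported as down. The metric names, labels, and values in `SAMPLE` are hypothetical and not taken from a real HAProxy instance, and the parser is deliberately simplified (it does not handle escaped quotes inside label values or optional timestamps).

```python
import re

# Hypothetical sample in the Prometheus text exposition format. The metric
# names and values are illustrative only, not real HAProxy output.
SAMPLE = """\
# HELP example_backend_up Whether the backend is up (1) or down (0).
# TYPE example_backend_up gauge
example_backend_up{proxy="web"} 1
example_backend_up{proxy="api"} 0
example_requests_total{proxy="web",code="2xx"} 1027
"""

# metric_name, optional {labels}, whitespace, then the sample value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comment lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and blank lines carry no samples
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def down_backends(samples):
    """List the `proxy` label of every hypothetical backend reporting 0."""
    return [
        labels["proxy"]
        for name, labels, value in samples
        if name == "example_backend_up" and value == 0
    ]
```

In practice an APM tool or a Prometheus server would scrape and parse the metrics page for you; the sketch only shows that the format is plain text with one sample per line.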

Linux and embedded system: What you should know

Open-source software and embedded Linux? Ever-proliferating cybersecurity concerns? Get up-to-speed with the current status in the embedded landscape with this short video. And if you are longing for more and want to know why Linux is the OS of choice for embedded systems, check the ultimate guide to Linux for embedded applications. In this exclusive webinar, you will learn more about the embedded landscape, the IoT and how Ubuntu Core is raising the bar for embedded Linux.

Getting started with Latency attacks

As the world becomes more dependent on cloud-native systems, the tolerance for slow services is decreasing. Users expect instantaneous access to services, whether it's for work, entertainment, or even cloud infrastructure. Even small amounts of latency can significantly decrease user satisfaction: nearly half of all users expect web pages to load in under two seconds, and as many as 28% of users will permanently abandon a slow site.

Netdata Meetup: Real World Scenario on How to Install and Monitor from Scratch

In our first Netdata Meetup, Thiago Marques will show you how to install Netdata from scratch on a specific host and how to navigate the many in-depth Netdata dashboards. Thiago will also cover understanding metric distribution. Monitoring is not only about visualizing collected data, which is why we will show where host notifications are and how to access A.I. to further simplify the correlation between issues and hardware/software.

ServiceNow + Squadcast Integration: Automate IT Ticketing and Project Tracking

ServiceNow is a workflow automation platform used by organizations for their IT ticketing and project management needs. In contrast, Squadcast is an end-to-end incident management and SRE platform that is used by organizations for their reliability requirements.

DevOps 101: What, who, why, and how?

Organizations around the world are turning to DevOps as a way of working together to improve the efficiency and quality of software delivery and increase value to the business. But what exactly is DevOps and what does it mean for you and your organization? Welcome to DevOps 101. Our new monthly webinar series aims to explain exactly this. Each live session will be hosted by me, Grant Fritchey, starting with the fundamentals to expand your knowledge and understanding of DevOps.

Build and Deploy an Application with VMware Tanzu Community Edition

Follow along as Cora Iberkleid shows how to use open source tooling provided by VMware Tanzu Community Edition to move through the core steps of cloud native service delivery. You’ll see her leverage kpack to turn source code into container images; the Harbor registry to sign images, scan them for vulnerabilities, and provide secure registry services; and Knative Serving to simplify the deployment and running of services in a Kubernetes environment.

Step-by-Step Setup of VMware Tanzu Community Edition on AWS

In this video, Steve Pousty walks through the essential steps required to install VMware Tanzu Community Edition on AWS. Follow along as he provides useful tips and detailed explanations while he prepares his environment, then sets up both management and workload clusters on AWS from a Windows computer using WSL2. His clear presentation will help you get your own environment up and running quickly.

What SREs Can Learn from Capt. Sully: When to Follow Playbooks

When are you smarter than your playbooks, and when are your playbooks smarter than you? That’s a question that engineers rarely step back to consider. The rational, disciplined parts of our minds tell us that the playbooks we are supposed to follow were carefully designed and tested, and that we should stick to them at all costs.

Stop wasting your time! A modern development workflow for WordPress, using Platform.sh plus third-party tools

To quote my colleague, Chad, WordPress has “remained tremendously popular since its release in 2003”. For many, WordPress remains by far the CMS that is easiest to adopt, and that provides a fast time to market in the majority of use cases. There is so much high-quality material out there for WordPress, be it OSS or Premium, that one can have beautiful sites powered by an easy-to-use CMS up and running in no time.

You don't need to migrate from Jenkins. Start building beside it.

Ten years ago, tools like Jenkins were first-class automation platforms for your CI pipelines. The jump from lower-level tools and custom scripts to tools like Jenkins created dramatic improvements. Now, a new generation of web-based tools is available. They provide a platform for the next leap forward for product build automation. This long history means that many mature organizations use Jenkins for CI.

Kubernetes vs Nomad: What to Choose in 2022?

Kubernetes has become an enormously popular choice for containerized applications since its 2014 launch. Many software developers rely on the tool, which is now in v1.23.1. However, there are other choices on the market for container orchestration. One such tool is Nomad, originally launched in 2015. Generally pitched as an alternative to Kubernetes, Nomad, which most recently released v1.2, promotes itself as a simple, flexible option for software teams.

Most-Loved Open Source Tools: Free solutions recommended by IT Pros

The best free and open-source software are tools that users simply cannot live without — they make everyday tasks on Windows, Mac, and Linux easy without any of the associated costs or licensing fees that come with pay-to-play solutions. For some quick background, open-source software took off during the earlier days of IT in the late 1990s and has changed the world ever since.

Incident Response Lifecycle | A Complete Explanation

Wondering about the incident response lifecycle? We explain what it is, and how each phase helps lead to effective incident resolution. What is the incident response lifecycle? The incident response lifecycle is an organization’s framework for responding to an incident that disrupts service. The incident response lifecycle contains the following phases.

Monthly Moo March 2022

What a start to 2022 it has been for us all. We are incredibly proud of the continuous innovation, velocity and delivery of new features and functionality. We’ve heard success story after success story from our brilliant customers, each unique in their own way, and we continue to collaborate with them on our roadmap. So, this March update is for you and a massive thank you. We couldn’t do it without you, and it’s been our honor to be part of your success.

Troubleshooting CircleCI webhooks

CircleCI webhooks open up a variety of exciting use cases, from data logging and integrations with third-party monitoring and observability solutions to setting up your own custom dashboards to monitor pipeline health. To ensure that you can properly monitor events, resolve authentication errors, and also access the information contained within them, you need a reliable process to debug any errors you might encounter.

OpenStack is dead? The numbers speak for themselves.

Austin, Texas – 3rd of March 2022 – OpenStack is dead! A masked man in a black cloak with “public clouds”, “containers” and “serverless” inscriptions shot OpenStack straight in the heart. OpenStack fell to the ground and with the last moment of strength exclaimed: “Long live open infrastructure”! That could be a headline of a tabloid, would you agree? OpenStack is dead. We’ve all heard about that. It’s gone. It’s abandoned.

What a delayed SD-WAN or SASE decision could cost you

Winners make good decisions fast, execute them quickly, and see higher growth rates and/or overall returns from their decisions. That’s according to a McKinsey study, ‘Decision-Making In The Age Of Urgency’. But the same study also pointed out that ineffective decision-making has significant implications for company productivity today. On average, survey respondents said they spent 37 percent of their time making decisions.

4 compelling reasons why you need a network discovery tool and 5 ways OpManager helps

Businesses now scale exponentially and so do their networks. Managing a hybrid IT environment that comprises wired, wireless, and virtual networks can be a challenging task for network administrators. However, continuous monitoring of these devices for fault and performance is crucial. Network discovery is key to successful monitoring solutions.

Sponsored Post

ITOps vs. SecOps vs. DevOps vs. DevSecOps

ITOps, SecOps, and DevOps may sound similar. Indeed, they are similar - to a degree. But they have different areas of focus, different histories, and different operational paradigms. Keep reading for an overview of what ITOps, SecOps, and DevOps mean and how they compare. We'll also explain where DevSecOps fits into the conversation - and why you should worry less about defining these terms perfectly than about finding ways to operationalize collaboration between your various teams.

Sponsored Post

Golden Signals - Monitoring from first principles

Building a successful monitoring process for your application is essential for high availability. In the first of this three-part blog series, Safeer discusses the four key SRE Golden Signals for metrics-driven measurement, and the role they play in the overall context of monitoring. Monitoring is the cornerstone of operating any software system or application effectively. The more visibility you have into the software and hardware systems, the better you are at serving your customers. It tells you whether you are on the right track and, if not, by how much you are missing the mark.

Amplify Artifactory and Distribution Changes Through PagerDuty

When automated software delivery runs smoothly, it can whisper, and quietly attend to itself. But when your delivery and distribution pipeline runs into a problem, it must shout. Boosting the volume of Artifactory and Distribution change events and issues through PagerDuty can help ensure they’re heard by everyone whose job it is to monitor your software delivery pipeline.

Kubernetes Health Check Using Probes

Kubernetes is an open source container orchestration platform that significantly simplifies an application's creation and management. Distributed systems like Kubernetes can be hard to manage, as they involve many moving parts and all of them must work for the system to function. Even if a small part breaks, it needs to be detected, routed and fixed. These actions also need to be automated. Kubernetes allows us to do that with the help of readiness and liveness probes.
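
To make the idea of readiness and liveness probes more concrete, here is a minimal Python sketch of the application side: two HTTP endpoints that a probe could call. The paths `/healthz` and `/ready`, the port, and the `STATE` dictionary are assumptions chosen for illustration; Kubernetes does not mandate them. For HTTP probes, the kubelet treats any status code from 200 up to (but not including) 400 as success.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state. A real service would update this as it
# finishes start-up work or loses a dependency such as a database.
STATE = {"started": True, "dependencies_ok": False}


def probe_response(path, state):
    """Map a probe path to an (HTTP status, body) pair."""
    if path == "/healthz":
        # Liveness: is the process itself still working?
        return (200, "alive") if state["started"] else (503, "not started")
    if path == "/ready":
        # Readiness: can the process accept traffic right now?
        ok = state["started"] and state["dependencies_ok"]
        return (200, "ready") if ok else (503, "not ready")
    return (404, "unknown path")


class ProbeHandler(BaseHTTPRequestHandler):
    """Serves the two probe endpoints over plain HTTP."""

    def do_GET(self):
        status, body = probe_response(self.path, STATE)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the probe server until interrupted (the port is an assumption)."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Keeping the two checks separate matters because the consequences differ: a failing liveness probe leads Kubernetes to restart the container, while a failing readiness probe only removes the pod from Service endpoints until it recovers.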

Customer Panel: CDI and PROACT Discuss Observability Platforms for MSPs

In this customer panel video Chris Black, CTO Managed Services at CDI, and Per Sedihn, CTO & VP Portfolio & Technology at ProAct, talk about their partnerships with LogicMonitor. Topics include the importance of predictability, automation, and intelligence in observability platforms for Managed Service Providers, specific ways LogicMonitor helped both companies scale quickly, and the future of support for hybrid infrastructures with unified observability.

Automating database cleanup with scheduled pipelines

RESTful API projects often require that developers grant temporary access to a particular resource. Sometimes this happens within a specific interval, such as a few days or months. Revoking permissions when they expire could mean including extra logic during the authentication process or writing a middleware function to attach to the secured endpoint. Or, this logic could be abstracted to a separate part and configured to check and manage permissions at a regular interval.
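
The "separate part" described above could look something like the following Python sketch: a small cleanup function that a scheduled job calls at a regular interval to delete expired grants. The table name `access_grants`, its columns, and the use of SQLite are hypothetical choices for illustration, not details from the original post. The sketch also assumes every expiry is stored as a UTC ISO-8601 string, so that string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def create_schema(conn):
    """Create the hypothetical table holding temporary access grants."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS access_grants ("
        "id INTEGER PRIMARY KEY, username TEXT NOT NULL, "
        "resource TEXT NOT NULL, expires_at TEXT NOT NULL)"
    )


def revoke_expired(conn, now):
    """Delete grants that expired at or before `now`; return the number removed."""
    cursor = conn.execute(
        "DELETE FROM access_grants WHERE expires_at <= ?", (now.isoformat(),)
    )
    conn.commit()
    return cursor.rowcount


def demo():
    """Insert one expired and one active grant, then run the cleanup once."""
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    now = datetime(2022, 3, 1, tzinfo=timezone.utc)
    rows = [
        ("alice", "report-1", (now - timedelta(days=1)).isoformat()),  # expired
        ("bob", "report-2", (now + timedelta(days=7)).isoformat()),    # active
    ]
    conn.executemany(
        "INSERT INTO access_grants (username, resource, expires_at) VALUES (?, ?, ?)",
        rows,
    )
    removed = revoke_expired(conn, now)
    remaining = [row[0] for row in conn.execute("SELECT username FROM access_grants")]
    return removed, remaining
```

With this shape, the authentication path only has to check whether a grant row exists, and the scheduled job owns the expiry logic.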

How to Deploy Mattermost on AWS Via Opta

A common denominator that Mattermost and most corporate applications share is the challenge users can face in successfully setting up a self-hosted instance in their own cloud account. Even with cloud-specific documentation, there’s almost always a hard requirement of understanding said documentation, resolving any errors encountered along the way, and maintaining the application.

Mastering Digital Operations Across the Enterprise

I’m excited to announce that today, PagerDuty is taking our automation capabilities to new scale and scope as we enter into a definitive agreement to acquire Catalytic. With their technology and talented team we accelerate the delivery of enterprise-wide process automation that manages no-code workflows across the business, broadly applicable to any workflow, for any employee.

7 CloudCheckr Alternatives And Competitors

With CloudCheckr, users can manage the cost, security, and compliance of their public cloud. It also offers a VMware-to-AWS migration service. But the platform's multi-level pricing structure can confuse some customers. Others feel CloudCheckr is overkill for their current cloud adoption stage. This quick guide walks you through cloud management tools like CloudCheckr.

Postmortems Now Called Retrospectives in Blameless

Something big happened at Blameless this month — our “Postmortem” feature was updated to its new name, “Retrospective”. To the naysayer, I suppose you’re thinking, This seems trivial. Different teams call it different names anyway, so why bother making the change? First let me say, thank you for reading our blog and I hope you finish this one through to the end. Now, allow me to explain our reasoning and why we’re excited about this update.

Shipa Cloud Operations and Practices

Shipa Cloud is how we run the Shipa control plane on behalf of users in order to give them the fastest path possible to implementing Application as Code within their clusters. You can try out Shipa Cloud for free today by going to shipa.io. Besides being the fastest way possible to get started with Shipa, it also takes away the responsibility of upgrades, maintenance, and uptime of the control plane for our users, but that responsibility doesn’t just disappear.

Advantech and Canonical Collaborate on Ubuntu Pre-Loaded Embedded Solutions for Edge Computing Applications

March 2nd, 2022 — Advantech, a leading provider of embedded IoT solutions, has collaborated with Canonical in the provision of Ubuntu pre-loaded and certified embedded boards and systems for diverse edge computing applications. By certifying Advantech products, Canonical, the publisher of Ubuntu, guarantees up to 10 years of Linux security and update capabilities for users in the AI robotics, industrial manufacturing, and mission-critical application sectors.

Infographic: Achieving True Observability With the 4Ts

Ready to “rewind the movie” to see exactly what was going on in your stack at any moment in time? Ready to quickly go straight to the original source of the problem to solve issues faster? Check out our new infographic, Achieving True Observability With the 4Ts, to see how StackState’s unique 4T® data model correlates topology, telemetry and traces at every moment in time, to deliver real-time contextual insights into your entire IT landscape.

Integrating Azure Key Vault With AKS Cluster

I recently had the pleasure of presenting a webinar with Microsoft Reactor. It was on implementing Azure Key Vault (a centralized place to manage all of your highly sensitive information on Azure). In this webinar, I share a step-by-step demonstration of how to integrate your information with the AKS cluster. The goal is to implement a solution that will allow an integration between Azure Key Vault, where I will store all my secrets; and my AKS, where I will use them.

Sponsored Post

Simplifying ESX monitoring with OpManager

IT admins have to adapt to new market trends and networking concepts to meet ever-evolving IT demands. However, solely relying on physical components to support this changing landscape puts them at a disadvantage when it comes to scalability, network distribution, and cost-effectiveness. To remove the strain on physical components such as servers and to keep the capital expense in the optimal range, IT admins rely on virtualization. Most networks have started adopting virtualization even for their most resource-intensive applications.

One Minute to Deployed on Kubernetes with Shipa

In this Shipa Shorts video, we deploy to Kubernetes in under a minute. All we had to produce was an image, and Shipa takes care of the rest. No need to wire Networking Policies, Service Meshes, etc. With Shipa, you can deploy to Kubernetes without having to understand Kubernetes internals. Outside the UI, no matter your flavor of CI/CD, Shipa supports it.

Cross-functional collaboration: Why marketing is perfectly placed to set the example in a distributed, hybrid world

Running successful cross-functional projects, when teams are no longer co-located in the same physical space, requires a different approach. Read on to find out how we used this as an opportunity to improve how the Marketing Division at Redgate collaborates with other parts of the business and how you can apply some of our learnings in your own organization.

Adding Super Fast Frontend Search in Rails with Lunr

This is the first part of a multi-part post focusing (mostly) on front end search and Command Palettes. If you are not familiar with Command Palettes, they are a power-user's dream: a universal overlay on your webpage that's triggered with a key shortcut (usually Command + K) and allows your users not only to search the content but also perform actions on your website. The goal here is to "keep the user's hands on the keyboard" (and away from the mouse), when using your application.

Cycle Podcast | Episode 11 "Rising Cloud + Cycle.io" Featuring Sean Brown, Chief Product Officer

In this episode, Jake Warner chats with Sean Brown of Rising Cloud. Discussions include some background and an overview on Rising Cloud, Stateful vs. Stateless, and how both companies are coming together to solve developer-centric needs.

Alert Fatigue in SRE: What It Is & How To Avoid It

Wondering about alert fatigue? We describe what it is, how it affects software development teams, and how to avoid it. What is alert fatigue? Alert fatigue is the phenomenon of employees becoming desensitized to alert messages because of the overwhelming volume of alerts and the number of false positives they receive. The risk with alert fatigue is that important information will be overlooked or ignored.

JFrog Discloses 5 Memory Corruption Vulnerabilities in PJSIP - A Popular Multimedia Library

JFrog’s Security Research team is constantly looking for new and previously unknown security vulnerabilities in popular open-source projects to help improve their security posture. As part of this effort, we recently discovered 5 security vulnerabilities in PJSIP, a widely used open-source multimedia communication library developed by Teluu. By triggering these newly discovered vulnerabilities, an attacker can cause arbitrary code execution in the application that uses the PJSIP library.

4 Tips To Improve Your Company's Cloud Visibility

The struggle for better cloud visibility is common amongst software-as-a-service (SaaS) companies. SaaS applications are more complex than on-premise solutions by nature — there is a lot more surface area to consider. Distributed and multi-tenant systems are inherently more complex. Distributed systems, as often found in cloud architectures, are generally scaled out horizontally much more so than traditional single instance applications.

How Your Cloud Application Monitoring Data Can Help You Make Clearer Business Decisions

Companies are increasingly adopting cloud services. They’re cost-efficient, scalable, and shorten your time to market by getting your ideas out there as quickly as possible. However, cloud services don’t have extensive, in-built cloud application monitoring. This is where Netreo steps in. Let’s take a look at what cloud computing is. After that, we’ll dive into cloud application monitoring and how it affects business decisions.

5 questions about Ansible that Elastic Observability can answer

While automating systems is seen as an imperative in boardrooms around the globe, automation teams — the teams on the ground — often lack the data to help them to industrialize their automation efforts and move from ad-hoc automation to strategic automation. In this automation-focused blog post, we will show how to instrument infrastructure automation with Elastic Observability.

Cloud Governance: What It Is and Why You Need It

Every company exerts some level of effort to manage costs, performance, and risk in their hybrid cloud environment. But to ensure that those activities are performed consistently and efficiently across the board, you need a framework of policies, processes, controls, and tracking. In other words, you need cloud governance.

Enabling simple, cost-effective Kubernetes on IBM Z with MicroK8s

Containerisation has transformed the enterprise IT landscape, driving faster, more secure, and more predictable software delivery than ever before. Thanks to technologies like Docker, building containerised applications is easy, and many businesses are working with hundreds or even thousands of containers. To effectively deploy and manage all of these microservices, a container orchestration tool is essential, and Kubernetes is the leading solution.

Pacific Textiles drives digital transformation with new infrastructure for legacy and cloud-native applications

March 1st, 2022 — Canonical announces that Pacific Textiles chose Canonical Charmed OpenStack for its infrastructure upgrade. The new private cloud environment makes it possible to keep legacy workloads running continuously while launching new cloud-native services. The professional services team from Canonical helped Pacific Textiles during this migration, from making architectural choices to launching the live cloud.