Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Cloud monitoring, security and related technologies.

Monitoring vs Observability: Can You Tell The Difference?

Monitoring vs observability – is there even a difference and is your monitoring system observable? Observability has gained a lot of popularity in recent years. Modern DevOps paradigms encourage building robust applications by incorporating automation, Infrastructure as Code, and agile development. To assess the health and “robustness” of IT systems, engineering teams typically use logs, metrics, and traces, which are used by various developer tools to facilitate observability.

Centralized Log Management and Cloud Environments

Even before new hybrid workforce models, many companies already moved a lot of services to the cloud. COVID-19 digital transformation strategies instantly increased the number of access points and endpoints. This led to a rapid increase in event log data followed by all kinds of other issues -- performance, availability, security, and ultimately increased IT costs amongst other things. A centralized log management solution for your cloud environment can help you manage the above and more.

Introducing Cloud Cost Intelligence for Snowflake

Here at CloudZero, we work with some of the top software-driven companies out there. Like us, they’re building their products on Amazon Web Services (AWS), along with whatever best-of-breed providers meet their needs. It’s no secret that in recent years, Snowflake has seen — well, some serious success. For many companies, including CloudZero, they're the data warehouse provider of choice — and an essential component of delivering their products.

Getting to Know Google Cloud Audit Logs

So you've set up a Google Cloud Logging sink along with a Dataflow pipeline and are happily ingesting these events into your Splunk infrastructure — great! But now what? How do you start to get meaningful insights from this data? In this blog post, I'll share eight useful signals hiding within Google Cloud audit logs that will help you uncover meaningful insights. You'll learn how to detect: Finally, we’ll wrap up with a simple dashboard that captures all these queries in one place.

How to monitor your AWS servers via MetricFire

In this article we explore the basics of monitoring Amazon Web Services (AWS) by feeding metrics to Grafana through Hosted Graphite’s agent and also through Hosted Graphite’s AWS add-on. This will allow us to monitor metrics from applications and servers hosted in AWS with clarity and depth. This article assumes you have created a Hosted Graphite account.

How to set AWS S3 Bucket Read Permissions with Relay

Cloud environments are susceptible to security issues. A big contributor is misconfigured resources. Misconfigured S3 buckets are one example of a security risk that could expose your organization’s sensitive data to bad actors. Policies and regular enforcement of best practices are key to reducing this security risk. However, manually checking and enforcing security is time-consuming and can fall behind with all the demands a busy DevOps team faces every day.

Distribute Software Releases Globally with JFrog on AWS

Release management is a topic that leaders in DevOps teams should be concerned with as organizations move toward implementing systems of automated continuous deployment. The practice will make your organization more efficient, but how do you implement it? Modernizing your infrastructure for the cloud is essential to distributing trusted releases globally. Many enterprises choose AWS as their platform, and seek to use AWS edge services to distribute their production applications to nearby end-points.

AWS Step Functions Input and Output Manipulation Handbook

In this handbook, we’ll explain the AWS Step Functions Input and Output manipulation. There’s plenty to talk about AWS Step Functions. There are numerous articles available online talking about AWS Step Functions ever since Step Functions were introduced in 2016. Most of these articles might make you think that Step Functions are actually an extension of the Lambda function, allowing you to combine several Lambda functions to call each other.

Take the first step toward SRE with Cloud Operations Sandbox

At Google Cloud, we strive to bring Site Reliability Engineering (SRE) culture to our customers not only through training on organizational best practices, but also with the tools you need to run successful cloud services. Part and parcel of that is comprehensive observability tooling—logging, monitoring, tracing, profiling and debugging—which can help you troubleshoot production issues faster, increase release velocity and improve service reliability.