
Cloud

The latest News and Information on Cloud monitoring, security and related technologies.

Monitor and troubleshoot your VMs in context for faster resolution

Troubleshooting production issues with virtual machines (VMs) can be complex and often requires correlating multiple data points and signals across infrastructure and application metrics, as well as raw logs. When your end users are experiencing latency, downtime, or errors, switching between different tools and UIs to perform a root cause analysis can slow your developers down.

Orchestration in Telcos: the multi-vendor and multi-cloud environments...

As NFV migration becomes commonplace, the need for better software management, smoother upgrades, and a simpler deployment process is becoming apparent. The complexity of the migration has deterred telcos from adoption. A solution is needed to help businesses manage and deploy network automation, orchestration, and managed services. In general, a telco network is complex and needs to be managed from multiple perspectives.

Distributed tracing with OpenTelemetry and Cloud Trace

As more services are involved in serving user traffic and completing transactions, how does each service contribute to overall latency? In this episode of Engineering for Reliability, we’ll show how to use distributed tracing to capture the latency of user requests and how long it takes each service in the path to return a response. Watch to learn how to capture latency in distributed applications using OpenTelemetry and analyze it using Cloud Trace.
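To make the idea concrete, here is a minimal sketch of what a trace records: one timed span per service, each linked to its parent. It uses only the Python standard library and is not the OpenTelemetry API; the `Tracer` class and the service names (`frontend`, `cart-service`, `payment-service`) are hypothetical, and the `sleep` calls stand in for real work.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer (hypothetical): records name, parent, and duration per span."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of the spans currently open

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield
        finally:
            seconds = time.perf_counter() - start
            self._stack.pop()
            self.spans.append({"name": name, "parent": parent, "seconds": seconds})


def handle_request(tracer):
    # One user request that passes through two downstream services.
    with tracer.span("frontend"):
        with tracer.span("cart-service"):
            time.sleep(0.01)   # simulated work
        with tracer.span("payment-service"):
            time.sleep(0.02)   # simulated work


tracer = Tracer()
handle_request(tracer)
for s in tracer.spans:
    print(f"{s['name']:<16} parent={s['parent']} {s['seconds'] * 1000:.1f} ms")
```

The parent span covers the whole request, so its duration is at least the sum of its children. A real tracing backend uses the same parent links to show which service contributed most to overall latency.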

Product Explainer Video: Splunk Infrastructure Monitoring for Real-time Monitoring in the Cloud

Wherever you are in your cloud journey and whatever your environment looks like, Splunk Infrastructure Monitoring is a purpose-built metrics platform to address real-time cloud monitoring requirements at scale. Get real-time observability for data from any cloud, any vendor, and any service.

Google Cloud Asset Inventory 101

Cloud Asset Inventory is a metadata inventory service that allows you to view, monitor, and analyze all your Google Cloud and Anthos assets across projects and services. In this video, Sophia Yang, a Google Cloud Product Manager, will show you how Cloud Asset Inventory gives you greater visibility into your Google Cloud assets and lets you receive real-time notifications on asset config changes, run analysis on your inventory, get insights from your deployment, and more! Watch to learn how you can use Cloud Asset Inventory to gain greater observability into your Google Cloud and Anthos assets!

What Is AWS Auto Scaling? And When Should You Use It?

Whether you’re growing rapidly and need to expand your infrastructure or demand has slowed and you need to scale down, AWS Auto Scaling can help. While manual scaling is time-intensive and costly, auto scaling is an automated process that adjusts capacity for predictable performance and costs. Auto scaling can help you optimize how your application is used, reduce waste, and lower cloud spend.
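As a rough illustration of how capacity can follow demand, the sketch below applies a simple proportional rule: scale the instance count by the ratio of the observed metric to its target, then clamp the result to the group's limits. This is a simplified assumption, not AWS's actual algorithm, which also involves alarms, cooldowns, and warm-up periods; the function name and parameters are hypothetical.

```python
import math


def desired_capacity(current, metric, target, min_size, max_size):
    """Proportional scaling rule (illustrative assumption, not the AWS algorithm).

    current  -- number of instances running now
    metric   -- observed average metric, e.g. CPU utilization in percent
    target   -- metric value the group should hold
    min_size, max_size -- hard limits on the group size
    """
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))


# Four instances at 90% CPU against a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))
# Four instances at 15% CPU: scale in, but not below the minimum.
print(desired_capacity(4, 15.0, 60.0, min_size=2, max_size=10))
```

Rounding up keeps the group on the safe side of the target, and the clamp keeps a sudden spike or lull from pushing capacity past the limits you have set.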

Dashbird Explained: the why, what and how

Here’s everything you need to know to get started with Dashbird, the complete solution for End-to-End Infrastructure Observability, Real-time Error Tracking, and Well-Architected Insights. When working with AWS, one cannot emphasize enough the architectural best practices for designing workloads. One of those best practices is to design the solution so that monitoring the infrastructure and troubleshooting errors and problems are effortless.

Going Beyond with Hybrid Cloud using CloudHedge - The Best of Both Worlds

Lately, enterprises are moving towards a hybrid solution that offers the best of both worlds. A hybrid cloud setup combines two infrastructures, such as a private cloud and one or more public clouds, and enables communication between each distinct service. To maximize returns, a hybrid cloud strategy gives the enterprise greater flexibility and control by moving workloads between clouds as costs and resources fluctuate.

Troubleshoot GKE apps faster with monitoring data in Cloud Logging

When you’re troubleshooting an application on Google Kubernetes Engine (GKE), the more context that you have on the issue, the faster you can resolve it. For example, did the pod exceed its memory allocation? Was there a permissions error reserving the storage volume? Did a rogue regex in the app pin the CPU? All of these questions require developers and operators to build a lot of troubleshooting context.