Operations | Monitoring | ITSM | DevOps | Cloud

We've just launched Alert Sync 1.5 and it's even more functionally fantastic than before!

So, what shiny new functionality have we added for you to enjoy? As well as all the great stuff Alert Sync did before, you can now benefit from even more features: Wait Rules allow an incoming SCOM alert to be held for a specified period of time before being evaluated against Incident Creation rules. This is really useful for those occasions when a SCOM alert might open and close itself in quick succession (as with a CPU usage threshold monitor).
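To make the Wait Rule idea concrete, here is a minimal Python sketch of the hold-then-evaluate behaviour described above. It is not Alert Sync's actual implementation: the `Alert` and `WaitRule` names, their methods, and the 300-second wait are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alert:
    # Hypothetical, simplified stand-in for a SCOM alert.
    alert_id: str
    opened_at: float  # seconds on any consistent clock


class WaitRule:
    """Hold incoming alerts for wait_seconds before releasing them
    for evaluation against incident creation rules (illustrative only)."""

    def __init__(self, wait_seconds: float) -> None:
        self.wait_seconds = wait_seconds
        self._held: Dict[str, Alert] = {}

    def receive(self, alert: Alert) -> None:
        # Park the alert instead of evaluating it straight away.
        self._held[alert.alert_id] = alert

    def close(self, alert_id: str) -> None:
        # The alert closed itself while still held, so drop it quietly.
        self._held.pop(alert_id, None)

    def release_due(self, now: float) -> List[Alert]:
        # Return (and stop holding) every alert whose wait period has elapsed.
        due = [a for a in self._held.values()
               if now - a.opened_at >= self.wait_seconds]
        for alert in due:
            del self._held[alert.alert_id]
        return due


rule = WaitRule(wait_seconds=300)
rule.receive(Alert("cpu-spike", opened_at=0))
rule.receive(Alert("disk-full", opened_at=0))
rule.close("cpu-spike")            # the CPU monitor recovers on its own
ready = rule.release_due(now=300)  # only "disk-full" moves on to rule evaluation
```

In this sketch the short-lived CPU alert never reaches incident creation, while the alert that is still open after the wait period is passed on as usual.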

HAProxyConf 2019 - Building a Service Mesh at Criteo with Consul and HAProxy by Pierre Souchay

At Criteo, we have been working on building a tight integration between HashiCorp Consul and HAProxy. In this talk, we will explain how we provision our HAProxy instances dynamically using Consul Connect, a new service mesh technology that allows HAProxy to talk to its peers machine to machine, without a traditional load balancer. We will detail how we are able to create a service with DNS, add load balancing, and configure SSL certificates in mere seconds. Since Criteo is working actively on enabling HAProxy with Consul Connect, we will explain the challenges of scaling Service Mesh architectures for large infrastructures.

Episode 10: Installing Redis from Ansible Galaxy

A pre-built playbook from Ansible Galaxy lets us easily install Redis. (Even we don't re-invent the wheel every time.) The Request Metrics application will use Redis as its main data store. We need to install Redis on our servers to find out if this is a good plan. Ansible provides a repeatable way of doing this configuration work.

Searching Zendesk: Elastic Workplace Search for customer service organizations

We’re excited to announce that Zendesk is now available as a pre-built content source, along with a host of others, as part of the Workplace Search application. With more than 130,000 customers in 30 countries, Zendesk has become one of the de facto customer service platforms in the world. Each day, millions of users interact with support agents via the cloud-based tool regardless of the support channel they choose.

Feature importance for data frame analytics with Elastic machine learning

With Elasticsearch machine learning, one can build regression and classification models for data analysis and inference. Accurate prediction models are often too complex to understand simply by looking at their definition. Using feature importance, introduced in Elastic Stack 7.6, we can now interpret and validate such models.

How AI Helps IT Ops Pros Work Remotely

While the COVID-19 pandemic reshapes work processes, digitalization is allowing businesses to adjust to the fluid situation. The deployment of AI in IT operations is a good case study of this. Human beings’ social dimension needs cultivation. Otherwise, people become unhappy and perform ineffectively. Beyond that, many tasks require social interaction to be executed successfully, including in IT operations.

Four immediate benefits you will gain from a modern monitoring platform

Cloud applications don’t just run flawlessly by way of magic. Many things can go wrong, and rest assured some will go wrong at some point. For small teams, this can be cumbersome and take a toll on development speed. A monitoring system will detect these issues on behalf of the development team, so that they can act accordingly. At Dashbird, we think there’s much more to it, though, than just detecting and alerting on issues, especially for small teams of developers.

Q&A with Alex Hidalgo on SLOs

Alex Hidalgo is a Site Reliability Engineer at Squarespace, and he’s currently writing a book called Implementing Service Level Objectives for O’Reilly Media. The first three chapters of the book are available now through O’Reilly’s early access program. I had a chance to read those chapters and ask Alex some questions about service level objectives and reliability. Thanks, Alex, for sharing your knowledge.