VictorOps

victorops

Head in the Clouds: Integrating Oracle Cloud Infrastructure With VictorOps

In tandem with DevOps adoption, the use of cloud computing and infrastructure continues to grow. Teams are using public cloud, private cloud, on-premises and hybrid cloud applications and infrastructure to improve scalability and speed. In a world of CI/CD and rapid deployment, cloud infrastructure offers more transparency in releasing software and greater flexibility in production.

victorops

Securely Using Secrets: A Template for Using HashiCorp Vault

According to a recent study by researchers at North Carolina State University, over 100,000 publicly accessible GitHub repositories contain exposed application secrets directly within their source code. From private API tokens to cryptographic keys, this study – which only scanned approximately 13% of GitHub’s public repositories – indicates that properly securing application secrets is one of the most neglected methods of information security in software today.

victorops

Making the Most of Machine Learning for Incident Management

With today’s level of automation in software development and delivery, along with continuous improvement to CI/CD pipelines, keeping up with production needs becomes harder and harder. So, how can you improve release management and deploy new features and services at breakneck speeds without hindering service reliability?

victorops

The Importance of Regular Application Health Checks and Monitoring

Application health is at the core of your business. Lost revenue and negative customer experiences come with application or service downtime. Regular application health checks and effective application monitoring will allow you to detect issues before they become full-fledged outages. If you’re not focused on maintaining resilient applications, you’re not focused on the business’s bottom line.

victorops

Deploying Code the Smart Way: Minimizing Downtime When Delivering Frequent Changes

As CI/CD adoption grows, many software development organizations are moving toward shorter development cycles. These shorter cycles and more efficient releases allow organizations to get new application features and other modifications out the door faster than ever before. While this approach brings a great many positives, it’s not without challenges.

victorops

The On-Call Scheduling Template for DevOps and IT

Incidents are inevitable. Effective DevOps and IT teams aren’t just proactively addressing the reliability of technical systems; they’re preparing for unforeseen circumstances. You can’t continuously deliver new features and services without taking on some unknown risks. So, how can the on-call team ensure full coverage and better prepare themselves for incident response without creating a culture of burnout and alert fatigue?

victorops

The IT Operations Manager's Guide to Alerting

One key rule in IT operations management is knowing when (and why) something breaks and how to resolve the resulting incidents. The best IT operations managers will completely control the incident management lifecycle – from real-time incident detection to future incident preparation. IT operations managers act as the essential connection between technical systems and the people who use them.

victorops

Adding an Incident Response Framework to Your CI/CD Pipelines

Security breaches aren’t usually polite, well-behaved events. They show up unannounced, do things you don’t want them to do and can leave an unpleasantly large amount of damage in their wake. This is as true in DevOps as it is in more traditional methods of software development and delivery. In order to manage security-related incidents and to limit the amount of damage they cause, you need to set up a framework for incident response.

victorops

Efficient Incident Management With Automated Call Routing

Being on-call in DevOps and IT means being available to both customers and employees in a number of different ways. You need to be available for real-time incident remediation and open to contact from nearly any channel (e.g., Slack, SMS, phone calls or email). If a server crashes and customers experience downtime, every second costs money and results in lost business.