Latest Posts

Blameless CommsAssist - 3 Tips on Making Incident Communication Easy

When you’re in the thick of an incident, communication is both essential and challenging. A wide variety of stakeholders need timely updates on the situation in order to respond effectively. At the same time, breaking away from the actual diagnosis and resolution work to send these updates can significantly slow progress.

The Catchpoint 2024 SRE Report - Five Key Takeaways

Having emerged into the mainstream only in the 2010s, SRE is a relatively new discipline in tech. It has been rapidly adopted by a widening variety of organizations, each implementing constantly evolving practices. For the last six years, Catchpoint has run a survey to take the temperature of the latest developments and trends. Check out the full report here, and read on to see our analysis of five key takeaways.

A Little Resilience Goes A Long Way

Let’s call this the mother of all understatements. If you’re reading this blog, there’s a good chance that you: a.) Agree wholeheartedly with this sentiment and think it should go without saying, AND… b.) Are surrounded by folks who pay lip service to this idea while not taking it as seriously as they should.

Learn the Incident Response Life Cycle - Best Practices and Strategies

No company plans for a security breach, major outage, or other cyber incident, but they happen. When an incident occurs, having a standardized, regulated method of managing the fallout is critical. This is where the incident response life cycle comes in.

Weathering Black Friday and Other Storms Reliably

If you work in eCommerce, you can see the storm on the horizon. Black Friday, the biggest shopping day of the year both online and off, is only a few days away. Your services are going to hit usage spikes you may never have seen before. And every aspect of your services will be pushed to its limit: people won’t just be searching, or just buying, or just signing up for programs, they’ll be doing all of these at once. Most crucially, everyone else is offering deals too.

Security - A Pillar of Reliability

When you think about making your service reliable, what standards and benchmarks are most important? The availability of services? Consistently fast responses? Accurate data? Prioritizing critical and common use cases? These are all important and deserve some focus, but today we’ll put the spotlight on an often-overlooked pillar: security. Cybersecurity incidents can be the most devastating types of incident for your organization.

The New SEC Rules and You

The Securities and Exchange Commission (SEC) published new rules for SEC registrants around disclosing incident details and response policies. Compliance with these new rules should be top of mind for any company. Even if your org hasn’t hit the milestone of registering with the SEC, you should be prepared to be compliant when you take that step.

Why Invest in Tooling? Benefits and Concerns

When looking to invest money in your engineering teams, what gives the best return? Hiring more staff to enable bigger projects and more diversified skill sets? Training engineers to uplevel their ability and productivity? Increasing salaries to retain the best talent? These are all great ideas that should be exercised often. But there’s one other investment worth considering that can offer huge benefits for relatively small amounts of money: tooling.