Tests, Data, & Dependencies: Defeating the Triple Threat of Software Testing

Sponsored Post


TL;DR

Reliable software testing requires that tests, data, and dependencies all work consistently and accurately represent reality. But with the explosion of APIs, data security concerns, and the need to move fast, that’s easier said than done. Proactive software testing needs a new approach that orchestrates these three elements in unison. By recording production traffic, Speedscale gives engineers a self-service way to autogenerate these assets without scripting, and to test early and often. Speedscale captures transactions and scenarios (data), essentially “listening” to your app and understanding how your service is used by your customers. By isolating your API and varying the inputs (tests) and dependencies (backends), you can systematically control variables and apply the scientific method to test your code before release.

During my 10 years in the DevOps tools and digital transformation consulting space, only a handful of companies stood out for a truly streamlined software delivery methodology. What these companies have in common is that they’ve invested significant time and energy (two years or more!) in automation, SOPs, and self-service to make sure these three critical components work together in unison:

  • Your tests
  • Your data
  • Your dependencies or environments

At Speedscale, we often refer to these elements as the Triple Threat.

But in today’s complex, API-driven world, managing these elements together is a lot easier said than done, which is why many companies get discouraged partway through addressing the Triple Threat.

It’s no wonder setting up reliable tests routinely gets ignored in favor of developing new features… 


So, why is it so hard? Let’s dig in.

The Tests

Companies historically attack test automation first, pouring millions of dollars into automation frameworks and defect management suites, hoping to fulfill the promise of tests that you write once and use anywhere. It’s the most obvious step if you’re trying to increase automation.


But here’s the problem: Tests aren’t truly reusable because they’re usually built around the UI layer, and the UI is the part of your application that changes the most. Scripted tests also take longer to write if they’re dynamic rather than hardcoded, so in the interest of time, most tests end up pretty brittle. On top of that, there are never enough environments to run them in. After a while, testers get so burned out and jaded that anything in the test tools space might as well cease to exist.

(That’s why we at Speedscale decided to focus on systems beneath the UI layer, but we’ll get to that later.)

The Data

Data is a critical part of understanding what your service will do in production, as it contains the context and use cases that determine which branch of code is exercised. Without proper test data, verifying the quality of your next build is often pointless. For example, if you’re a developer at Delta Airlines building a new feature for Gold Medallion members, but you only have data for Silver Medallion members to test with, how can you be sure your feature will work with the audience it’s intended for?
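To make that concrete, here’s a minimal, hypothetical Python sketch (the tiers and discount logic are invented for illustration, not Delta’s actual code). The Gold branch is the new feature, but because the only test data on hand describes Silver members, it ships without ever being executed:

```python
def checkout_discount(member: dict) -> float:
    """Return the discount rate for a booking, based on membership tier."""
    if member["tier"] == "gold":
        # New feature under development: Gold members get an upgraded rate.
        return 0.20
    return 0.05  # Everyone else, including Silver.


def test_checkout_discount():
    # All of the available "test data" happens to be Silver members,
    # so the Gold branch above is never executed before release.
    silver_members = [
        {"id": 1, "tier": "silver"},
        {"id": 2, "tier": "silver"},
    ]
    for member in silver_members:
        assert checkout_discount(member) == 0.05


if __name__ == "__main__":
    test_checkout_discount()
    print("Tests passed -- but the Gold code path was never touched.")
```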

There are fewer companies addressing the data problem, since security can bring the ax down on that quite quickly. But the processes and technologies for managing test data (e.g. data scrubbing, virtualization, and ETL) are alive and well, and Speedscale includes them as part of its traffic replay capability. Speedscale is one of the first solutions to leverage actual production data for testing, with the added capability of sanitizing and transforming it to make it replayable.
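Speedscale’s transforms are configured inside the product, but the underlying idea of scrubbing recorded traffic before it’s replayed can be sketched in a few lines of plain Python (the transaction shape and field names here are hypothetical):

```python
import copy

# A single recorded transaction, as it might look coming out of production.
recorded = {
    "request": {"path": "/v1/bookings",
                "body": {"email": "jane@example.com",
                         "card_number": "4111111111111111",
                         "seat": "14C"}},
    "response": {"status": 200, "body": {"confirmation": "ABC123"}},
}

SENSITIVE_FIELDS = {"email", "card_number"}  # fields to scrub before replay


def sanitize(transaction: dict) -> dict:
    """Return a copy of a recorded transaction with sensitive fields masked."""
    clean = copy.deepcopy(transaction)

    def scrub(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in SENSITIVE_FIELDS:
                    node[key] = "***REDACTED***"
                else:
                    scrub(value)
        elif isinstance(node, list):
            for item in node:
                scrub(item)

    scrub(clean)
    return clean


if __name__ == "__main__":
    print(sanitize(recorded))
```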

The Environments And Dependencies

Of the three testing elements, I’ve seen the most mature usage guidelines, processes, and general understanding around environments and dependencies. Servers and VMs are often a fixed CapEx that is closely monitored, and the bean counters have developed budgets and chargebacks that dictate how available they are to engineering groups. But with cloud, containers, and functions, that’s all changing. With containers standardizing the landscape, consistently simulating environments for everyone is within reach.

Still, standardizing cloud infrastructure in a self-serve, repeatable, and scalable way is much more difficult than it sounds. While containerized cloud sells the promise of being on only when you need it, many platform engineering teams are surprised at the complexity they must navigate. So it’s not surprising that, when it comes to development and testing, around 44% of enterprise cloud environments are long-lived pre-prod instances eating up the bill.

Orchestrating tests, data, and dependencies in unison

Orchestrating the tests, data, and dependencies in unison is the secret to streamlined software delivery. Consider these examples:

  • If you have folders upon folders of test scripts, but the proper backend systems aren’t ready to run them, they’re unusable.
  • If you have plenty of cloud capacity or brilliant IaC automation, but you rely on an offshore consultancy to do manual testing, you’re still beholden to their schedule.
  • Perhaps you have sufficient test scripts and environments, but the ETL processes to replicate, scrub, and stage test data take a long time between data refreshes.
  • Or maybe you’re not even sure which data scenarios to run. 
  • Or worse still, you can only test the happy path because the applications rarely misbehave in the lower environments, versus how they error out in production. This can lead to a false sense of security.

The “fail fast” approach of yesterday

Amidst mounting pressure to release new features rapidly and stay ahead of competitors, most companies test in production and rely on monitoring tools to tell them what’s broken. By focusing all their effort on making rollouts and rollbacks as fast as possible, they can now patch or pull out builds at the first sign of trouble. Therein lies the dilemma.


Fast rollouts and rollbacks were popularized by the Netflixes, Facebooks, and Ubers of the world, who have a huge pool of users they can test their updates on. If a few folks have a bad experience, the company is none the worse for wear (the unicorns have many millions of users).

However, certain industries like fintech, insurance, and retail cannot risk customers having transactions fail. They’re heavily regulated, or have critical processes, or run on razor-thin margins where every visitor must generate revenue.


We seem to think the faster we release/rollback, or the better we can do canary deploys, or the more intelligent we are about which aspects of the code we release first, the more stable and robust our software will become. 

But we can’t drive southbound and expect to end up in Canada.

We can’t do arm workouts every day and expect to gain leg muscles.

We can’t have an unhealthy diet every day and expect to look like Thor (the in-shape one, not the drunk one).

You get where I’m going with this? No amount of rollbacks, canary deploys, or blue/green releases actually improves our chances of the code working right out of the gate, with minimal disruption, for every release.

We’re working on the wrong muscle.


In fact, according to the DORA State of DevOps Report 2019, the change failure rate of releases among Elite, High, and Medium performers was the same: up to 15%. Teams categorized as ‘Elite performers’ had the same average failure rate as ‘Medium performers’; the classification was largely based on how often they release and how quickly they react to issues. I’m not downplaying that capability, but we can’t expect production outages and defects to decrease if we’re only ever concerned with how quickly we react.

Software quality has to be proactive.

The solution for proactive API software testing

Before I introduce Speedscale, let me share a testing scenario from a different industry:

In electrical engineering, chipsets are put into test harnesses to verify functionality independent of other components. Engines are put on machines called dynamometers to confirm power output before installation into vehicles.


Translation: You need to test in a simulated environment before you put everything together.

Consider the aviation industry: Plane manufacturers have logged hundreds of hours on the engines, modeled the wings in a wind tunnel, and tested the software in a simulator way before taking off for the first time.

Software is one of the few industries where we put everything together and turn it on and hope it works. 🤞 😬 🤞

Speedscale was founded to bring a more proactive and automated approach to ensuring software quality.

So, how exactly do we do that?

By recording production traffic, Speedscale captures transactions and scenarios (data), essentially “listening” to your app and understanding how your service is used by your customers. By isolating your API and varying the inputs (tests) and dependencies (backends), you can systematically control variables and apply the scientific method to test your code before release. 
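Here’s a bare-bones Python sketch of that idea (the service, backend, and traffic shapes are invented for illustration; this is the concept, not Speedscale’s actual SDK or file formats). The recorded backend responses hold the dependencies constant, so the only variable that changes between runs is the code under test:

```python
# 1. Traffic recorded from production: inbound requests plus the backend
#    responses observed at the same time.
RECORDED_INBOUND = [
    {"path": "/loyalty/points", "member_id": 42},
    {"path": "/loyalty/points", "member_id": 7},
]
RECORDED_BACKEND = {42: {"tier": "gold", "points": 51000},
                    7: {"tier": "silver", "points": 1800}}


class MockLoyaltyBackend:
    """Stands in for the real backend, answering from recorded data."""
    def get_member(self, member_id: int) -> dict:
        return RECORDED_BACKEND[member_id]


def handle_points_request(request: dict, backend) -> dict:
    """The service under test: the code you are about to release."""
    member = backend.get_member(request["member_id"])
    bonus = 2 if member["tier"] == "gold" else 1
    return {"points": member["points"] * bonus}


def replay():
    backend = MockLoyaltyBackend()      # dependencies held constant
    for request in RECORDED_INBOUND:    # inputs drawn from real traffic
        response = handle_points_request(request, backend)
        print(request, "->", response)


if __name__ == "__main__":
    replay()
```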


The key takeaways:

❌ DON’T try to guess how users will use your app and script tests to simulate it.

✅ DO examine real traffic and use it to auto-generate your tests.

❌ DON’T rely solely on huge, cumbersome end-to-end environments.

✅ DO auto-identify the necessary backends for your SUT (system under test) and automatically generate mocks that simulate their behavior, modeled from real user traffic (see the sketch after this list).

❌ DON’T manually test every release and expect to keep up.

✅ DO run traffic replays as part of your automated CI/CD, validating regression, functional, and performance behavior on every code commit and build.
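To illustrate the mock-generation point above, here’s a generic Python sketch of turning recorded request/response pairs into a stub for the SUT’s backend (the pair format is illustrative, not Speedscale’s actual format):

```python
# Recorded request/response pairs captured from real backend traffic.
RECORDED_PAIRS = [
    {"request": {"method": "GET", "path": "/inventory/14C"},
     "response": {"status": 200, "body": {"available": True}}},
    {"request": {"method": "GET", "path": "/inventory/2A"},
     "response": {"status": 200, "body": {"available": False}}},
]


def build_mock(pairs):
    """Index recorded responses by (method, path) so the mock can answer replays."""
    table = {(p["request"]["method"], p["request"]["path"]): p["response"]
             for p in pairs}

    def mock_backend(method: str, path: str) -> dict:
        # Unknown calls fail loudly: the recording did not cover them.
        return table.get((method, path),
                         {"status": 501, "body": {"error": "not recorded"}})

    return mock_backend


if __name__ == "__main__":
    backend = build_mock(RECORDED_PAIRS)
    print(backend("GET", "/inventory/14C"))   # served from the recording
    print(backend("GET", "/inventory/9F"))    # a coverage gap surfaces immediately
```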

Try it out yourself free for 30 days by signing up here or join our Slack Community to get all your burning questions answered directly from our team!